commit 0d0752bca1 upstream.
Looking into the active_asids array is not enough, as we also need
to look into the reserved_asids array (they both represent processes
that are currently running).
Also, not holding the ASID allocator lock is racy, as another CPU
could schedule that process and trigger a rollover, making the erratum
workaround miss an IPI.
Exposing this outside of context.c is a little ugly on the side, so
let's define a new entry point that the erratum workaround can call
to obtain the cpumask.
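The lookup described above can be sketched in plain C as follows. This is a simplified user-space model, not the kernel code: asid_cpumask(), NR_CPUS, the bitmask return type and the atomic_flag standing in for the ASID allocator lock are all assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NR_CPUS 4 /* hypothetical CPU count for this sketch */

/* Stand-in for the ASID allocator lock (a trivial spinlock). */
static atomic_flag asid_lock = ATOMIC_FLAG_INIT;

/* Per-CPU ASID state; 0 means "no ASID recorded" in this model. */
static uint64_t active_asids[NR_CPUS];
static uint64_t reserved_asids[NR_CPUS];

/*
 * Return a bitmask of the CPUs on which `asid` is either active or
 * reserved. Both arrays are scanned with the lock held, so a rollover
 * on another CPU cannot move entries between them in mid-scan.
 * Callers are expected to pass a non-zero asid.
 */
unsigned int asid_cpumask(uint64_t asid)
{
	unsigned int mask = 0;

	while (atomic_flag_test_and_set(&asid_lock))
		; /* spin until the lock is free */

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (active_asids[cpu] == asid || reserved_asids[cpu] == asid)
			mask |= 1u << cpu;
	}

	atomic_flag_clear(&asid_lock);
	return mask;
}
```

The point of the sketch is only that a CPU counts as running the process if its ASID appears in either array, and that the scan happens under the lock.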
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c5f927a6f6 upstream.
With this change, we no longer lose the innermost entry in the user-mode
part of the call chain. See also the x86 port, which includes the ip.
It's possible to partially work around this problem by post-processing
the data to use the PERF_SAMPLE_IP value, but this works only if the CPU
wasn't in the kernel when the sample was taken.
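As a rough illustration of the behaviour described above, the sketch below records the sampled pc as the innermost entry before walking the frames. The struct layouts and the names record_user_callchain(), callchain_store() and MAX_DEPTH are hypothetical and only model the idea; they are not the kernel's perf callchain API.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_DEPTH 8 /* hypothetical limit on recorded entries */

/* Simplified user stack frame: link to the caller's frame plus return address. */
struct frame {
	const struct frame *next;
	uintptr_t return_addr;
};

struct callchain {
	size_t nr;
	uintptr_t ip[MAX_DEPTH];
};

/* Append one address, silently dropping it if the chain is full. */
static void callchain_store(struct callchain *chain, uintptr_t ip)
{
	if (chain->nr < MAX_DEPTH)
		chain->ip[chain->nr++] = ip;
}

/*
 * Build the user-mode part of the chain. The sampled pc is stored
 * first, so the innermost function is not lost; the frame walk then
 * contributes the callers' return addresses.
 */
void record_user_callchain(struct callchain *chain, uintptr_t pc,
			   const struct frame *fp)
{
	chain->nr = 0;
	callchain_store(chain, pc);
	for (; fp != NULL; fp = fp->next)
		callchain_store(chain, fp->return_addr);
}
```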
Signed-off-by: Jed Davis <jld@mozilla.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Updates:
-------
- Rebased over 3.10 final
- Differences from big-LITTLE-MP-master-v18
- New Patches:
- master-config-fragments: 1 new patch
- "config: Disable priority filtering for HMP Scheduler"
- master-misc-patches: 1 new patch
- "mm: make vmstat_update periodic run conditional"
- New Branches:
- master-task-placement-v2-updates: 7 patches
New patches from ARM added in a new topic branch stacked on top
of master-task-placement-v2-sysfs...
- Revert "sched: Enable HMP priority filter by default"
- "HMP: Use unweighted load for hmp migration decisions"
- "HMP: Select least-loaded CPU when performing HMP Migrations"
- "HMP: Avoid multiple calls to hmp_domain_min_load in fast path"
- "HMP: Force new non-kernel tasks onto big CPUs until load stabilises"
- "sched: Restrict nohz balance kicks to stay in the HMP domain"
- "HMP: experimental: Force all rt tasks to start on little domain."
Commands used for merge:
-----------------------
$ git checkout -b big-LITTLE-MP-master-v19 v3.10
$ git merge master-arm-multi_pmu_v2 master-config-fragments \
master-hw-bkpt-fix master-misc-patches master-task-placement-v2 \
master-task-placement-v2-sysfs master-task-placement-v2-updates
This patch restricts the allowed cpu mask for rt tasks initially started
with a full cpu mask to the little domain.
A task is marked as real time in __setscheduler(), which is ultimately
called for all rt tasks (kernel and user land). In this function we
restrict the allowed cpu mask to the little domain.
This also prevents an rt task from later being pushed to the big domain,
because find_lowest_rq() only considers the allowed cpu mask of a task
when choosing the new cpu the task runs on.
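The restriction can be modelled roughly as below. This is a sketch with cpu masks reduced to plain bitmasks; rt_initial_allowed() and the cpu_bits type are hypothetical, and the guard against an empty little domain is an assumption of the sketch rather than something stated above.

```c
#include <stdbool.h>

typedef unsigned int cpu_bits; /* one bit per cpu, hypothetical stand-in for a cpumask */

/*
 * Return the allowed mask a task should start with. Only rt tasks
 * that still have the full cpu mask are narrowed to the little
 * domain; an explicitly chosen affinity is left untouched.
 */
cpu_bits rt_initial_allowed(cpu_bits allowed, cpu_bits all_cpus,
			    cpu_bits slow_cpus, bool is_rt)
{
	/* Assumption: with no little domain there is nothing to restrict to. */
	if (is_rt && allowed == all_cpus && slow_cpus != 0)
		return slow_cpus;
	return allowed;
}
```

Because the narrowed mask is what later placement decisions consult, a task restricted this way stays within the little domain in this model.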
Current kludges of the patch:
* Since we do not have an API to get the cpu mask of the A7 cluster,
hmp_slow_cpu_mask is made global in arm/kernel/topology.c for now.
* The watchdog_enable() function calls sched_setscheduler() before
kthread_bind() for the cpu specific watchdog kernel threads. The order of
these two calls has to be changed to make this patch work.
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Previously, an offline CPU would always appear to have a zero load
and this would distort the offload functionality used for balancing
big and little domains.
Maintain a mask of online CPUs in each domain and use this instead.
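A minimal sketch of why the online mask matters, assuming loads are kept in a plain array and cpu masks are bitmasks. domain_min_load() here is a hypothetical helper written for this example, not the scheduler's function.

```c
#include <limits.h>

#define NR_CPUS 4 /* hypothetical CPU count for this sketch */

/*
 * Return the lowest load among the cpus of a domain that are online.
 * Offline cpus are skipped, so their apparent zero load cannot win.
 * Returns UINT_MAX if no cpu of the domain is online.
 */
unsigned int domain_min_load(const unsigned int load[NR_CPUS],
			     unsigned int domain_cpus,
			     unsigned int online_cpus)
{
	unsigned int usable = domain_cpus & online_cpus;
	unsigned int min = UINT_MAX;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(usable & (1u << cpu)))
			continue;
		if (load[cpu] < min)
			min = load[cpu];
	}
	return min;
}
```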
Change-Id: I639b564b2f40cb659af8ceb8bd37f84b8a1fe323
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
The patch "sched: Use device-tree to provide fast/slow CPU list for HMP"
depends on the ordering of CPUs in the device tree. It fails to determine
the logical mask correctly if the logical ordering of the CPUs differs from
the physical ordering in the device tree.
This patch fixes the logic by depending on the mpidr in the device tree
and mapping that mpidr to the logical cpu.
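The mapping step can be sketched as a lookup from mpidr to logical cpu number. The table contents and the names logical_map, mpidr_to_logical_cpu() and logical_mask_from_mpidrs() are made up for this example; the table describes a hypothetical system whose logical order differs from the device-tree order.

```c
#include <stdint.h>

#define NR_CPUS 4 /* hypothetical CPU count for this sketch */

/* logical_map[cpu] holds the mpidr affinity bits of logical cpu `cpu`. */
static const uint32_t logical_map[NR_CPUS] = { 0x100, 0x101, 0x000, 0x001 };

/* Return the logical cpu whose mpidr matches, or -1 if there is none. */
int mpidr_to_logical_cpu(uint32_t mpidr)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (logical_map[cpu] == mpidr)
			return cpu;
	}
	return -1;
}

/* Build a logical cpu bitmask from a list of mpidr values. */
unsigned int logical_mask_from_mpidrs(const uint32_t *mpidrs, int count)
{
	unsigned int mask = 0;

	for (int i = 0; i < count; i++) {
		int cpu = mpidr_to_logical_cpu(mpidrs[i]);

		if (cpu >= 0)
			mask |= 1u << cpu;
	}
	return mask;
}
```

With this table, the nodes carrying mpidr 0x000 and 0x001 end up as logical cpus 2 and 3 regardless of where they appear in the device tree.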
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
On homogeneous (non-heterogeneous) systems all CPUs will be declared
'fast' and the slow cpu list will be empty. In this situation we need to
avoid adding an empty slow HMP domain otherwise the scheduler code will
blow up when it attempts to move a task to the slow domain.
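A sketch of the guard, assuming the domain list is a small array of bitmasks ordered fastest first. struct domain_list and build_domain_list() are hypothetical names for this example only.

```c
/* Hypothetical domain list: at most a fast and a slow domain. */
struct domain_list {
	int nr;
	unsigned int cpus[2]; /* one cpu bitmask per domain, fastest first */
};

/*
 * Fill the list from the fast and slow cpu masks. The slow domain is
 * added only when it contains at least one cpu, so a homogeneous
 * system ends up with a single domain.
 */
void build_domain_list(struct domain_list *list, unsigned int fast_cpus,
		       unsigned int slow_cpus)
{
	list->nr = 0;
	list->cpus[list->nr++] = fast_cpus;
	if (slow_cpus != 0)
		list->cpus[list->nr++] = slow_cpus;
}
```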
Signed-off-by: Jon Medhurst <tixy@linaro.org>
SCHED_HMP requires the different cpu types to be represented by an
ordered list of hmp_domains. Each hmp_domain represents all cpus of
a particular type using a cpumask.
The list is platform specific and therefore must be generated by
platform code by implementing arch_get_hmp_domains().
Signed-off-by: Morten Rasmussen <Morten.Rasmussen@arm.com>
We can't rely on Kconfig options to set the fast and slow CPU lists for
HMP scheduling if we want a single kernel binary to support multiple
devices with different CPU topology. E.g. TC2 (ARM's Test-Chip-2
big.LITTLE system), Fast Models, or even non big.LITTLE devices.
This patch adds the function arch_get_fast_and_slow_cpus() to generate
the lists at run-time by parsing the CPU nodes in device-tree; it
assumes slow cores are A7s and everything else is fast. The function
still supports the old Kconfig options as this is useful for testing the
HMP scheduler on devices without big.LITTLE.
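The classification rule (A7s are slow, everything else is fast) might look like the sketch below. It works on an array of compatible strings instead of real device-tree nodes, and it uses the array index as the cpu number; split_fast_slow() and struct cpu_split are hypothetical.

```c
#include <string.h>

struct cpu_split {
	unsigned int fast; /* bitmask of fast cpus */
	unsigned int slow; /* bitmask of slow cpus */
};

/*
 * Classify each cpu by its compatible string: Cortex-A7 cores go to
 * the slow mask, every other core goes to the fast mask.
 */
struct cpu_split split_fast_slow(const char *const compatible[], int nr_cpus)
{
	struct cpu_split split = { 0, 0 };

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		if (strcmp(compatible[cpu], "arm,cortex-a7") == 0)
			split.slow |= 1u << cpu;
		else
			split.fast |= 1u << cpu;
	}
	return split;
}
```

On a non big.LITTLE device no string matches, so the slow mask stays empty and all cpus are reported as fast.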
This patch is a reuse of a patch by Jon Medhurst <tixy@linaro.org> with a
few bits left out.
Signed-off-by: Morten Rasmussen <Morten.Rasmussen@arm.com>
Commit 9a6eb31 ("ARM: hw_breakpoint: Debug powerdown support for
self-hosted debug") introduces debug powerdown support for self-hosted
debug. While merging the patch, the 'has_ossr' check was removed; that
check is needed for hardware which doesn't support self-hosted debug.
Pandaboard (A9) is one such piece of hardware, and Dietmar's original
patch did mention this issue.
Without that check, on Panda with CPUIDLE enabled, a flood of the
messages below is thrown:
[ 3.597930] hw-breakpoint: CPU 0 failed to disable vector catch
[ 3.597991] hw-breakpoint: CPU 1 failed to disable vector catch
So restore that check to avoid the issue described above.
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Reported-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
This adds core support for saving and restoring CPU PMU registers
for suspend/resume support, i.e. deeper C-states in cpuidle terms.
This patch adds save/restore support for ARMv7 PMU registers only.
It can be extended to xscale and ARMv6 if needed.
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
The userspace perf tool provides options to specify PMU names on the
command line for an event. An example of the pmu event syntax would be
(<pmu_name>/<config>/<modifier>)
However, the parser in the perf tool breaks the tokens at spaces and fails
to identify a PMU name containing spaces correctly.
This patch removes spaces in the ARMv7 CPU PMU names.
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
This patch sets the cpu affinity for the perf IRQs in the logical order
within the cluster. However, the interrupts are assumed to be specified
in the same logical order within the cluster.
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
In a system with multiple heterogeneous CPU PMUs, each PMU can handle
events only on a subset of CPUs, probably those belonging to the same
cluster.
This patch introduces a cpumask to track which CPUs each PMU supports.
It also updates armpmu_event_init to reject cpu-specific events being
initialised for unsupported CPUs. Since process-specific events can be
initialised for all the CPU PMUs, armpmu_start/stop/add are modified to
prevent events from being added on unsupported CPUs.
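The two checks can be sketched as follows, with the supported-cpu mask reduced to a bitmask. struct pmu_model, pmu_event_init(), pmu_runs_on() and ANY_CPU are hypothetical names for this model, and -ENOENT is an assumed error code for the rejected case.

```c
#include <errno.h>

#define NR_CPUS 8    /* hypothetical CPU count for this sketch */
#define ANY_CPU (-1) /* marks a process-specific (not cpu-bound) event */

struct pmu_model {
	unsigned int valid_cpus; /* bitmask of cpus this PMU can count on */
};

/* Return 1 if `cpu` is one of the cpus the PMU supports, else 0. */
int pmu_runs_on(const struct pmu_model *pmu, int cpu)
{
	if (cpu < 0 || cpu >= NR_CPUS)
		return 0;
	return (pmu->valid_cpus >> cpu) & 1u;
}

/*
 * Event initialisation: a cpu-specific event is rejected when its cpu
 * is not supported. A process-specific event is always accepted here
 * and has to be filtered later, when it is added on a given cpu.
 */
int pmu_event_init(const struct pmu_model *pmu, int event_cpu)
{
	if (event_cpu != ANY_CPU && !pmu_runs_on(pmu, event_cpu))
		return -ENOENT;
	return 0;
}
```

In this model the add/start/stop paths would call pmu_runs_on() with the current cpu and skip the event when it returns 0.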
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
In order to support multiple, heterogeneous CPU PMUs and distinguish
them, they cannot be registered as PERF_TYPE_RAW type. Instead we can
get perf core to allocate a new idr type id for each PMU.
Userspace applications can refer to sysfs entries to find a PMU's type,
which can then be used in tracking events on individual PMUs.
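A userspace lookup could be sketched as below, assuming the type is exposed as a decimal number in /sys/bus/event_source/devices/<pmu_name>/type. read_pmu_type() and parse_pmu_type() are hypothetical helpers written for this example, and the PMU name passed in is whatever the running kernel registers.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse the decimal type id found in a sysfs "type" file; -1 on error. */
int parse_pmu_type(const char *text)
{
	char *end;
	long value = strtol(text, &end, 10);

	if (end == text || value < 0 || value > INT_MAX)
		return -1;
	return (int)value;
}

/* Read the type id of the named PMU from sysfs; -1 if unavailable. */
int read_pmu_type(const char *pmu_name)
{
	char path[256];
	char line[32];
	char *got;
	FILE *file;
	int len;

	len = snprintf(path, sizeof(path),
		       "/sys/bus/event_source/devices/%s/type", pmu_name);
	if (len < 0 || (size_t)len >= sizeof(path))
		return -1;

	file = fopen(path, "r");
	if (file == NULL)
		return -1;
	got = fgets(line, sizeof(line), file);
	fclose(file);

	return got != NULL ? parse_pmu_type(line) : -1;
}
```

The value returned would then be used as the event type when opening an event on that particular PMU.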
Signed-off-by: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com>
A single global CPU PMU pointer is not useful in a system with multiple,
heterogeneous CPU PMUs as we need to access the relevant PMU depending
on the current CPU.
This patch replaces the single global CPU PMU pointer with per-cpu
pointers and changes the OProfile accessors to refer to the PMU affine
to CPU0.
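In outline, the change amounts to something like the sketch below, where a plain array indexed by cpu stands in for per-cpu data. struct pmu_desc, cpu_pmu[], pmu_for_cpu() and oprofile_pmu() are hypothetical names for this model.

```c
#include <stddef.h>

#define NR_CPUS 4 /* hypothetical CPU count for this sketch */

struct pmu_desc {
	const char *name;
};

/* One PMU pointer per cpu instead of a single global pointer. */
static struct pmu_desc *cpu_pmu[NR_CPUS];

/* Return the PMU serving `cpu`, or NULL if there is none. */
struct pmu_desc *pmu_for_cpu(int cpu)
{
	if (cpu < 0 || cpu >= NR_CPUS)
		return NULL;
	return cpu_pmu[cpu];
}

/* OProfile-style accessor: always refers to the PMU affine to CPU0. */
struct pmu_desc *oprofile_pmu(void)
{
	return pmu_for_cpu(0);
}
```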
Signed-off-by: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Some device drivers, such as the PMU driver, need to retrieve the logical
cpu mask that corresponds to a given cluster id. This patch provides a hook in
the topology code that, given an existing cluster id as input,
initializes the corresponding cpumask passed as a pointer, reusing all
existing topology information required by sched domains in the kernel.
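A sketch of such a hook, assuming the topology is available as a per-cpu cluster id table and the mask is a plain bitmask. cluster_to_mask(), cpu_cluster_id[] and the -EINVAL return for an unknown cluster are assumptions of this example, not the interface added by the patch.

```c
#include <errno.h>

#define NR_CPUS 4 /* hypothetical CPU count for this sketch */

/* Hypothetical topology table: cluster id of each logical cpu. */
static const int cpu_cluster_id[NR_CPUS] = { 0, 0, 1, 1 };

/*
 * Fill *mask with the logical cpus that belong to `cluster_id`.
 * Returns 0 on success, or -EINVAL (leaving *mask untouched) when no
 * cpu reports that cluster id.
 */
int cluster_to_mask(int cluster_id, unsigned int *mask)
{
	unsigned int found = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu_cluster_id[cpu] == cluster_id)
			found |= 1u << cpu;
	}
	if (found == 0)
		return -EINVAL;

	*mask = found;
	return 0;
}
```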
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>