Stepan found:
CPU0 CPUn
_cpu_up()
__cpu_up()
boostrap()
notify_cpu_starting()
set_cpu_online()
while (!cpu_active())
cpu_relax()
<PREEMPT-out>
smp_call_function(.wait=1)
/* we find cpu_online() is true */
arch_send_call_function_ipi_mask()
/* wait-forever-more */
<PREEMPT-in>
local_irq_enable()
cpu_notify(CPU_ONLINE)
sched_cpu_active()
set_cpu_active()
Now the purpose of cpu_active is mostly with bringing down a cpu, where
we mark it !active to avoid the load-balancer from moving tasks to it
while we tear down the cpu. This is required because we only update the
sched_domain tree after we brought the cpu-down. And this is needed so
that some tasks can still run while we bring it down, we just don't want
new tasks to appear.
On cpu-up however the sched_domain tree doesn't yet include the new cpu,
so its invisible to the load-balancer, regardless of the active state.
So instead of setting the active state after we boot the new cpu (and
consequently having to wait for it before enabling interrupts) set the
cpu active before we set it online and avoid the whole mess.
Reported-by: Stepan Moskovchenko <stepanm@codeaurora.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1323965362.18942.71.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It turns out that the logical CPU mapping is useful even when !CONFIG_SMP
for manipulation of devices like interrupt and power controllers when
running a UP kernel on a CPU other than 0. This can happen when kexecing
a UP image from an SMP kernel.
In the future, multi-cluster systems running AMP configurations will
require something similar for mapping cluster IDs, so it makes sense to
decouple this logic in preparation for this support.
Acked-by: Yang Bai <hamo.by@gmail.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reported-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We can stall RCU processing on SMP platforms if a CPU sits in its idle
loop for a long time. This happens because we don't call irq_enter()
and irq_exit() around generic_smp_call_function_interrupt() and
friends. Add the necessary calls, and remove the one from within
ipi_timer(), so that they're all in a common place.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Sending IPI_CPU_STOP to a CPU causes it to execute a busy cpu_relax
loop forever. This makes it impossible to kexec successfully on an SMP
system since the secondary CPUs do not reset.
This patch adds a callback to platform_cpu_kill, defined when
CONFIG_HOTPLUG_CPU=y, from the ipi_cpu_stop handling code. This function
currently just returns 1 on all platforms that define it but allows them
to do something more sophisticated in the future.
Signed-off-by: Will Deacon <will.deacon@arm.com>
The ARM SMP booting code allocates a temporary set of page tables
containing an identity mapping of the kernel image and provides this
to secondary CPUs for initial booting.
In reality, we only need to include the __turn_mmu_on function in the
identity mapping since the rest of the kernel is executing from virtual
addresses after this point.
This patch adds __turn_mmu_on to the .idmap.text section, allowing the
SMP booting code to use the idmap_pgd directly and not have to populate
its own set of page table.
As a result of this patch, we can make the identity_mapping_add function
static (since it is only used within mm/idmap.c) and also remove the
identity_mapping_del function. The identity map population is moved to
an early initcall so that it is setup in time for secondary CPU bringup.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When disabling and re-enabling the MMU, it is necessary to take out an
identity mapping for the code that manipulates the SCTLR in order to
avoid it disappearing from under our feet. This is useful when soft
rebooting and returning from CPU suspend.
This patch allocates a set of page tables during boot and populates them
with an identity mapping for the .idmap.text section. This means that
users of the identity map do not need to manage their own pgd and can
instead annotate their functions with __idmap or, in the case of assembly
code, place them in the correct section.
Acked-by: Dave Martin <dave.martin@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
rtmutex: Add missing rcu_read_unlock() in debug_rt_mutex_print_deadlock()
lockdep: Comment all warnings
lib: atomic64: Change the type of local lock to raw_spinlock_t
locking, lib/atomic64: Annotate atomic64_lock::lock as raw
locking, x86, iommu: Annotate qi->q_lock as raw
locking, x86, iommu: Annotate irq_2_ir_lock as raw
locking, x86, iommu: Annotate iommu->register_lock as raw
locking, dma, ipu: Annotate bank_lock as raw
locking, ARM: Annotate low level hw locks as raw
locking, drivers/dca: Annotate dca_lock as raw
locking, powerpc: Annotate uic->lock as raw
locking, x86: mce: Annotate cmci_discover_lock as raw
locking, ACPI: Annotate c3_lock as raw
locking, oprofile: Annotate oprofilefs lock as raw
locking, video: Annotate vga console lock as raw
locking, latencytop: Annotate latency_lock as raw
locking, timer_stats: Annotate table_lock as raw
locking, rwsem: Annotate inner lock as raw
locking, semaphores: Annotate inner lock as raw
locking, sched: Annotate thread_group_cputimer as raw
...
Fix up conflicts in kernel/posix-cpu-timers.c manually: making
cputimer->cputime a raw lock conflicted with the ABBA fix in commit
bcd5cff721 ("cputimer: Cure lock inversion").
The problem is related to the early enabling of interrupts and the
per cpu timer setup before the cpu is marked online. This doesn't
need to be done in order to call calibrate_delay().
calibrate_delay() monitors jiffies, which are updated from the CPU
which is waiting for the new CPU to set the online bit.
So simply calibrate_delay() can be called on the new CPU just from
the interrupt disabled region and move the local timer setup after
stored the cpu data and before enabling interrupts.
This solves both the cpu_online vs. cpu_active problem and the
affinity setting of the per cpu timers.
Signed-off-by: Thomas Gleinxer <tglx@linutronix.de>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch remove the hardcoded link between local timers and PPIs,
and convert the PPI users (TWD, MCT and MSM timers) to the new
*_percpu_irq interface. Also some collateral cleanup
(local_timer_ack() is gone, and the interrupt handler is strictly
private to each driver).
PPIs are now useable for more than just the local timers.
Additional testing by David Brown (msm8250 and msm8660) and
Shawn Guo (imx6q).
Cc: David Brown <davidb@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Brown <davidb@codeaurora.org>
Tested-by: David Brown <davidb@codeaurora.org>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
PPI handling is a bit of an odd beast. It uses its own low level
handling code and is hardwired to the local timers (hence lacking
a registration interface).
Instead, switch the low handling to the normal SPI handling code.
PPIs are handled by the handle_percpu_devid_irq flow.
This also allows the removal of some duplicated code.
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: David Brown <davidb@codeaurora.org>
Cc: Bryan Huntsman <bryanh@codeaurora.org>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Brown <davidb@codeaurora.org>
Tested-by: David Brown <davidb@codeaurora.org>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The definition of __exception_irq_entry for
CONFIG_FUNCTION_GRAPH_TRACER=y needs linux/ftrace.h, but this creates a
circular dependency with it's current home in asm/system.h. Create
asm/exception.h and update all current users.
v4: - rebase to rmk/for-next
v3: - remove redundant includes of linux/ftrace.h
v2: - document the usage restricitions of __exception*
Cc: Zoltan Devai <zdevai@gmail.com>
Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
In order to be able to handle localtimer directly from C code instead of
assembly code, introduce handle_local_timer(), which is modeled after
handle_IRQ().
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
In order to be able to handle IPI directly from C code instead of
assembly code, introduce handle_IPI(), which is modeled after handle_IRQ().
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
To allow booting Linux on a CPU with physical ID != 0, we need to
provide a mapping from the logical CPU number to the physical CPU
number.
This patch adds such a mapping and populates it during boot.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The affinity between ARM processors is defined in the MPIDR register.
We can identify which processors are in the same cluster,
and which ones have performance interdependency. We can define the
cpu topology of ARM platform, that is then used by sched_mc and sched_smt.
The default state of sched_mc and sched_smt config is disable.
When enabled, the behavior of the scheduler can be modified with
sched_mc_power_savings and sched_smt_power_savings sysfs interfaces.
Changes since v4 :
* Remove unnecessary parentheses and blank lines
Changes since v3 :
* Update the format of printk message
* Remove blank line
Changes since v2 :
* Update the commit message and some comments
Changes since v1 :
* Update the commit message
* Add read_cpuid_mpidr in arch/arm/include/asm/cputype.h
* Modify header of arch/arm/kernel/topology.c
* Modify tests and manipulation of MPIDR's bitfields
* Modify the place and dependancy of the config
* Modify Noop functions
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Annotate the low level hardware locks which must not be preempted.
In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>
Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If an ARM system has multiple cpus in the same socket and the
kernel is booted with maxcpus=1, secondary cpus are possible but
not present due to how platform_smp_prepare_cpus() is called.
Since most typical ARM processors don't actually support physical
hotplug, initialize the present map to be equal to the possible
map in generic ARM SMP code. Also, always call
platform_smp_prepare_cpus() as long as max_cpus is non-zero (0
means no SMP) to allow platform code to do any SMP setup.
After applying this patch it's possible to boot an ARM system
with maxcpus=1 on the command line and then hotplug in secondary
cpus via sysfs. This is more in line with how x86 does things.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: David Brown <davidb@codeaurora.org>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Cc: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When we bring a CPU online, we should wait for it to become active
before entering the idle thread, so we know that the scheduler and
thread migration is going to work.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch makes TTBR1 point to swapper_pg_dir so that global, kernel
mappings can be used exclusively on v6 and v7 cores where they are
needed.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rather than having each platform class provide a mach/smp.h header for
smp_cross_call(), arrange for them to register the function with the
core ARM SMP code instead.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This function is only called by percpu_timer_setup() which is
also __cpuinit marked. Thus it's safe to mark this function as
__cpuinit as well.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
For future rework of try_to_wake_up() we'd like to push part of that
function onto the CPU the task is actually going to run on.
In order to do so we need a generic callback from the existing scheduler IPI.
This patch introduces such a generic callback: scheduler_ipi() and
implements it as a NOP.
BenH notes: PowerPC might use this IPI on offline CPUs under rare conditions!
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152728.744338123@chello.nl
The current code support of dummy timers in absence of local
timer is compile time. This is an attempt to convert it to runtime
so that on few SOC version if the local timers aren't supported
kernel can switch to dummy timers. OMAP4430 ES1.0 does suffer from
this limitation.
This patch should not have any functional impact on affected
files.
Cc: Daniel Walker <dwalker@codeaurora.org>
Cc: Bryan Huntsman <bryanh@codeaurora.org>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Cc: Colin Cross <ccross@android.com>
Cc: Erik Gilling <konkers@android.com>
Cc: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com>
Cc: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: David Brown <davidb@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We have two places where we create identity mappings - one when we bring
secondary CPUs online, and one where we setup some mappings for soft-
reboot. Combine these two into a single implementation. Also collect
the identity mapping deletion function.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The MMU is always configured to read page tables from the L2 cache
so there's little point flushing them out of the L2 cache back to
RAM. Remove these flushes.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When we soft-CPU hotplug a CPU, we reset the stack pointer and
jump back to start_secondary(). This allows us to restart as if
the CPU was actually reset.
However, we weren't resetting the frame pointer, which could cause
problems with backtracing. Reset the frame pointer to zero (which
means no parent frame) just like the early assembly code also does.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
smp.c is becoming too large, so split out the TLB maintainence
broadcasting into a separate smp_tlb.c file.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When a CPU is hot unplugged, the generic tick code cleans up the
clock event device, but fails to call down to the device's set_mode
function to actually shut the device down.
To work around this, we've historically had a local_timer_stop()
callback out of the hotplug code. However, this adds needless
complexity when we have the clock event device itself available.
Explicitly call the clock event device's set_mode function with
CLOCK_EVT_MODE_UNUSED, so that the hardware can be cleanly shutdown
without any special external callbacks. When/if the generic code
is fixed, percpu_timer_stop() can be killed off.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We used to print a bland error message which gave no clue as to the
failure when we failed to bring up a secondary CPU. Resolve this by
separating the two failure cases.
If boot_secondary() fails, we print a message indicating the returned
error code from boot_secondary():
"CPU%u: failed to boot: %d\n", cpu, ret.
However, if boot_secondary() succeeded, but the CPU did not appear to
mark itself online within the timeout, indicate that it failed to come
online:
"CPU%u: failed to come online\n", cpu
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Don't call idle_task_exit() with interrupts disabled, and ensure
that we have a memory barrier after interrupts are disabled but
before signalling that this CPU has shut down.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We always need to wait for the dying CPU to reach a safe state before
taking it down, irrespective of the requirements of the platform.
Move the completion code into the ARM SMP hotplug code rather than
having each platform re-implement this.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
All platforms call trace_hardirqs_off() in their secondary startup code,
so move this into the core SMP code - it doesn't need to be in the
per-platform code.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There is a certain amount of smp_prepare_cpus() which doesn't belong
in the platform support code - that is, code which is invariant to the
SMP implementation. Move this code into arch/arm/kernel/smp.c, and
add a platform_ prefix to the original function.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Wait for CPUs to indicate that they've stopped, after sending the
stop IPI, rather than blindly continuing on and hoping that they've
stopped in time. Print a warning if we fail to stop the other CPUs.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The IPI and local timer interrupts weren't being properly accounted
for in /proc/stat. Collect them from the irq_stat structure, and
return their sum.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This separates out the individual IPI interrupt counts from the
total IPI count, which allows better visibility of what IPIs are
being used for.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As per x86, align the initial column according to how many IRQs we
have. Also, provide an english explaination for the 'LOC:' and
'IPI:' lines.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Move the ipi_count into irq_stat, which allows the ipi_data structure
to be entirely removed.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Provide __inc_irq_stat() and __get_irq_stat() to increment and
read the irq stat counters.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
send_ipi_message() does nothing except call smp_cross_call(). As
this is a static function, nothing external to this file calls it,
so we can easily clean up this now unnecessary indirection.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>