linux/include/trace/events
Thomas Gleixner f2530dc71c kthread: Prevent unpark race which puts threads on the wrong cpu
The smpboot threads rely on the park/unpark mechanism, which binds
per-CPU threads to a particular core. However, the mechanism is racy:

CPU0                    CPU1                    CPU2
unpark(T)                                       wake_up_process(T)
  clear(SHOULD_PARK)    T runs
                        leave parkme() due to !SHOULD_PARK
  bind_to(CPU2)         BUG_ON(wrong CPU)

We cannot let the tasks move themselves to the target CPU, because one
of those tasks is the migration thread itself, which must start running
on the target CPU right away.

The solution to this problem is to prevent wakeups in park mode which
are not from unpark(). That way we can guarantee that the association
of the task to the target cpu is working correctly.

Add a new task state (TASK_PARKED) which prevents other wakeups and
use this state explicitly for the unpark wakeup.
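
The state-masked wakeup described above can be illustrated with a small
userspace model. This is only a sketch, not the kernel code: the
struct, the model_* function names and the numeric state values are
made up for illustration, and the real scheduler does far more than
compare a mask. It shows the one property the fix relies on: an
ordinary wakeup does not match the parked state, while the unpark
wakeup binds first and then matches it explicitly.

```c
#include <stdbool.h>

/* Model states: names echo the kernel's, values are illustrative only. */
enum {
    MODEL_RUNNING         = 0x0,
    MODEL_INTERRUPTIBLE   = 0x1,
    MODEL_UNINTERRUPTIBLE = 0x2,
    MODEL_PARKED          = 0x4,
    /* Mask used by an ordinary wakeup; it deliberately omits PARKED. */
    MODEL_NORMAL          = MODEL_INTERRUPTIBLE | MODEL_UNINTERRUPTIBLE,
};

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct model_task {
    unsigned int state;  /* MODEL_RUNNING or one of the sleep states */
    int cpu;             /* CPU the task is bound to, -1 if unbound */
};

/* Wake the task only if its current state is in the caller's mask. */
static bool model_wake_up_state(struct model_task *t, unsigned int mask)
{
    if (!(t->state & mask))
        return false;
    t->state = MODEL_RUNNING;
    return true;
}

/* Ordinary wakeup, as issued by an unrelated CPU in the diagram. */
static bool model_wake_up_process(struct model_task *t)
{
    return model_wake_up_state(t, MODEL_NORMAL);
}

/* Unpark: bind first, then issue the only wakeup that matches PARKED. */
static bool model_unpark(struct model_task *t, int target_cpu)
{
    t->cpu = target_cpu;
    return model_wake_up_state(t, MODEL_PARKED);
}
```

In this model a stray model_wake_up_process() on a parked task returns
false and leaves it parked, so the task can only start running after
model_unpark() has set its CPU.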

Peter noticed: Also, since the task state is visible to userspace and
all the parked tasks are still in the PID space, it's a good hint in ps
and friends that these tasks aren't really there for the moment.

The migration thread has another related issue.

CPU0                     CPU1
Bring up CPU2
create_thread(T)
park(T)
 wait_for_completion()
                         parkme()
                         complete()
sched_set_stop_task()
                         schedule(TASK_PARKED)

The sched_set_stop_task() call is issued while the task is on the
runqueue of CPU1 and that confuses the hell out of the stop_task class
on that cpu. So we need the same synchronization before
sched_set_stop_task().
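
The ordering problem in the second diagram can be stepped through with
a single-threaded model. Everything here is hypothetical (the
step_task struct, the step_* helpers and the on_rq flag are
illustration only, and this message does not say which kernel
primitive the patch uses for the wait). The sketch only shows why
complete() alone is too early: the task signals completion while it is
still on the runqueue, and leaves it only in the following schedule().

```c
#include <stdbool.h>

enum { STEP_RUNNING, STEP_PARKED };   /* hypothetical model states */

struct step_task {
    int state;
    bool on_rq;        /* still queued on the creating CPU's runqueue? */
    bool completed;    /* has parkme() signalled complete()? */
    bool is_stop_task; /* set by the hand-over below */
};

/* parkme(), first half: mark parked and signal the waiter. The task is
 * still on the runqueue at this point. */
static void step_parkme_complete(struct step_task *t)
{
    t->state = STEP_PARKED;
    t->completed = true;
}

/* parkme(), second half: schedule() takes the task off the runqueue. */
static void step_schedule_out(struct step_task *t)
{
    t->on_rq = false;
}

/* The extra synchronization: completion alone is not enough, the task
 * must be parked and off the runqueue. */
static bool step_task_is_inactive(const struct step_task *t)
{
    return t->completed && t->state == STEP_PARKED && !t->on_rq;
}

/* Stand-in for sched_set_stop_task(): refuses a task that is still
 * queued instead of confusing the stop_task class. */
static bool step_set_stop_task(struct step_task *t)
{
    if (!step_task_is_inactive(t))
        return false;
    t->is_stop_task = true;
    return true;
}
```

Calling step_set_stop_task() between step_parkme_complete() and
step_schedule_out() is the window shown in the diagram; in the model it
is rejected, and it succeeds once the task has scheduled out.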

Reported-by: Dave Jones <davej@redhat.com>
Reported-and-tested-by: Dave Hansen <dave@sr71.net>
Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: dhillf@gmail.com
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304091635430.21884@ionos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-12 14:18:43 +02:00
..
9p.h net/9p: Convert net/9p protocol dumps to tracepoints 2011-10-24 11:13:12 -05:00
asoc.h ASoC: dapm: Fix x86_64 build warning. 2012-04-23 13:15:35 +01:00
block.h block: add block_{touch|dirty}_buffer tracepoint 2013-01-14 15:00:36 +01:00
btrfs.h Btrfs: parse parent 0 into correct value in tracepoint 2012-12-16 20:46:18 -05:00
compaction.h UAPI: (Scripted) Convert #include "..." to #include <path/...> in kernel system headers 2012-10-02 18:01:25 +01:00
ext3.h userns: Convert ext3 to use kuid/kgid where appropriate 2012-05-15 14:59:27 -07:00
ext4.h ext4: optimize ext4_es_shrink() 2013-02-28 23:58:56 -05:00
gfpflags.h mm: add a __GFP_KMEMCG flag 2012-12-18 15:02:12 -08:00
gpio.h gpio: add trace events for setting direction and value 2011-05-20 00:40:19 -06:00
irq.h rcu: Use softirq to address performance regression 2011-06-14 15:25:39 -07:00
jbd2.h jbd2: add tracepoints which provide per-handle statistics 2013-02-08 13:00:22 -05:00
jbd.h jbd: Write journal superblock with WRITE_FUA after checkpointing 2012-05-15 23:34:37 +02:00
kmem.h UAPI: (Scripted) Convert #include "..." to #include <path/...> in kernel system headers 2012-10-02 18:01:25 +01:00
kvm.h KVM: trace: Fix exit decoding. 2013-01-10 15:51:11 -02:00
lock.h tracing: Factorize lock events in a lock class 2010-05-09 13:45:35 +02:00
mce.h tracing: Fix event alignment: mce:mce_record 2011-03-10 10:34:28 -05:00
migrate.h mm: migrate: Add a tracepoint for migrate_pages 2012-12-11 14:28:35 +00:00
module.h include: replace linux/module.h with "struct module" wherever possible 2011-10-31 19:32:32 -04:00
napi.h napi: Convert trace_napi_poll to TRACE_EVENT 2010-09-07 17:51:01 +02:00
net.h net: tracepoint of net_dev_xmit sees freed skb and causes panic 2011-06-02 14:06:31 -07:00
oom.h mm, oom: change type of oom_score_adj to short 2012-12-11 17:22:27 -08:00
power.h PM / tracing: remove deprecated power trace API 2013-01-26 00:39:12 +01:00
printk.h printk/tracing: Add console output tracing 2012-02-13 13:46:05 -05:00
random.h random: add tracepoints for easier debugging and verification 2012-07-14 20:17:48 -04:00
ras.h aerdrv: Trace Event for PCI Express Advanced Error Reporting 2013-01-03 14:31:44 -08:00
rcu.h Merge branches 'doctorture.2013.01.29a', 'fixes.2013.01.26a', 'tagcb.2013.01.24a' and 'tiny.2013.01.29b' into HEAD 2013-01-28 22:25:21 -08:00
regmap.h The following text was taken from the original review request: 2012-03-24 10:41:37 -07:00
regulator.h regulator: Add basic trace facilities 2011-01-12 14:33:00 +00:00
rpm.h device.h: audit and cleanup users in main include dir 2012-03-16 10:38:24 -04:00
sched.h kthread: Prevent unpark race which puts threads on the wrong cpu 2013-04-12 14:18:43 +02:00
scsi.h [SCSI] Include protection operation in SCSI command trace 2011-03-14 18:36:02 -05:00
signal.h tracing: let trace_signal_generate() report more info, kill overflow_fail/lose_info 2012-01-13 18:48:50 +01:00
skb.h tracing: Fix event alignment: skb:kfree_skb 2011-03-10 10:34:31 -05:00
sock.h core: add tracepoints for queueing skb to rcvbuf 2011-06-21 16:06:10 -07:00
sunrpc.h SUNRPC: Adding status trace points 2012-02-06 10:37:53 -05:00
syscalls.h tracing: Allow raw syscall trace events for non privileged users 2010-11-18 14:37:43 +01:00
task.h mm, oom: change type of oom_score_adj to short 2012-12-11 17:22:27 -08:00
timer.h tracing: Fix timer tracing 2010-08-19 13:00:41 +02:00
udp.h udp: add tracepoints for queueing skb to rcvbuf 2011-06-21 16:06:10 -07:00
vmscan.h UAPI: (Scripted) Convert #include "..." to #include <path/...> in kernel system headers 2012-10-02 18:01:25 +01:00
workqueue.h workqueue: rename cpu_workqueue to pool_workqueue 2013-02-13 19:29:12 -08:00
writeback.h writeback: add more tracepoints 2013-01-14 15:00:36 +01:00
xen.h xen/mmu: Use Xen specific TLB flush instead of the generic one. 2012-10-31 12:38:31 -04:00