linux/kernel/sched
Josh Poimboeuf eda8dca519 sched/debug: Fix deadlock when enabling sched events
I see a hang when enabling sched events:

  echo 1 > /sys/kernel/debug/tracing/events/sched/enable

The printk buffer shows:

  BUG: spinlock recursion on CPU#1, swapper/1/0
   lock: 0xffff88007d5d8c00, .magic: dead4ead, .owner: swapper/1/0, .owner_cpu: 1
  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.7.0-rc2+ #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
  ...
  Call Trace:
   <IRQ>  [<ffffffff8143d663>] dump_stack+0x85/0xc2
   [<ffffffff81115948>] spin_dump+0x78/0xc0
   [<ffffffff81115aea>] do_raw_spin_lock+0x11a/0x150
   [<ffffffff81891471>] _raw_spin_lock+0x61/0x80
   [<ffffffff810e5466>] ? try_to_wake_up+0x256/0x4e0
   [<ffffffff810e5466>] try_to_wake_up+0x256/0x4e0
   [<ffffffff81891a0a>] ? _raw_spin_unlock_irqrestore+0x4a/0x80
   [<ffffffff810e5705>] wake_up_process+0x15/0x20
   [<ffffffff810cebb4>] insert_work+0x84/0xc0
   [<ffffffff810ced7f>] __queue_work+0x18f/0x660
   [<ffffffff810cf9a6>] queue_work_on+0x46/0x90
   [<ffffffffa00cd95b>] drm_fb_helper_dirty.isra.11+0xcb/0xe0 [drm_kms_helper]
   [<ffffffffa00cdac0>] drm_fb_helper_sys_imageblit+0x30/0x40 [drm_kms_helper]
   [<ffffffff814babcd>] soft_cursor+0x1ad/0x230
   [<ffffffff814ba379>] bit_cursor+0x649/0x680
   [<ffffffff814b9d30>] ? update_attr.isra.2+0x90/0x90
   [<ffffffff814b5e6a>] fbcon_cursor+0x14a/0x1c0
   [<ffffffff81555ef8>] hide_cursor+0x28/0x90
   [<ffffffff81558b6f>] vt_console_print+0x3bf/0x3f0
   [<ffffffff81122c63>] call_console_drivers.constprop.24+0x183/0x200
   [<ffffffff811241f4>] console_unlock+0x3d4/0x610
   [<ffffffff811247f5>] vprintk_emit+0x3c5/0x610
   [<ffffffff81124bc9>] vprintk_default+0x29/0x40
   [<ffffffff811e965b>] printk+0x57/0x73
   [<ffffffff810f7a9e>] enqueue_entity+0xc2e/0xc70
   [<ffffffff810f7b39>] enqueue_task_fair+0x59/0xab0
   [<ffffffff8106dcd9>] ? kvm_sched_clock_read+0x9/0x20
   [<ffffffff8103fb39>] ? sched_clock+0x9/0x10
   [<ffffffff810e3fcc>] activate_task+0x5c/0xa0
   [<ffffffff810e4514>] ttwu_do_activate+0x54/0xb0
   [<ffffffff810e5cea>] sched_ttwu_pending+0x7a/0xb0
   [<ffffffff810e5e51>] scheduler_ipi+0x61/0x170
   [<ffffffff81059e7f>] smp_trace_reschedule_interrupt+0x4f/0x2a0
   [<ffffffff81893ba6>] trace_reschedule_interrupt+0x96/0xa0
   <EOI>  [<ffffffff8106e0d6>] ? native_safe_halt+0x6/0x10
   [<ffffffff8110fb1d>] ? trace_hardirqs_on+0xd/0x10
   [<ffffffff81040ac0>] default_idle+0x20/0x1a0
   [<ffffffff8104147f>] arch_cpu_idle+0xf/0x20
   [<ffffffff81102f8f>] default_idle_call+0x2f/0x50
   [<ffffffff8110332e>] cpu_startup_entry+0x37e/0x450
   [<ffffffff8105af70>] start_secondary+0x160/0x1a0

Note the hang only occurs when echoing the above from a physical serial
console, not from an ssh session.

The bug is caused by a deadlock where the task is trying to grab the rq
lock twice because printk()'s aren't safe in sched code.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: cb2517653f ("sched/debug: Make schedstats a runtime tunable that is disabled by default")
Link: http://lkml.kernel.org/r/20160613073209.gdvdybiruljbkn3p@treble
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-14 12:47:21 +02:00
..
auto_group.c sched/core: Move the sched_to_prio[] arrays out of line 2015-12-04 10:34:46 +01:00
auto_group.h sched, timer: Convert usages of ACCESS_ONCE() in the scheduler to READ_ONCE()/WRITE_ONCE() 2015-05-08 12:11:32 +02:00
clock.c sched/clock: Make local_clock()/cpu_clock() inline 2016-04-13 12:25:22 +02:00
completion.c sched/completion: Serialize completion_done() with complete() 2015-02-18 14:27:40 +01:00
core.c sched/fair: Fix post_init_entity_util_avg() serialization 2016-06-14 10:58:34 +02:00
cpuacct.c sched/cpuacct: Check for NULL when using task_pt_regs() 2016-04-13 13:22:37 +02:00
cpuacct.h sched/cpuacct: Simplify the cpuacct code 2016-03-21 11:00:28 +01:00
cpudeadline.c sched/core: Use tsk_cpus_allowed() instead of accessing ->cpus_allowed 2016-05-12 09:55:35 +02:00
cpudeadline.h sched/deadline: Unify dl_time_before() usage 2015-09-23 09:51:25 +02:00
cpufreq_schedutil.c cpufreq: schedutil: Improve prints messages with pr_fmt 2016-05-19 01:02:52 +02:00
cpufreq.c cpufreq: sched: Helpers to add and remove update_util hooks 2016-04-02 01:08:43 +02:00
cpupri.c sched/core: Use tsk_cpus_allowed() instead of accessing ->cpus_allowed 2016-05-12 09:55:35 +02:00
cpupri.h sched/cpupri: Remove unnecessary definitions in cpupri.h 2014-11-16 10:58:59 +01:00
cputime.c sched/cputime: Fix steal_account_process_tick() to always return jiffies 2016-03-08 12:24:56 +01:00
deadline.c sched/core: Provide a tsk_nr_cpus_allowed() helper 2016-05-12 09:55:36 +02:00
debug.c sched/debug: Fix /proc/sched_debug regression 2016-06-08 14:31:58 +02:00
fair.c sched/debug: Fix deadlock when enabling sched events 2016-06-14 12:47:21 +02:00
features.h sched/fair: Convert arch_scale_cpu_capacity() from weak function to #define 2015-09-13 09:52:55 +02:00
idle_task.c locking/lockdep, sched/core: Implement a better lock pinning scheme 2016-05-05 09:23:59 +02:00
idle.c cpuidle: Do not access cpuidle_devices when !CONFIG_CPU_IDLE 2016-06-02 23:05:27 +02:00
loadavg.c sched/loadavg: Fix loadavg artifacts on fully idle and on fully loaded systems 2016-05-12 09:55:34 +02:00
Makefile cpufreq: schedutil: New governor based on scheduler utilization data 2016-04-02 01:09:12 +02:00
rt.c sched/core: Provide a tsk_nr_cpus_allowed() helper 2016-05-12 09:55:36 +02:00
sched.h Power management material for v4.7-rc1 2016-05-16 19:17:22 -07:00
stats.c sched: use %*pb[l] to print bitmaps including cpumasks and nodemasks 2015-02-13 21:21:37 -08:00
stats.h sched/debug: Fix /proc/sched_debug regression 2016-06-08 14:31:58 +02:00
stop_task.c locking/lockdep, sched/core: Implement a better lock pinning scheme 2016-05-05 09:23:59 +02:00
swait.c wait.[ch]: Introduce the simple waitqueue (swait) implementation 2016-02-25 11:27:16 +01:00
wait.c sched/wait: Fix the signal handling fix 2015-12-13 14:30:59 -08:00