linux/drivers
Srivatsa S. Bhat e8d05276f2 cpufreq: Revert commit 2f7021a8 to fix CPU hotplug regression
commit 2f7021a8 "cpufreq: protect 'policy->cpus' from offlining
during __gov_queue_work()" caused a regression in CPU hotplug,
because it lead to a deadlock between cpufreq governor worker thread
and the CPU hotplug writer task.

Lockdep splat corresponding to this deadlock is shown below:

[   60.277396] ======================================================
[   60.277400] [ INFO: possible circular locking dependency detected ]
[   60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted
[   60.277411] -------------------------------------------------------
[   60.277417] bash/2225 is trying to acquire lock:
[   60.277422]  ((&(&j_cdbs->work)->work)){+.+...}, at: [<ffffffff810621b5>] flush_work+0x5/0x280
[   60.277444] but task is already holding lock:
[   60.277449]  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
[   60.277465] which lock already depends on the new lock.

[   60.277472] the existing dependency chain (in reverse order) is:
[   60.277477] -> #2 (cpu_hotplug.lock){+.+.+.}:
[   60.277490]        [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[   60.277503]        [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
[   60.277514]        [<ffffffff81042cbc>] get_online_cpus+0x3c/0x60
[   60.277522]        [<ffffffff814b842a>] gov_queue_work+0x2a/0xb0
[   60.277532]        [<ffffffff814b7891>] cs_dbs_timer+0xc1/0xe0
[   60.277543]        [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
[   60.277552]        [<ffffffff81063d31>] worker_thread+0x121/0x3a0
[   60.277560]        [<ffffffff8106ae2b>] kthread+0xdb/0xe0
[   60.277569]        [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
[   60.277580] -> #1 (&j_cdbs->timer_mutex){+.+...}:
[   60.277592]        [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[   60.277600]        [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
[   60.277608]        [<ffffffff814b785d>] cs_dbs_timer+0x8d/0xe0
[   60.277616]        [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
[   60.277624]        [<ffffffff81063d31>] worker_thread+0x121/0x3a0
[   60.277633]        [<ffffffff8106ae2b>] kthread+0xdb/0xe0
[   60.277640]        [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
[   60.277649] -> #0 ((&(&j_cdbs->work)->work)){+.+...}:
[   60.277661]        [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
[   60.277669]        [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[   60.277677]        [<ffffffff810621ed>] flush_work+0x3d/0x280
[   60.277685]        [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
[   60.277693]        [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
[   60.277701]        [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
[   60.277709]        [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
[   60.277719]        [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
[   60.277728]        [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
[   60.277737]        [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
[   60.277747]        [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
[   60.277759]        [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
[   60.277768]        [<ffffffff815a0a68>] _cpu_down+0x88/0x330
[   60.277779]        [<ffffffff815a0d46>] cpu_down+0x36/0x50
[   60.277788]        [<ffffffff815a2748>] store_online+0x98/0xd0
[   60.277796]        [<ffffffff81452a28>] dev_attr_store+0x18/0x30
[   60.277806]        [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
[   60.277818]        [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
[   60.277826]        [<ffffffff811686fc>] SyS_write+0x4c/0xa0
[   60.277834]        [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
[   60.277842] other info that might help us debug this:

[   60.277848] Chain exists of:
  (&(&j_cdbs->work)->work) --> &j_cdbs->timer_mutex --> cpu_hotplug.lock

[   60.277864]  Possible unsafe locking scenario:

[   60.277869]        CPU0                    CPU1
[   60.277873]        ----                    ----
[   60.277877]   lock(cpu_hotplug.lock);
[   60.277885]                                lock(&j_cdbs->timer_mutex);
[   60.277892]                                lock(cpu_hotplug.lock);
[   60.277900]   lock((&(&j_cdbs->work)->work));
[   60.277907]  *** DEADLOCK ***

[   60.277915] 6 locks held by bash/2225:
[   60.277919]  #0:  (sb_writers#6){.+.+.+}, at: [<ffffffff81168173>] vfs_write+0x1c3/0x1f0
[   60.277937]  #1:  (&buffer->mutex){+.+.+.}, at: [<ffffffff811d9e3c>] sysfs_write_file+0x3c/0x150
[   60.277954]  #2:  (s_active#61){.+.+.+}, at: [<ffffffff811d9ec3>] sysfs_write_file+0xc3/0x150
[   60.277972]  #3:  (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81024cf7>] cpu_hotplug_driver_lock+0x17/0x20
[   60.277990]  #4:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff815a0d32>] cpu_down+0x22/0x50
[   60.278007]  #5:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
[   60.278023] stack backtrace:
[   60.278031] CPU: 3 PID: 2225 Comm: bash Not tainted 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744
[   60.278037] Hardware name: Acer             Aspire 5741G    /Aspire 5741G    , BIOS V1.20 02/08/2011
[   60.278042]  ffffffff8204e110 ffff88014df6b9f8 ffffffff815b3d90 ffff88014df6ba38
[   60.278055]  ffffffff815b0a8d ffff880150ed3f60 ffff880150ed4770 3871c4002c8980b2
[   60.278068]  ffff880150ed4748 ffff880150ed4770 ffff880150ed3f60 ffff88014df6bb00
[   60.278081] Call Trace:
[   60.278091]  [<ffffffff815b3d90>] dump_stack+0x19/0x1b
[   60.278101]  [<ffffffff815b0a8d>] print_circular_bug+0x2b6/0x2c5
[   60.278111]  [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
[   60.278123]  [<ffffffff81067e08>] ? __kernel_text_address+0x58/0x80
[   60.278134]  [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[   60.278142]  [<ffffffff810621b5>] ? flush_work+0x5/0x280
[   60.278151]  [<ffffffff810621ed>] flush_work+0x3d/0x280
[   60.278159]  [<ffffffff810621b5>] ? flush_work+0x5/0x280
[   60.278169]  [<ffffffff810a9b14>] ? mark_held_locks+0x94/0x140
[   60.278178]  [<ffffffff81062d77>] ? __cancel_work_timer+0x77/0x120
[   60.278188]  [<ffffffff810a9cbd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[   60.278196]  [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
[   60.278206]  [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
[   60.278214]  [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
[   60.278225]  [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
[   60.278234]  [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
[   60.278244]  [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
[   60.278255]  [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
[   60.278265]  [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
[   60.278275]  [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
[   60.278284]  [<ffffffff815a0a68>] _cpu_down+0x88/0x330
[   60.278292]  [<ffffffff81024cf7>] ? cpu_hotplug_driver_lock+0x17/0x20
[   60.278302]  [<ffffffff815a0d46>] cpu_down+0x36/0x50
[   60.278311]  [<ffffffff815a2748>] store_online+0x98/0xd0
[   60.278320]  [<ffffffff81452a28>] dev_attr_store+0x18/0x30
[   60.278329]  [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
[   60.278337]  [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
[   60.278347]  [<ffffffff81185950>] ? fget_light+0x320/0x4b0
[   60.278355]  [<ffffffff811686fc>] SyS_write+0x4c/0xa0
[   60.278364]  [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
[   60.280582] smpboot: CPU 1 is now offline

The intention of that commit was to avoid warnings during CPU
hotplug, which indicated that offline CPUs were getting IPIs from the
cpufreq governor's work items.  But the real root-cause of that
problem was commit a66b2e5 (cpufreq: Preserve sysfs files across
suspend/resume) because it totally skipped all the cpufreq callbacks
during CPU hotplug in the suspend/resume path, and hence it never
actually shut down the cpufreq governor's worker threads during CPU
offline in the suspend/resume path.

Reflecting back, the reason why we never suspected that commit as the
root-cause earlier, was that the original issue was reported with
just the halt command and nobody had brought in suspend/resume to the
equation.

The reason for _that_ in turn, as it turns out, is that earlier
halt/shutdown was being done by disabling non-boot CPUs while tasks
were frozen, just like suspend/resume....  but commit cf7df378a
(reboot: migrate shutdown/reboot to boot cpu) which came somewhere
along that very same time changed that logic: shutdown/halt no longer
takes CPUs offline.  Thus, the test-cases for reproducing the bug
were vastly different and thus we went totally off the trail.

Overall, it was one hell of a confusion with so many commits
affecting each other and also affecting the symptoms of the problems
in subtle ways.  Finally, now since the original problematic commit
(a66b2e5) has been completely reverted, revert this intermediate fix
too (2f7021a8), to fix the CPU hotplug deadlock.  Phew!

Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: Peter Wu <lekensteyn@gmail.com>
Cc: 3.10+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-07-16 22:46:48 +02:00
..
accessibility
acpi More power management and ACPI updates for 3.11-rc1 2013-07-11 12:28:17 -07:00
amba
ata Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2013-07-13 14:52:21 -07:00
atm
auxdisplay
base Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media 2013-07-13 12:09:57 -07:00
bcma bcma: add support for BCM43142 2013-06-27 13:42:16 -04:00
block Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze 2013-07-10 10:16:07 -07:00
bluetooth
bus ARM SoC device tree changes 2013-07-02 14:23:01 -07:00
cdrom drivers/cdrom/cdrom.c: use kzalloc() for failing hardware 2013-07-03 16:07:25 -07:00
char Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2013-07-13 14:52:21 -07:00
clk Power management and ACPI updates for 3.11-rc1 2013-07-03 14:35:40 -07:00
clocksource Merge branch 'timers/clockevents' of git://git.linaro.org/people/dlezcano/clockevents into timers/urgent 2013-07-12 17:10:30 +02:00
connector
cpufreq cpufreq: Revert commit 2f7021a8 to fix CPU hotplug regression 2013-07-16 22:46:48 +02:00
cpuidle Power management and ACPI updates for 3.11-rc1 2013-07-03 14:35:40 -07:00
crypto crypto: talitos: use sg_pcopy_to_buffer() 2013-07-09 10:33:30 -07:00
dca
devfreq Merge branch 'akpm' (updates from Andrew Morton) 2013-07-03 17:12:13 -07:00
dio
dma drivers/dma/iop-adma.c: fix new warnings 2013-07-09 10:33:19 -07:00
edac Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2013-07-13 14:52:21 -07:00
eisa
extcon drivers: avoid format string in dev_set_name 2013-07-03 16:07:41 -07:00
firewire
firmware Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc 2013-07-04 10:29:23 -07:00
fmc
gpio Power management and ACPI updates for 3.11-rc1 2013-07-03 14:35:40 -07:00
gpu drm: avoid warning in drm_load_edid_firmware() 2013-07-10 14:21:46 -07:00
hid Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2013-07-04 15:35:08 -07:00
hsi drivers: avoid format string in dev_set_name 2013-07-03 16:07:41 -07:00
hv
hwmon hwmon: (lm63) Drop redundant safety on cache lifetime 2013-07-08 14:18:24 +02:00
hwspinlock
i2c Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2013-07-13 14:52:21 -07:00
ide Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide 2013-07-10 18:15:41 -07:00
idle
iio For the 3.11 merge we only have one new MFD driver for the Kontron PLD. 2013-07-10 11:10:27 -07:00
infiniband Main batch of InfiniBand/RDMA changes for 3.11 merge window: 2013-07-13 12:57:21 -07:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2013-07-13 18:05:13 -07:00
iommu IOMMU Updates for Linux 3.11 2013-07-10 14:46:40 -07:00
ipack
irqchip Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-07-13 15:37:30 -07:00
isdn Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-07-09 18:24:39 -07:00
leds leds: mc13783: Fix "uninitialized variable" warning 2013-07-02 08:44:02 -07:00
lguest Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2013-07-04 11:40:58 -07:00
macintosh macintosh/windfarm: Remove obsolete cleanup for clientdata 2013-07-01 11:46:56 +10:00
mailbox
md Add a device-mapper target called dm-switch to provide a multipath 2013-07-11 13:05:40 -07:00
media Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media 2013-07-13 12:09:57 -07:00
memory
memstick drivers/memstick/host/r592.c: convert to module_pci_driver 2013-07-03 16:08:06 -07:00
message drivers: avoid format strings in names passed to alloc_workqueue() 2013-07-03 16:07:41 -07:00
mfd For the 3.11 merge we only have one new MFD driver for the Kontron PLD. 2013-07-10 11:10:27 -07:00
misc Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2013-07-04 11:40:58 -07:00
mmc MMC highlights for 3.11: 2013-07-10 11:16:00 -07:00
mtd A couple of fixes and clean-ups, allow for assigning user-defined 2013-07-05 12:09:48 -07:00
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-07-13 17:42:22 -07:00
nfc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-07-09 18:24:39 -07:00
ntb
nubus
of Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-07-09 18:24:39 -07:00
oprofile
parisc parisc: fix LMMIO mismatch between PAT length and MASK register 2013-07-09 22:09:16 +02:00
parport Merge branch 'akpm' (updates from Andrew Morton) 2013-07-03 17:12:13 -07:00
pci Merge branch 'akpm' (updates from Andrew Morton) 2013-07-03 17:12:13 -07:00
pcmcia Driver core patches for 3.11-rc1 2013-07-02 11:44:19 -07:00
pinctrl Pin control changes for the v3.11 kernel cycle: 2013-07-03 11:48:03 -07:00
platform x86 platform drivers: fix gpio leak 2013-07-10 15:42:51 -04:00
pnp Power management and ACPI updates for 3.11-rc1 2013-07-03 14:35:40 -07:00
power Nothing exciting this time, just assorted fixes and cleanups. 2013-07-10 11:13:00 -07:00
pps pps-gpio: add device-tree binding and support 2013-07-03 16:08:06 -07:00
ps3
ptp
pwm
rapidio Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2013-07-13 14:52:21 -07:00
regulator For the 3.11 merge we only have one new MFD driver for the Kontron PLD. 2013-07-10 11:10:27 -07:00
remoteproc Trivial remoteproc fixes by Suman Anna, Wei Yongjun and Thomas Meyer. 2013-07-11 12:35:09 -07:00
reset
rpmsg
rtc For the 3.11 merge we only have one new MFD driver for the Kontron PLD. 2013-07-10 11:10:27 -07:00
s390 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-07-09 18:24:39 -07:00
sbus
scsi SCSI for-linus on 20130713 2013-07-13 17:41:21 -07:00
sfi
sh Merge branch 'pm-assorted' 2013-06-28 13:01:40 +02:00
sn
spi Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2013-07-13 14:52:21 -07:00
ssb Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2013-07-13 14:52:21 -07:00
staging Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2013-07-13 14:52:21 -07:00
target Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2013-07-11 12:57:19 -07:00
tc
thermal Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux 2013-07-11 12:26:08 -07:00
tty Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2013-07-13 14:52:21 -07:00
uio uio: use vma_pages() to replace (vm_end - vm_start) >> PAGE_SHIFT 2013-07-03 16:07:26 -07:00
usb Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2013-07-13 14:52:21 -07:00
uwb drivers: avoid format string in dev_set_name 2013-07-03 16:07:41 -07:00
vfio vfio Updates for v3.11 2013-07-10 14:50:08 -07:00
vhost Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2013-07-11 12:57:19 -07:00
video Linux 3.11-rc1 2013-07-14 15:18:27 -07:00
virt
virtio No real surprises. 2013-07-10 14:50:58 -07:00
vlynq
vme
w1 drivers/w1/slaves/w1_ds2408.c: add magic sequence to disable P0 test mode 2013-07-03 16:08:06 -07:00
watchdog Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2013-07-13 14:52:21 -07:00
xen Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-07-06 14:09:38 -07:00
zorro zorro: switch to fixed_size_llseek() 2013-06-29 12:57:28 +04:00
Kconfig For the 3.11 merge we only have one new MFD driver for the Kontron PLD. 2013-07-10 11:10:27 -07:00
Makefile For the 3.11 merge we only have one new MFD driver for the Kontron PLD. 2013-07-10 11:10:27 -07:00