This problem was introduced in 72961ecf84
since no space was reserved for the new attributes NFULA_HWTYPE,
NFULA_HWLEN and NFULA_HWHEADER.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The function dl_seq_show() returns 1 (equal to SEQ_SKIP) when
a seq_printf() call returns -1. It should return -1 instead.
The SEQ_SKIP behavior breaks processing of the proc file, e.g. via a
pipe or just through less.
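As a rough user-space illustration of the intended return path (not the
actual patch), the sketch below propagates a failed print as -1 instead
of 1. struct mock_seq, mock_seq_printf() and mock_show() are hypothetical
stand-ins invented for this example; only the "0 on success, -1 on
overflow" contract is taken from the description above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct seq_file. */
struct mock_seq {
    char buf[32];
    size_t count;
};

/* Mimics the contract described above: 0 on success, -1 on overflow. */
static int mock_seq_printf(struct mock_seq *m, const char *fmt, ...)
{
    va_list ap;
    size_t room = sizeof(m->buf) - m->count;
    int len;

    va_start(ap, fmt);
    len = vsnprintf(m->buf + m->count, room, fmt, ap);
    va_end(ap);

    if (len < 0 || (size_t)len >= room) {
        m->count = sizeof(m->buf);   /* mark the buffer as full */
        return -1;
    }
    m->count += (size_t)len;
    return 0;
}

/* A show callback that propagates the failure instead of returning 1. */
static int mock_show(struct mock_seq *m, const char *name)
{
    if (mock_seq_printf(m, "%s\n", name))
        return -1;
    return 0;
}
```

With a 32-byte buffer, a short name is accepted and an over-long one
makes mock_show() return -1 rather than a value the caller could
mistake for SEQ_SKIP.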
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: Patrick McHardy <kaber@trash.net>
platform_data != driver_data
driver_data is actually the "correct" place for the struct; however, it is
not placed there because of the need for the ac97 struct. This is broken since
d9105c2b01 aka "[ARM] 5184/1: Split ucb1400_ts into core and touchscreen"
Signed-off-by: Manuel Traut <manut@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
A recent patch to raid5.c uses min on an int and a sector_t.
This isn't allowed.
So change it to min_t(sector_t,x,y).
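For illustration only, here is a simplified user-space min_t() that casts
both operands to one type before comparing. The sector_t typedef and the
clamp_to_remaining() helper are assumptions made for this sketch, and the
macro is deliberately simpler than the kernel's (it evaluates its
arguments more than once).

```c
#include <assert.h>

/* Assumption: sector_t modelled as a 64-bit unsigned type. */
typedef unsigned long long sector_t;

/*
 * Simplified min_t(): cast both operands to one type, then compare.
 * Unlike the kernel macro, this version evaluates its arguments twice.
 */
#define min_t(type, x, y) \
    ((type)(x) < (type)(y) ? (type)(x) : (type)(y))

/* Clamp a hypothetical int-sized request to the sectors remaining. */
static sector_t clamp_to_remaining(int request, sector_t remaining)
{
    /* A negative int would turn into a huge value after the cast. */
    assert(request >= 0);
    return min_t(sector_t, request, remaining);
}
```

Naming the type explicitly makes the int operand widen to sector_t in a
visible way instead of relying on implicit conversion.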
Signed-off-by: NeilBrown <neilb@suse.de>
Kernel 2.6.18 broke the MotU Fastlane, which uses duplicate endpoint
numbers in a manner that is not only illegal but also confuses the
kernel's endpoint descriptor caching mechanism. To work around this, we
have to add a separate usb_set_interface() call to guide the USB core to
the correct descriptors.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Reported-and-tested-by: David Fries <david@fries.net>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The PCM hw_ptr jiffies check sometimes causes problems when the
hardware doesn't give smooth hw_ptr updates. So far, au88x0 and some
other drivers appear not to work because of this strict check.
However, the check is a nice debug tool, and the capability should
still be kept.
Hence, we now disable this check by default unless the user enables it
by setting the xrun_debug mode for the specific stream via a proc file.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
An oops can occur if a user attempts to use both PCI logical
hotplug and the ACPI physical hotplug driver (acpiphp) in this
sequence, where $slot/address == $device.
In other words, if acpiphp has claimed a PCI device, and that
device is logically removed, then acpiphp may oops when it
attempts to access it again.
# echo 1 > /sys/bus/pci/devices/$device/remove
# echo 0 > /sys/bus/pci/slots/$slot/power
Unable to handle kernel NULL pointer dereference (address 0000000000000000)
Call Trace:
[<a000000100016390>] show_stack+0x50/0xa0
[<a000000100016c60>] show_regs+0x820/0x860
[<a00000010003b390>] die+0x190/0x2a0
[<a000000100066a40>] ia64_do_page_fault+0x8e0/0xa40
[<a00000010000c7a0>] ia64_native_leave_kernel+0x0/0x270
[<a0000001003b2660>] pci_remove_bus_device+0x120/0x260
[<a0000002060549f0>] acpiphp_disable_slot+0x410/0x540 [acpiphp]
[<a0000002060505c0>] disable_slot+0xc0/0x120 [acpiphp]
[<a0000002040d21c0>] power_write_file+0x1e0/0x2a0 [pci_hotplug]
[<a0000001003bb820>] pci_slot_attr_store+0x60/0xa0
[<a000000100240f70>] sysfs_write_file+0x230/0x2c0
[<a000000100195750>] vfs_write+0x190/0x2e0
[<a0000001001961a0>] sys_write+0x80/0x100
[<a00000010000c600>] ia64_ret_from_syscall+0x0/0x20
[<a000000000010720>] __kernel_syscall_via_break+0x0/0x20
The root cause of this oops is that the logical remove ("echo 1 >
/sys/bus/pci/devices/$device/remove") destroyed the pci_dev. The
pci_dev struct itself wasn't deallocated because acpiphp kept a
reference, but some of its fields became invalid.
acpiphp doesn't have any real reason to keep a pointer to a
pci_dev around. It can always derive it using pci_get_slot().
If a logical remove destroys the pci_dev, acpiphp won't find it
and is thus prevented from causing mischief.
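The idea of looking the device up on demand instead of caching a pointer
can be sketched in user space as follows. Everything here (struct
mock_pci_dev, mock_bus, mock_get_slot(), mock_logical_remove(),
mock_disable_slot()) is hypothetical and only models the behavior
described above; it is not the acpiphp code and ignores reference
counting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pci_dev on one bus. */
struct mock_pci_dev {
    unsigned int devfn;
    int present;   /* cleared by a logical remove */
};

static struct mock_pci_dev mock_bus[] = {
    { .devfn = 0, .present = 1 },
    { .devfn = 8, .present = 1 },
};

/* Look the device up on demand; NULL if it is gone. */
static struct mock_pci_dev *mock_get_slot(unsigned int devfn)
{
    size_t i;

    for (i = 0; i < sizeof(mock_bus) / sizeof(mock_bus[0]); i++)
        if (mock_bus[i].present && mock_bus[i].devfn == devfn)
            return &mock_bus[i];
    return NULL;
}

/* Models "echo 1 > .../remove". */
static void mock_logical_remove(unsigned int devfn)
{
    struct mock_pci_dev *dev = mock_get_slot(devfn);

    if (dev)
        dev->present = 0;
}

/* Hotplug path: nothing is cached, so a removed device is not found. */
static int mock_disable_slot(unsigned int devfn)
{
    struct mock_pci_dev *dev = mock_get_slot(devfn);

    if (!dev)
        return -1;
    dev->present = 0;
    return 0;
}
```

After a logical remove, the disable path gets NULL from the lookup and
returns an error instead of touching stale state.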
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reported-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The hw_ptr_jiffies has to be reset properly to avoid the invalid
check of jiffies delta in snd_pcm_update_hw_ptr*() functions.
Especially, this patch fixes the bogus jiffies check after pause
and resume.
This patch is a modified version of the original patch by Jaroslav.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The calls to flush_work() are pointless in a single thread workqueue
and they are actually causing a lockdep warning.
=============================================
[ INFO: possible recursive locking detected ]
2.6.30-rc6-02911-gbb803cf #16
---------------------------------------------
bluetooth/2518 is trying to acquire lock:
(bluetooth){+.+.+.}, at: [<c0130c14>] flush_work+0x28/0xb0
but task is already holding lock:
(bluetooth){+.+.+.}, at: [<c0130424>] worker_thread+0x149/0x25e
other info that might help us debug this:
2 locks held by bluetooth/2518:
#0: (bluetooth){+.+.+.}, at: [<c0130424>] worker_thread+0x149/0x25e
#1: (&conn->work_del){+.+...}, at: [<c0130424>] worker_thread+0x149/0x25e
stack backtrace:
Pid: 2518, comm: bluetooth Not tainted 2.6.30-rc6-02911-gbb803cf #16
Call Trace:
[<c03d64d9>] ? printk+0xf/0x11
[<c0140d96>] __lock_acquire+0x7ce/0xb1b
[<c0141173>] lock_acquire+0x90/0xad
[<c0130c14>] ? flush_work+0x28/0xb0
[<c0130c2e>] flush_work+0x42/0xb0
[<c0130c14>] ? flush_work+0x28/0xb0
[<f8b84966>] del_conn+0x1c/0x84 [bluetooth]
[<c0130469>] worker_thread+0x18e/0x25e
[<c0130424>] ? worker_thread+0x149/0x25e
[<f8b8494a>] ? del_conn+0x0/0x84 [bluetooth]
[<c0133843>] ? autoremove_wake_function+0x0/0x33
[<c01302db>] ? worker_thread+0x0/0x25e
[<c013355a>] kthread+0x45/0x6b
[<c0133515>] ? kthread+0x0/0x6b
[<c01034a7>] kernel_thread_helper+0x7/0x10
Based on a report by Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Tested-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The implementation we just revived has issues, such as using a
Kconfig-defined virtual address area in kernel space that nothing
actually carves out (and thus will overlap whatever is there),
or having some dependencies on being self contained in a single
PTE page which adds unnecessary constraints on the kernel virtual
address space.
This fixes it by using more classic PTE accessors and automatically
locating the area for consistent memory, carving an appropriate hole
in the kernel virtual address space, leaving only the size of that
area as a Kconfig option. It also brings some dma-mask related fixes
from the ARM implementation which was almost identical initially but
grew its own fixes.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Make FIXADDR_TOP a compile time constant and cleanup a
couple of definitions relative to the layout of the kernel
address space on ppc32. We also print out that layout at
boot time for debugging purposes.
This is a pre-requisite for properly fixing non-coherent
DMA allocations.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fix some more fallout of the string changes:
CC arch/blackfin/lib/strncmp.o
In file included from include/linux/bitmap.h:9,
from include/linux/nodemask.h:90,
from include/linux/mmzone.h:17,
from include/linux/gfp.h:5,
from include/linux/kmod.h:23,
from include/linux/module.h:14,
from arch/blackfin/lib/strncmp.c:14:
include/linux/string.h: In function ‘strstarts’:
include/linux/string.h:132: error: implicit declaration of function ‘strncmp’
make[1]: *** [arch/blackfin/lib/strncmp.o] Error 1
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
CC: Rusty Russell <rusty@rustcorp.com.au>
All of the Blackfin lists are transparently moderated for non-subscribers.
i.e. there are no annoying notices and people get whitelisted after
their first posting.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
The previous commit "convert to net_device_ops" broke the Blackfin MAC
driver as it declared the new structure before the function it used:
CC drivers/net/bfin_mac.o
drivers/net/bfin_mac.c:984: error: ‘bfin_mac_close’ undeclared here (not in a function)
make[1]: *** [drivers/net/bfin_mac.o] Error 1
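The class of error above can be reproduced and avoided in a few lines of
plain C. The sketch below uses hypothetical names (mock_dev, mock_dev_ops,
mock_open, mock_close) and shows one way out, forward declarations;
moving the table below the functions works equally well, and which of
the two the actual fix uses is not stated here.

```c
#include <assert.h>
#include <stddef.h>

struct mock_dev {
    int open_count;
};

/* Hypothetical ops table in the spirit of net_device_ops. */
struct mock_dev_ops {
    int (*open)(struct mock_dev *dev);
    int (*close)(struct mock_dev *dev);
};

/* Forward declarations make the names visible to the initializer below. */
static int mock_open(struct mock_dev *dev);
static int mock_close(struct mock_dev *dev);

static const struct mock_dev_ops mock_ops = {
    .open  = mock_open,
    .close = mock_close,
};

static int mock_open(struct mock_dev *dev)
{
    dev->open_count++;
    return 0;
}

static int mock_close(struct mock_dev *dev)
{
    dev->open_count--;
    return 0;
}
```

Without the two forward declarations, the initializer would refer to
identifiers the compiler has not seen yet, which is what the build log
reports.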
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Both atl1.c and atl2.c include atlx.h, which defines some modinfo
stuff. But atl2.c seems like it doesn't want the modinfo data
from atlx.h, as it defines its own.
Running modinfo on atl2.ko, we get conflicting information:
$ /sbin/modinfo drivers/net/atlx/atl2.ko | egrep "version|description|author"
version: 2.2.3
description: Atheros Fast Ethernet Network Driver
author: Atheros Corporation <xiong.huang@atheros.com>, Chris Snook <csnook@redhat.com>
version: 2.1.3
author: Xiong Huang <xiong.huang@atheros.com>, Chris Snook <csnook@redhat.com>, Jay Cliburn <jcliburn@gmail.com>
Move the modinfo data out of atlx.h and into atl1.c to eliminate
the confusion:
$ /sbin/modinfo drivers/net/atlx/atl1.ko | egrep "version|description|author"
version: 2.1.3
author: Xiong Huang <xiong.huang@atheros.com>, Chris Snook <csnook@redhat.com>, Jay Cliburn <jcliburn@gmail.com>
description: Atheros L1 Gigabit Ethernet Driver
$ /sbin/modinfo drivers/net/atlx/atl2.ko | egrep "version|description|author"
version: 2.2.3
description: Atheros Fast Ethernet Network Driver
author: Atheros Corporation <xiong.huang@atheros.com>, Chris Snook <csnook@redhat.com>
Reported-by: Scott Scriven <scott.scriven@hp.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Acked-by: Jay Cliburn <jcliburn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gianfar interrupt handler uses IEVENT_ERR_MASK to check and handle errors.
Babbling RX error (IEVENT_BABR) should be included in IEVENT_ERR_MASK.
Otherwise if BABR is raised, it never gets handled nor cleared, and an
interrupt storm results. This has been observed to happen on sending a
burst of ethernet frames to a gianfar based board.
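Why a missing bit in the mask leads to a storm can be shown with a small
model. The bit positions and the EV_* names below are made up for this
sketch (the real values live in the driver header); handle_errors() is a
hypothetical helper that clears only the events covered by the mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions; the real values live in the driver header. */
#define EV_BABR  (1u << 0)   /* babbling receive error */
#define EV_BABT  (1u << 1)   /* babbling transmit error */
#define EV_RXF   (1u << 2)   /* normal receive event, not an error */

#define EV_ERR_MASK_OLD  (EV_BABT)            /* BABR left out */
#define EV_ERR_MASK_NEW  (EV_BABT | EV_BABR)  /* BABR included */

/* Clear every pending error covered by mask; return what is still pending. */
static uint32_t handle_errors(uint32_t pending, uint32_t mask)
{
    uint32_t errors = pending & mask;

    return pending & ~errors;
}
```

With the old mask a pending BABR survives every pass through the
handler, so the interrupt keeps firing; with the new mask it is cleared.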
Signed-off-by: Xiaotian Feng <xiaotian.feng@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid reading the unsynchronized value cs->classid multiple times,
since it could change concurrently from non-zero to zero; this would
result in the classifier returning a positive result with a bogus
(zero) classid.
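The pattern is to take one snapshot of the field and use only that
snapshot afterwards. A minimal sketch, with a hypothetical
struct mock_cls_state and mock_classify() standing in for the real
classifier code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state; classid may be changed by another context. */
struct mock_cls_state {
    volatile uint32_t classid;
};

/* Returns 0 and fills *res on a match, -1 otherwise. */
static int mock_classify(const struct mock_cls_state *cs, uint32_t *res)
{
    uint32_t classid = cs->classid;   /* the only read of the field */

    if (classid == 0)
        return -1;

    *res = classid;   /* uses the snapshot, never re-reads cs->classid */
    return 0;
}
```

Because the test and the assignment use the same local value, a
concurrent change to zero cannot produce a positive result carrying a
zero classid.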
Signed-off-by: Paul Menage <menage@google.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When AMD C1E is enabled, the local APIC timer will stop even in C1. To avoid a
suspend/resume hang, this patch removes C1 and replaces it with a cpu_relax() in
the suspend/resume path. This has no impact on the runtime path.
http://bugzilla.kernel.org/show_bug.cgi?id=13233
[ impact: avoid suspend/resume hang in AMD CPU with C1E enabled ]
Tested-by: Dmitry Lyzhyn <thisistempbox@yahoo.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
When AMD C1E is enabled, local APIC timer will stop even in C1.
This patch uses broadcast IPI to replace local APIC timer in C1.
http://bugzilla.kernel.org/show_bug.cgi?id=13233
[ impact: avoid boot hang in AMD CPU with C1E enabled ]
Tested-by: Dmitry Lyzhyn <thisistempbox@yahoo.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This reverts commit 33f00dcedb.
While it was a good idea to try to use the mm/vmalloc.c allocator instead
of our own (in fact, ours is itself a dup of an old variant of the vmalloc
one), unfortunately, the approach is terminally busted since
dma_alloc_coherent() can be called at interrupt time or in atomic contexts
and there's little chance we'll make the code in mm/vmalloc.c cope with that :-(
Until we can get the generic code to forbid that idiocy and fix all
drivers abusing it, we pretty much have no choice but revert to
our custom virtual space allocator.
There's also a problem with SMP safety since freeing such mapping
would require an IPI which cannot be done at interrupt time.
However, right now, I don't think we support any platform that is
both SMP and has non-coherent DMA (don't laugh, I know such things
do exist !) so we can sort that out later.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On the 865, but not the 855, the clflush we do appears to not actually make
it out to the hardware all the time. An easy way to safely reproduce was
X -retro, which would show that some of the blits involved in drawing the
lovely root weave didn't make it out to the hardware. Those blits are 32
bytes each, and 1-2 would be missing at various points around the screen.
Other experimentation (doing more clflush, doing more AGP chipset flush,
poking at some more device registers to maybe trigger more flushing) didn't
help. krh came up with the wbinvd as a way to successfully get all those
blits to appear.
Signed-off-by: Eric Anholt <eric@anholt.net>
The pitch field is an exponent on pre-965, so we were rejecting buffers
on 8xx that we shouldn't have. 915 got lucky in that the largest legal
value happened to match (8KB / 512 = 0x10), but 8xx has a smaller tile width.
Additionally, we programmed that bad value into the register on 8xx, so the
only pitch that would work correctly was 4096 (512-1023 pixels), while others
would probably give bad rendering or hangs.
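The arithmetic behind "exponent" can be sketched as below. This is only
an illustration under the assumption that the field holds
log2(stride / tile_width); pitch_exponent() is a hypothetical helper,
the tile width is passed in rather than hard-coded, and the exact
register encoding is not taken from the driver.

```c
#include <assert.h>

/*
 * Hypothetical helper: return log2(stride / tile_width), or -1 if the
 * stride is not a power-of-two multiple of the tile width.
 */
static int pitch_exponent(unsigned int stride, unsigned int tile_width)
{
    unsigned int tiles, exp = 0;

    if (tile_width == 0 || stride == 0 || stride % tile_width != 0)
        return -1;

    tiles = stride / tile_width;
    if (tiles & (tiles - 1))
        return -1;   /* not a power of two */

    while (tiles > 1) {
        tiles >>= 1;
        exp++;
    }
    return (int)exp;
}
```

For the 915 case quoted above, 8KB / 512 gives 0x10 tiles, which this
helper maps to an exponent of 4.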
Signed-off-by: Eric Anholt <eric@anholt.net>
fd.o bug #20473.
cap_bprm_set_creds() has to be called from security_bprm_set_creds().
TOMOYO forgot to call cap_bprm_set_creds() from tomoyo_bprm_set_creds()
and suid executables were not working.
Make sure we call cap_bprm_set_creds() with TOMOYO, to set credentials
properly inside tomoyo_bprm_set_creds().
Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <jmorris@namei.org>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: avoid back to back on_each_cpu in cpa_flush_array
x86, relocs: ignore R_386_NONE in kernel relocation entries
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel:
drm/i915: Add support for VGA load detection (pre-945).
drm/i915: Use an I2C algo to do the flip to SDVO DDC bus.
drm/i915: Determine type before initialising connector
drm/i915: Return SDVO LVDS VBT mode if no EDID modes are detected.
drm/i915: Fetch SDVO LVDS mode lines from VBT, then reserve them
i915: support 8xx desktop cursors
drm/i915: allocate large pointer arrays with vmalloc
Cleanup cpa_flush_array() to avoid back to back on_each_cpu() calls.
[ Impact: optimizes fix 0af48f42df ]
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFSv4: Fix the case where NFSv4 renewal fails
nfs: fix build error in nfsroot with initconst
XPRTRDMA: fix client rpcrdma FRMR registration on mlx4 devices
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda - Add missing check of pin vref 50 and others in Realtek codecs
ALSA: hda - Add 5stack-no-fp model for STAC927x
ALSA: hda - Add forced codec-slots for ASUS W5Fm
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] powernow-k8: determine exact CPU frequency for HW Pstates
[CPUFREQ] powernow-k8 cleanup msg if BIOS does not export ACPI _PSS cpufreq data
[CPUFREQ] fix timer teardown in ondemand governor
[CPUFREQ] fix timer teardown in conservative governor
[CPUFREQ] remove rwsem lock from CPUFREQ_GOV_STOP call
[CPUFREQ] powernow-k7 build fix when ACPI=n
[CPUFREQ] add atom family to p4-clockmod
When KVM is loaded, and hence VT set up, the vmcall instruction in an
lguest guest causes a #GP, not #UD.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
r8169: avoid losing MSI interrupts
tcp: tcp_vegas ssthresh bugfix
mac8390: fix regression caused during net_device_ops conversion
gianfar: fix BUG under load after introduction of skb recycling
wimax/i2400m: usb: fix device reset on autosuspend while not yet idle
RxRPC: Error handling for rxrpc_alloc_connection()
ipv4: Fix oops with FIB_TRIE
pktgen: do not access flows[] beyond its length
gigaset: beyond ARRAY_SIZE of iwb->data
IPv6: set RTPROT_KERNEL to initial route
net: fix rtable leak in net/ipv4/route.c
net: fix length computation in rt_check_expire()
wireless: beyond ARRAY_SIZE of intf->crypto_stats
iwlwifi: update 5000 ucode support to version 2 of API
cfg80211: fix race between core hint and driver's custom apply
airo: fix airo_get_encode{,ext} buffer overflow like I mean it...
ath5k: fix interpolation with equal power levels
iwlwifi: do not cancel delayed work inside spin_lock_irqsave
ath5k: fix exp off-by-one when computing OFDM delta slope
wext: verify buffer size for SIOCSIWENCODEEXT
...
If the asynchronous lease renewal fails (usually due to a soft timeout),
then we _must_ schedule state recovery in order to ensure that we don't
lose the lease unnecessarily or, if the lease is already lost, that we
recover the locking state promptly...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
fix build error with latest kbuild adjustments to initconst.
The commit a447c09324 ("vfs: Use
const for kernel parser table") changed:
static match_table_t __initdata tokens = {
to
static match_table_t __initconst tokens = {
But the missing const causes powerpc to fail with latest
updates to __initconst like this:
fs/nfs/nfsroot.c:400: error: __setup_str_nfs_root_setup causes a section type conflict
fs/nfs/nfsroot.c:400: error: __setup_str_nfs_root_setup causes a section type conflict
The bug is only present with kbuild-next.
Following patch has been build tested.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
mlx4/connectX FRMR requires local write enable together with remote
rdma write enable. This fixes NFS/RDMA operation over the ConnectX
Infiniband HCA in the default memreg mode.
Signed-off-by: Vu Pham <vu@mellanox.com>
Signed-off-by: Tom Talpey <tmtalpey@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Two approaches for VGA detection: hotplug detection for 945G onwards
and load pipe detection for pre-945G. Load pipe detection takes one free
pipe, sets the border color to red and blue, then checks CRT status via
the swf register. This is a sync-up with the 2D driver.
Signed-off-by: Ma Ling <ling.ma@intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Slightly modified by trenn@suse.de -> only do this on fam 10h and fam 11h.
Currently powernow-k8 determines CPU frequency from ACPI PSS objects, but
according to AMD family 11h BKDG this frequency is just a rounded value:
"CoreFreq (MHz) = The CPU COF specified by MSRC001_00[6B:64][CpuFid]
rounded to the nearest 100 Mhz."
As a consequence powernow-k8 reports wrong CPU frequency on some systems,
e.g. on Turion X2 Ultra:
powernow-k8: Found 1 AMD Turion(tm)X2 Ultra DualCore Mobile ZM-82
processors (2 cpu cores) (version 2.20.00)
powernow-k8: 0 : pstate 0 (2200 MHz)
powernow-k8: 1 : pstate 1 (1100 MHz)
powernow-k8: 2 : pstate 2 (600 MHz)
But this is wrong as frequency for Pstate2 is 550 MHz. x86info reports it
correctly:
#x86info -a |grep Pstate
...
Pstate-0: fid=e, did=0, vid=24 (2200MHz)
Pstate-1: fid=e, did=1, vid=30 (1100MHz)
Pstate-2: fid=e, did=2, vid=3c (550MHz) (current)
Solution is to determine the frequency directly from Pstate MSRs instead
of using rounded values from ACPI table.
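The numbers above can be reproduced with a few lines of C. This is a
sketch, not the driver code: it assumes the family 11h relation
frequency = 100 MHz * (fid + 8) / 2^did, which matches the x86info
output quoted above, and both helper names are made up for the example.
Family 10h is not covered here.

```c
#include <assert.h>

/* Assumed family 11h relation: 100 MHz * (fid + 8) / 2^did. */
static unsigned int pstate_mhz(unsigned int fid, unsigned int did)
{
    assert(did < 8);   /* keep the shift well defined in this sketch */
    return (100u * (fid + 8u)) >> did;
}

/* What a table rounded to the nearest 100 MHz would report instead. */
static unsigned int round_to_100mhz(unsigned int mhz)
{
    return (mhz + 50u) / 100u * 100u;
}
```

For fid=0xe this gives 2200, 1100 and 550 MHz for did=0, 1 and 2, and
rounding 550 yields the 600 MHz seen in the powernow-k8 log.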
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
- Make the message shorter and easier to grep for
- Use printk_once instead of WARN_ONCE (functionality of these was mixed)
Signed-off-by: Thomas Renninger <trenn@suse.de>
Cc: Langsdorf, Mark <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
* Rafael J. Wysocki (rjw@sisk.pl) wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.28 and 2.6.29.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.28 and 2.6.29. Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13186
> Subject : cpufreq timer teardown problem
> Submitter : Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> Date : 2009-04-23 14:00 (24 days old)
> References : http://marc.info/?l=linux-kernel&m=124049523515036&w=4
> Handled-By : Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> Patch : http://patchwork.kernel.org/patch/19754/
> http://patchwork.kernel.org/patch/19753/
>
(updated changelog)
cpufreq fix timer teardown in ondemand governor
The problem is that dbs_timer_exit() uses cancel_delayed_work() when it should
use cancel_delayed_work_sync(). cancel_delayed_work() does not wait for the
workqueue handler to exit.
The ondemand governor does not seem to be affected because the
"if (!dbs_info->enable)" check at the beginning of the workqueue handler returns
immediately without rescheduling the work. The conservative governor in
2.6.30-rc has the same check as the ondemand governor, which makes things
usually run smoothly. However, if the governor is quickly stopped and then
started, this could lead to the following race :
dbs_enable could be reenabled and multiple do_dbs_timer handlers would run.
This is why a synchronized teardown is required.
The following patch applies to, at least, 2.6.28.x, 2.6.29.1, 2.6.30-rc2.
Depends on patch
cpufreq: remove rwsem lock from CPUFREQ_GOV_STOP call
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: gregkh@suse.de
CC: stable@kernel.org
CC: cpufreq@vger.kernel.org
CC: Ingo Molnar <mingo@elte.hu>
CC: rjw@sisk.pl
CC: Ben Slusky <sluskyb@paranoiacs.org>
Signed-off-by: Dave Jones <davej@redhat.com>
* Rafael J. Wysocki (rjw@sisk.pl) wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.28 and 2.6.29.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.28 and 2.6.29. Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13186
> Subject : cpufreq timer teardown problem
> Submitter : Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> Date : 2009-04-23 14:00 (24 days old)
> References : http://marc.info/?l=linux-kernel&m=124049523515036&w=4
> Handled-By : Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> Patch : http://patchwork.kernel.org/patch/19754/
> http://patchwork.kernel.org/patch/19753/
>
(re-send with updated changelog)
cpufreq fix timer teardown in conservative governor
The problem is that dbs_timer_exit() uses cancel_delayed_work() when it should
use cancel_delayed_work_sync(). cancel_delayed_work() does not wait for the
workqueue handler to exit.
The ondemand governor does not seem to be affected because the
"if (!dbs_info->enable)" check at the beginning of the workqueue handler returns
immediately without rescheduling the work. The conservative governor in
2.6.30-rc has the same check as the ondemand governor, which makes things
usually run smoothly. However, if the governor is quickly stopped and then
started, this could lead to the following race :
dbs_enable could be reenabled and multiple do_dbs_timer handlers would run.
This is why a synchronized teardown is required.
Depends on patch
cpufreq: remove rwsem lock from CPUFREQ_GOV_STOP call
The following patch applies to 2.6.30-rc2. Stable kernels have a similar
issue which should also be fixed, but the code changed between 2.6.29
and 2.6.30, so this patch only applies to 2.6.30-rc.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: gregkh@suse.de
CC: stable@kernel.org
CC: cpufreq@vger.kernel.org
CC: Ingo Molnar <mingo@elte.hu>
CC: rjw@sisk.pl
CC: Ben Slusky <sluskyb@paranoiacs.org>
Signed-off-by: Dave Jones <davej@redhat.com>
* Rafael J. Wysocki (rjw@sisk.pl) wrote:
> This message has been generated automatically as a part of a report
> of regressions introduced between 2.6.28 and 2.6.29.
>
> The following bug entry is on the current list of known regressions
> introduced between 2.6.28 and 2.6.29. Please verify if it still should
> be listed and let me know (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13186
> Subject : cpufreq timer teardown problem
> Submitter : Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> Date : 2009-04-23 14:00 (24 days old)
> References : http://marc.info/?l=linux-kernel&m=124049523515036&w=4
> Handled-By : Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> Patch : http://patchwork.kernel.org/patch/19754/
> http://patchwork.kernel.org/patch/19753/
The patches linked above depend on the following patch to remove
circular locking dependency :
cpufreq: remove rwsem lock from CPUFREQ_GOV_STOP call
(The following issue was faced when using cancel_delayed_work_sync() in the
timer teardown, which fixes a race.)
* KOSAKI Motohiro (kosaki.motohiro@jp.fujitsu.com) wrote:
> Hi
>
> my box output following warnings.
> it seems regression by commit 7ccc7608b836e58fbacf65ee4f8eefa288e86fac.
>
> A: work -> do_dbs_timer() -> cpu_policy_rwsem
> B: store() -> cpu_policy_rwsem -> cpufreq_governor_dbs() -> work
>
>
Hrm, I think it must be due to my attempt to fix the timer teardown race
in ondemand governor mixed with new locking behavior in 2.6.30-rc.
The rwlock seems to be taken around the whole call to
cpufreq_governor_dbs(), when it should be only taken around accesses to
the locked data, and especially *not* around the call to
dbs_timer_exit().
Reverting my fix attempt would put the teardown race back in place
(replacing the cancel_delayed_work_sync by cancel_delayed_work).
Instead, a proper fix would imply modifying this critical section :
cpufreq.c: __cpufreq_remove_dev()
...
if (cpufreq_driver->target)
__cpufreq_governor(data, CPUFREQ_GOV_STOP);
unlock_policy_rwsem_write(cpu);
To make sure the __cpufreq_governor() callback is not called with rwsem
held. This would allow execution of cancel_delayed_work_sync() without
being nested within the rwsem.
Applies on top of the 2.6.30-rc5 tree.
Required to remove circular dep in teardown of both conservative and
ondemand governors so they can use cancel_delayed_work_sync().
CPUFREQ_GOV_STOP does not modify the policy, therefore this locking seemed
unneeded.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Greg KH <greg@kroah.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: "Rafael J. Wysocki" <rjw@sisk.pl>
CC: Ben Slusky <sluskyb@paranoiacs.org>
CC: Chris Wright <chrisw@sous-sol.org>
CC: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Jones <davej@redhat.com>