Add support for extra ioctl calls by adding a
ioctl watchdog operation. This operation will be
called before we do our own handling of ioctl
commands. This way we can override the internal
ioctl command handling and we can also add
extra ioctl commands. The ioctl watchdog operation
should return the appropriate error codes or
-ENOIOCTLCMD if the ioctl command should be handled
through the internal ioctl handling of the framework.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Add support for the nowayout feature to the
WatchDog Timer Driver Core framework.
This feature prevents the watchdog timer from being
stopped.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Add support for the Magic Close feature to the
WatchDog Timer Driver Core framework.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
This part add's the WDIOC_SETTIMEOUT and WDIOC_GETTIMEOUT ioctl
functionality to the WatchDog Timer Driver Core framework.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
This part add's the WDIOC_SETOPTIONS ioctl functionality
to the WatchDog Timer Driver Core framework.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
This part add's the WDIOC_KEEPALIVE ioctl functionality to the
WatchDog Timer Driver Core framework. Please note that the
WDIOF_KEEPALIVEPING bit has to be set in the watchdog_info
options field.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
This part add's the basic ioctl functionality to the
WatchDog Timer Driver Core framework. The supported
ioctl call's are:
WDIOC_GETSUPPORT
WDIOC_GETSTATUS
WDIOC_GETBOOTSTATUS
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
The WatchDog Timer Driver Core is a framework
that contains the common code for all watchdog-driver's.
It also introduces a watchdog device structure and the
operations that go with it.
This is the introduction of this framework. This part
supports the minimal watchdog userspace API (or with
other words: the functionality to use /dev/watchdog's
open, release and write functionality as defined in
the simplest watchdog API). Extra functionality will
follow in the next set of patches.
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Adds device tree probe support for imx2_wdt driver.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
This patch adds the of_match_table to enable s3c2410-wdt driver
to be probed when watchdog device node is found in the device tree.
Signed-off-by: Thomas Abraham <thomas.abraham@linaro.org>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86. reboot: Make Dell Latitude E6320 use reboot=pci
x86, doc only: Correct real-mode kernel header offset for init_size
x86: Disable AMD_NUMA for 32bit for now
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
slip: fix wrong SLIP6 ifdef-endif placing
natsemi: fix another dma-debug report
sctp: ABORT if receive, reassmbly, or reodering queue is not empty while closing socket
net: Fix default in docs for tcp_orphan_retries.
hso: fix a use after free condition
net/natsemi: Fix module parameter permissions
XFRM: Fix memory leak in xfrm_state_update
sctp: Enforce retransmission limit during shutdown
mac80211: fix TKIP replay vulnerability
mac80211: fix ie memory allocation for scheduled scans
ssb: fix init regression of hostmode PCI core
rtlwifi: rtl8192cu: Add new USB ID for Netgear WNA1000M
ath9k: Fix tx throughput drops for AR9003 chips with AES encryption
carl9170: add NEC WL300NU-AG usbid
cfg80211: fix deadlock with rfkill/sched_scan by adding new mutex
ath5k: fix incorrect use of drvdata in PCI suspend/resume code
ath5k: fix incorrect use of drvdata in sysfs code
Bluetooth: Fix memory leak under page timeouts
Bluetooth: Fix regression with incoming L2CAP connections
Bluetooth: Fix hidp disconnect deadlocks and lost wakeup
...
Resize feature was supported by the commit 4e33f9eab0 but it was not
reflected to the list of unsupported features in nilfs2.txt file.
This updates the list to fix discrepancy.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
The real-mode kernel header init_size field is located at 0x260 per the field
listing in th e"REAL-MODE KERNEL HEADER" section. It is listed as 0x25c in
the "DETAILS OF HEADER FIELDS" section, which overlaps with pref_address.
Correct the details listing to 0x260.
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Link: http://lkml.kernel.org/r/541cf88e2dfe5b8186d8b96b136d892e769a68c1.1310441260.git.dvhart@linux.intel.com
CC: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
That file harkens back to the days of the big 2.4 -> 2.6 version jump,
and was based even then on older versions. Some of it is just obsolete,
and Jesper Juhl points out that it talks about kernel versions 2.6 and
should be updated to 3.0.
Remove some obsolete text, and re-phrase some other to not be 2.6-specific.
Reported-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
[media] msp3400: fill in v4l2_tuner based on vt->type field
[media] tuner-core.c: don't change type field in g_tuner or g_frequency
[media] cx18/ivtv: fix g_tuner support
[media] tuner-core: power up tuner when called with s_power(1)
[media] v4l2-ioctl.c: check for valid tuner type in S_HW_FREQ_SEEK
[media] tuner-core: simplify the standard fixup
[media] tuner-core/v4l2-subdev: document that the type field has to be filled in
[media] v4l2-subdev.h: remove unused s_mode tuner op
[media] feature-removal-schedule: change in how radio device nodes are handled
[media] bttv: fix s_tuner for radio
[media] pvrusb2: fix g/s_tuner support
[media] v4l2-ioctl.c: prefill tuner type for g_frequency and g/s_tuner
[media] tuner-core: fix tuner_resume: use t->mode instead of t->type
[media] tuner-core: fix s_std and s_tuner
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86:
hp-wmi: fix use after free
dell-laptop - using buffer without mutex_lock
Revert: "dell-laptop: Toggle the unsupported hardware killswitch"
platform-drivers-x86: set backlight type to BACKLIGHT_PLATFORM
thinkpad-acpi: handle HKEY 0x4010, 0x4011 events
drivers/platform/x86: Fix memory leak
thinkpad-acpi: handle some new HKEY 0x60xx events
acer-wmi: fix bitwise bug when set device state
acer-wmi: Only update rfkill status for associated hotkey events
Since we removed sti()/cli() and related, how about removing it from
Documentation/spinlocks.txt?
Signed-off-by: Muthukumar R <muthur@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Multiple attempts to dynamically reallocate pci resources have
unfortunately lead to regressions. Though we continue to fix the
regressions and fine tune the dynamic-reallocation behavior, we have not
reached a acceptable state yet.
This patch provides a interim solution. It disables dynamic reallocation
by default, but adds the ability to enable it through pci=realloc kernel
command line parameter.
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add an FS-Cache helper to bulk uncache pages on an inode. This will
only work for the circumstance where the pages in the cache correspond
1:1 with the pages attached to an inode's page cache.
This is required for CIFS and NFS: When disabling inode cookie, we were
returning the cookie and setting cifsi->fscache to NULL but failed to
invalidate any previously mapped pages. This resulted in "Bad page
state" errors and manifested in other kind of errors when running
fsstress. Fix it by uncaching mapped pages when we disable the inode
cookie.
This patch should fix the following oops and "Bad page state" errors
seen during fsstress testing.
------------[ cut here ]------------
kernel BUG at fs/cachefiles/namei.c:201!
invalid opcode: 0000 [#1] SMP
Pid: 5, comm: kworker/u:0 Not tainted 2.6.38.7-30.fc15.x86_64 #1 Bochs Bochs
RIP: 0010: cachefiles_walk_to_object+0x436/0x745 [cachefiles]
RSP: 0018:ffff88002ce6dd00 EFLAGS: 00010282
RAX: ffff88002ef165f0 RBX: ffff88001811f500 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000100 RDI: 0000000000000282
RBP: ffff88002ce6dda0 R08: 0000000000000100 R09: ffffffff81b3a300
R10: 0000ffff00066c0a R11: 0000000000000003 R12: ffff88002ae54840
R13: ffff88002ae54840 R14: ffff880029c29c00 R15: ffff88001811f4b0
FS: 00007f394dd32720(0000) GS:ffff88002ef00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fffcb62ddf8 CR3: 000000001825f000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/u:0 (pid: 5, threadinfo ffff88002ce6c000, task ffff88002ce55cc0)
Stack:
0000000000000246 ffff88002ce55cc0 ffff88002ce6dd58 ffff88001815dc00
ffff8800185246c0 ffff88001811f618 ffff880029c29d18 ffff88001811f380
ffff88002ce6dd50 ffffffff814757e4 ffff88002ce6dda0 ffffffff8106ac56
Call Trace:
cachefiles_lookup_object+0x78/0xd4 [cachefiles]
fscache_lookup_object+0x131/0x16d [fscache]
fscache_object_work_func+0x1bc/0x669 [fscache]
process_one_work+0x186/0x298
worker_thread+0xda/0x15d
kthread+0x84/0x8c
kernel_thread_helper+0x4/0x10
RIP cachefiles_walk_to_object+0x436/0x745 [cachefiles]
---[ end trace 1d481c9af1804caa ]---
I tested the uncaching by the following means:
(1) Create a big file on my NFS server (104857600 bytes).
(2) Read the file into the cache with md5sum on the NFS client. Look in
/proc/fs/fscache/stats:
Pages : mrk=25601 unc=0
(3) Open the file for read/write ("bash 5<>/warthog/bigfile"). Look in proc
again:
Pages : mrk=25601 unc=25601
Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Radio devices have weird side-effects when used with combined TV/radio
tuners and the V4L2 spec is ambiguous on how it should work. This results
in inconsistent driver behavior which makes life hard for everyone.
Be more strict in when and how the switch between radio and tv mode
takes place and make sure all drivers behave the same.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Handle events 0x4010 and 0x4011 so that we do not pester users about them.
These events report when the thinkpad is docked/undocked to a native
hotplug dock (i.e. one that does not need ACPI handling, nor is represented
in the ACPI device tree). Such docks are based on USB 2.0/3.0, and also
work as port replicators.
We really want a proper dock class to report these, or at least new input
EV_SW events. Since it is not clear which one to use yet, keep reporting
them as vendor-specific ThinkPad events.
WARNING: As defined by the thinkpad-acpi sysfs ABI rules of engagement, the
vendor-specific events will be REMOVED as soon as generic events are made
available (duplicate events are a big problem), with an appropriate update
to the thinkpad-acpi sysfs/event ABI versioning. Userspace is already
prepared to provide easy backwards compatibility for such changes when
convenient to the distro (see acpi-fakekey).
* Event 0x4010: docking to hotplug dock/port replicator
* Event 0x4011: undocking from hotplug dock/port replicator
Typical usecase would be to trigger display reconfiguration.
Reports mention T410, T510, and series 3 docks/port replicators. Special
thanks to Robert de Rooy for his extensive report and analysis of the
situation.
http://www.thinkwiki.org/wiki/ThinkPad_Port_Replicator_Series_3http://www.thinkwiki.org/wiki/ThinkPad_Mini_Dock_Series_3http://www.thinkwiki.org/wiki/ThinkPad_Mini_Dock_Plus_Series_3http://www.thinkwiki.org/wiki/ThinkPad_Mini_Dock_Plus_Series_3_for_Mobile_Workstationshttp://lenovoblogs.com/insidethebox/?p=290
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Reported-by: Claudius Hubig <claudiushubig@chubig.net>
Reported-by: Doctor Bill <docbill@gmail.com>
Reported-by: Korte Noack <gbk.noack@gmx.de>
Reported-by: Robert de Rooy <robert.de.rooy@gmail.com>
Reported-by: Sebastian Will <swill@csail.mit.edu>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Handle some user interface events from the newer Lenovo models. We are likely
to do something smart with these events in the future, for now, hide the ones
we are already certain about from the user and userspace both.
* Events 0x6000 and 0x6005 are key-related. 0x6005 is not properly identified
yet. Ignore these events, and do not report them.
* Event 0x6040 has not been properly identified yet, and we don't know if it
is important (looks like it isn't, but still...). Keep reporting it.
* Change the message the driver outputs on unknown 0x6xxx events, as all
recent events are not related to thermal alarms. Degrade log level from
ALERT to WARNING.
Thanks to all users who reported these events or asked about them in a number
of mailing lists. Your help is highly appreciated, even if I did took a lot of
time to act on them. For that I apologise.
I will list those that identified the reasons for the events as "reported-by",
and I apologise in advance if I leave anyone out: it was not done on purpose, I
made the mistake of not properly tagging all event report emails separately,
and might have missed some.
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Reported-by: Markus Malkusch <markus@malkusch.de>
Reported-by: Peter Giles <g1l3sp@gmail.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
All the blkio.throttle.* file names are incorrectly reported without
".throttle" in the documentation. Fix it.
Signed-off-by: Andrea Righi <andrea@betterlinux.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The list of available general purpose memory allocators in
Documentation/CodingStyle chapter 14 is incomplete. This patch adds
the missing vzalloc() to the list.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
hwmon: (k10temp) Update documentation for Fam12h
hwmon-vid: Fix typo in VIA CPU name
hwmon: (f71882fg) Add support for the F71869A
hwmon: Use <> rather than () around my e-mail address
hwmon: (emc6w201) Properly handle all errors
Add some CPU series IDs and links to the Fam12h datasheets.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
The F71869A is almost the same as the F71869F/E, except that it has
the normal number of temp and pwm zones for a F71882FG derived chip,
rather then the limited number of the F71869F/E.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Max Baldwin <archerseven@gmail.com>
Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Commit e1866b33b1 (PM / Runtime: Rework
runtime PM handling during driver removal) forgot to update the
documentation in Documentation/power/runtime_pm.txt to match the new
code in drivers/base/dd.c. Update that documentation to match the
code it describes.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Replace reference to pm_runtime_idle_sync() in the driver core with
pm_runtime_put_sync() which is used in the code.
Signed-off-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
* 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
MAINTAINERS: add myself as maintainer of USB/IP
usb: r8a66597-hcd: fix cannot detect low/full speed device
USB: ehci-ath79: fix a NULL pointer dereference
USB: Add new FT232H chip to drivers/usb/serial/ftdi_sio.c
usb/isp1760: Fix bug preventing the unlinking of control urbs
USB: Fix up URB error codes to reflect implementation.
xhci: Always set urb->status to zero for isoc endpoints.
xhci: Add reset on resume quirk for asrock p67 host
xHCI 1.0: Incompatible Device Error
USB: don't let errors prevent system sleep
USB: don't let the hub driver prevent system sleep
USB: change maintainership of ohci-hcd and ehci-hcd
xHCI 1.0: Force Stopped Event(FSE)
xhci: Don't warn about zeroed bMaxBurst descriptor field.
USB: Free bandwidth when usb_disable_device is called.
xhci: Reject double add of active endpoints.
USB: TI 3410/5052 USB Serial Driver: Fix mem leak when firmware is too big.
usb: musb: gadget: clear TXPKTRDY flag when set FLUSHFIFO
usb: musb: host: compare status for negative error values
Commit 4d27e9dcff (PM: Make power
domain callbacks take precedence over subsystem ones) forgot to
update the device power management documentation to take changes
made by it into account. Correct that mistake.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
The part of Documentation/power/devices.txt regarding sysdevs is not
valid any more after commit 2e711c04db
(PM: Remove sysdev suspend, resume and shutdown operations), so
remove it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Commit e866500247 (PM: Allow
pm_runtime_suspend() to succeed during system suspend) removed usage
count increment across system PM.
Update doc to reflect this.
Signed-off-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rcu: Move RCU_BOOST #ifdefs to header file
rcu: use softirq instead of kthreads except when RCU_BOOST=y
rcu: Use softirq to address performance regression
rcu: Simplify curing of load woes
Documentation/usb/error-codes.txt mentions that urb->status can be set to
-EXDEV, if the isochronous transfer was not fully completed. However, in
practice, EHCI, UHCI, and OHCI all only set -EXDEV in the individual frame
status, never in the URB status. Those host controller actually always
pass in a zero status to usb_hcd_giveback_urb, and rely on the core to set
the appropriate status value.
The xHCI driver ran into issues with the uvcvideo driver when it tried to
set -EXDEV in urb->status, because the driver refused to submit URBs, and
the userspace camera application's video froze.
Clean up the documentation to reflect the actual implementation.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Fix format and spelling.
Signed-off-by: Jörg Sommer <joerg@alea.gnuu.de>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
According to commit 676db4af04 ("cgroupfs: create /sys/fs/cgroup to
mount cgroupfs on") the canonical mountpoint for the cgroup filesystem
is /sys/fs/cgroup. Hence, this should be used in the documentation.
Signed-off-by: Jörg Sommer <joerg@alea.gnuu.de>
Acked-by: Paul Menage <menage@google.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Instead of listing the architectures that are supported by
kmemleak in Documentation/kmemleak.txt, just refer people to
the list of supported architecutures in lib/Kconfig.debug so
that Documentation/kmemleak.txt does not need more updates
for this.
Signed-off-by: Maxin B. John <maxin.john@gmail.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch updates the incomplete documentation concerning the printk
extended format specifiers.
Signed-off-by: Andrew Murray <amurray@mpc-data.co.uk>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit a77aea9201 ("cgroup: remove the ns_cgroup") removed the
ns_cgroup but it forgot to remove the related doc in
feature-removal-schedule.txt.
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>
Cc: Serge E. Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit a26ac2455ffcf3(rcu: move TREE_RCU from softirq to kthread)
introduced performance regression. In an AIM7 test, this commit degraded
performance by about 40%.
The commit runs rcu callbacks in a kthread instead of softirq. We observed
high rate of context switch which is caused by this. Out test system has
64 CPUs and HZ is 1000, so we saw more than 64k context switch per second
which is caused by RCU's per-CPU kthread. A trace showed that most of
the time the RCU per-CPU kthread doesn't actually handle any callbacks,
but instead just does a very small amount of work handling grace periods.
This means that RCU's per-CPU kthreads are making the scheduler do quite
a bit of work in order to allow a very small amount of RCU-related
processing to be done.
Alex Shi's analysis determined that this slowdown is due to lock
contention within the scheduler. Unfortunately, as Peter Zijlstra points
out, the scheduler's real-time semantics require global action, which
means that this contention is inherent in real-time scheduling. (Yes,
perhaps someone will come up with a workaround -- otherwise, -rt is not
going to do well on large SMP systems -- but this patch will work around
this issue in the meantime. And "the meantime" might well be forever.)
This patch therefore re-introduces softirq processing to RCU, but only
for core RCU work. RCU callbacks are still executed in kthread context,
so that only a small amount of RCU work runs in softirq context in the
common case. This should minimize ksoftirqd execution, allowing us to
skip boosting of ksoftirqd for CONFIG_RCU_BOOST=y kernels.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Tested-by: "Alex,Shi" <alex.shi@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
* 'for-linus' of git://neil.brown.name/md:
md/raid5: remove unusual use of bio_iovec_idx()
md/raid5: fix FUA request handling in ops_run_io()
md/raid5: fix raid5_set_bi_hw_segments
md:Documentation/md.txt - fix typo
md/bitmap: remove unused fields from struct bitmap
md/bitmap: use proper accessor macro
md: check ->hot_remove_disk when removing disk
md: Using poll /proc/mdstat can monitor the events of adding a spare disks
MD: use is_power_of_2 macro
MD: raid5 do not set fullsync
MD: support initial bitmap creation in-kernel
MD: add sync_super to mddev_t struct
MD: raid1 changes to allow use by device mapper
MD: move thread wakeups into resume
MD: possible typo
MD: no sync IO while suspended
MD: no integrity register if no gendisk