Fix I/O stalls with some 4-bay RAID enclosures which are based on
OXUF936QSE:
- Onnto dataTale RSM4QO, old firmware (not anymore with current
firmware),
- inXtron Hydra Super-S LCM, old as well as current firmware
when used in RAID-5 mode, perhaps also in other RAID modes.
The stalls happen during heavy or moderate disk traffic in periods that
are a multiple of 5 minutes, roughly twice per hour. They are caused
by the target responding too late to an ORB_Pointer register write:
The target responds after Split_Timeout, hence firewire-core cancels
the transaction, and firewire-sbp2 fails the SCSI request. The SCSI
core retries the request, that fails again (and again), hence SCSI core
calls firewire-sbp2's abort handler (and even the Management_Agent
register write in the abort handler has the transaction timeout
problem).
During all that, the process which issued the I/O is stalled in I/O
wait state.
Meanwhile, the target actually acts on the first failed SCSI request:
It responds to the ORB_Pointer write later (seen in the kernel log as
"firewire_core: Unsolicited response") and also finishes the SCSI
request with proper status (seen in the kernel log as "firewire_sbp2:
status write for unknown orb").
So let's just ignore RCODE_CANCELLED in the transaction callback and
wait for the target to complete the ORB nevertheless. This requires
a small modification is sbp2_cancel_orbs(); it now needs to call
orb->callback() regardless whether fw_cancel_transaction() found the
transaction unfinished or finished.
A different solution is to increase Split_Timeout on the local node.
(Tested: 2000ms timeout; maybe 1000ms or something like that works too.
200ms is insufficient. Standard is 100ms.) However, I rather not do
this because any software on any node could change the Split_Timeout to
something unsuitable. Or such a large Split_Timeout may be undesirable
for other purposes.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
When an ORB was canceled (Command ORB i.e. SCSI request timed out, or
Management ORB timed out), or there was a send error in the initial
transaction, we missed to drop one of the ORB's references and thus
leaked memory.
Background:
In total, we hold 3 references to each Operation Request Block:
- 1 during sbp2_scsi_queuecommand() or sbp2_send_management_orb()
respectively,
- 1 for the duration of the write transaction to the ORB_Pointer or
Management_Agent register of the target,
- 1 for as long as the ORB stays within the lu->orb_list, until
the ORB is unlinked from the list and the orb->callback was
executed.
The latter one of these 3 references is finished
- normally by sbp2_status_write() when the target wrote status
for a pending ORB,
- or by sbp2_cancel_orbs() in case of an ORB time-out,
- or by complete_transaction() in case of a send error.
Of them, the latter two lacked the kref_put.
Add the missing kref_put()s. Add comments to the gets and puts of
references for transaction callbacks and ORB callbacks so that it is
easier to see what is supposed to happen.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Conflicts:
drivers/firewire/core-card.c
drivers/firewire/core-cdev.c
and forgotten #include <linux/time.h> in drivers/firewire/ohci.c
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
There is an at least theoretic race condition in which .start_iso etc.
could still be called between when the dummy driver is bound to the card
and when the children devices are being shut down. Add dummy_start_iso
and friends.
On the other hand, .enable, .set_config_rom, .read_csr, write_csr do not
need to be implemented by the dummy driver, as commented.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
nfs_commit_inode() needs to be defined irrespectively of whether or not
we are supporting NFSv3 and NFSv4.
Allow the compiler to optimise away code in the NFSv2-only case by
converting it into an inlined stub function.
Reported-and-tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* master.kernel.org:/home/rmk/linux-2.6-arm:
cyber2000fb: fix console in truecolor modes
cyber2000fb: fix machine hang on module load
SA1111: Eliminate use after free
ARM: Fix Versatile/Realview/VExpress MMC card detection sense
ARM: 6279/1: highmem: fix SMP preemption bug in kmap_high_l1_vipt
ARM: Add barriers to io{read,write}{8,16,32} accessors as well
ARM: 6273/1: Add barriers to the I/O accessors if ARM_DMA_MEM_BUFFERABLE
ARM: 6272/1: Convert L2x0 to use the IO relaxed operations
ARM: 6271/1: Introduce *_relaxed() I/O accessors
ARM: 6275/1: ux500: don't use writeb() in uncompress.h
ARM: 6270/1: clean files in arch/arm/boot/compressed/
ARM: Fix csum_partial_copy_from_user()
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: Ensure that writepage respects the nonblock flag
NFS: kswapd must not block in nfs_release_page
nfs: include space for the NUL in root path
Debian's ia64 autobuilders have been seeing kernel freeze or reboot
when running the gdb testsuite (Debian bug 588574): dannf bisected to
2.6.32 62eede62da "mm: ZERO_PAGE without
PTE_SPECIAL"; and reproduced it with gdb's gcore on a simple target.
I'd missed updating the gate_vma handling in __get_user_pages(): that
happens to use vm_normal_page() (nowadays failing on the zero page),
yet reported success even when it failed to get a page - boom when
access_process_vm() tried to copy that to its intermediate buffer.
Fix this, resisting cleanups: in particular, leave it for now reporting
success when not asked to get any pages - very probably safe to change,
but let's not risk it without testing exposure.
Why did ia64 crash with 16kB pages, but succeed with 64kB pages?
Because setup_gate() pads each 64kB of its gate area with zero pages.
Reported-by: Andreas Barth <aba@not.so.argh.org>
Bisected-by: dann frazier <dannf@debian.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Tested-by: dann frazier <dannf@dannf.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the __exit mark from cifs_exit_dns_resolver() as it's called by the
module init routine in case of error, and so may have been discarded during
linkage.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Return value was not set to 0 in setcolreg() with truecolor modes. This causes
fb_set_cmap() to abort after first color, resulting in blank palette - and
blank console in 24bpp and 32bpp modes.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
I was testing two CyberPro 2000 based PCI cards on x86 and the machine always
hanged completely when the cyber2000fb module was loaded. It seems that the
card hangs when some registers are accessed too quickly after writing RAMDAC
control register. With this patch, both card work.
Add delay after RAMDAC control register write to prevent hangs on module load.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
__sa1111_remove always frees its argument, so the subsequent reference to
sachip->saved_state represents a use after free. __sa1111_remove does not
appear to use the saved_state field, so the patch simply frees it first.
A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression E,E2;
@@
__sa1111_remove(E)
...
(
E = E2
|
* E
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The MMC card detection sense has become really confused with negations
at various levels, leading to some platforms not detecting inserted
cards. Fix this by converting everything to positive logic throughout,
thereby getting rid of these negations.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
smp_processor_id() must not be called from a preemptible context (this
is checked by CONFIG_DEBUG_PREEMPT). kmap_high_l1_vipt() was doing so.
This lead to a problem where the wrong per_cpu kmap_high_l1_vipt_depth
could be incremented, causing a BUG_ON(*depth <= 0); in
kunmap_high_l1_vipt().
The solution is to move the call to smp_processor_id() after the call
to preempt_disable().
Originally by: Andrew Howe <ahowe@nvidia.com>
Signed-off-by: Gary King <gking@nvidia.com>
Acked-by: Nicolas Pitre <nico.as.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
See https://bugzilla.kernel.org/show_bug.cgi?id=16056
If other processes are blocked waiting for kswapd to free up some memory so
that they can make progress, then we cannot allow kswapd to block on those
processes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
In root_nfs_name() it does the following:
if (strlen(buf) + strlen(cp) > NFS_MAXPATHLEN) {
printk(KERN_ERR "Root-NFS: Pathname for remote directory too long.\n");
return -1;
}
sprintf(nfs_export_path, buf, cp);
In the original code if (strlen(buf) + strlen(cp) == NFS_MAXPATHLEN)
then the sprintf() would lead to an overflow. Generally the rest of the
code assumes that the path can have NFS_MAXPATHLEN (1024) characters and
a NUL terminator so the fix is to add space to the nfs_export_path[]
buffer.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* 'fix/hda' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda - Add a PC-beep workaround for ASUS P5-V
ALSA: hda - Assume PC-beep as default for Realtek
ALSA: hda - Don't register beep input device when no beep is available
ALSA: hda - Fix pin-detection of Nvidia HDMI
Fix __task_cred()'s lockdep check by removing the following validation
condition:
lockdep_tasklist_lock_is_held()
as commit_creds() does not take the tasklist_lock, and nor do most of the
functions that call it, so this check is pointless and it can prevent
detection of the RCU lock not being held if the tasklist_lock is held.
Instead, add the following validation condition:
task->exit_state >= 0
to permit the access if the target task is dead and therefore unable to change
its own credentials.
Fix __task_cred()'s comment to:
(1) discard the bit that says that the caller must prevent the target task
from being deleted. That shouldn't need saying.
(2) Add a comment indicating the result of __task_cred() should not be passed
directly to get_cred(), but rather than get_task_cred() should be used
instead.
Also put a note into the documentation to enforce this point there too.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It's possible for get_task_cred() as it currently stands to 'corrupt' a set of
credentials by incrementing their usage count after their replacement by the
task being accessed.
What happens is that get_task_cred() can race with commit_creds():
TASK_1 TASK_2 RCU_CLEANER
-->get_task_cred(TASK_2)
rcu_read_lock()
__cred = __task_cred(TASK_2)
-->commit_creds()
old_cred = TASK_2->real_cred
TASK_2->real_cred = ...
put_cred(old_cred)
call_rcu(old_cred)
[__cred->usage == 0]
get_cred(__cred)
[__cred->usage == 1]
rcu_read_unlock()
-->put_cred_rcu()
[__cred->usage == 1]
panic()
However, since a tasks credentials are generally not changed very often, we can
reasonably make use of a loop involving reading the creds pointer and using
atomic_inc_not_zero() to attempt to increment it if it hasn't already hit zero.
If successful, we can safely return the credentials in the knowledge that, even
if the task we're accessing has released them, they haven't gone to the RCU
cleanup code.
We then change task_state() in procfs to use get_task_cred() rather than
calling get_cred() on the result of __task_cred(), as that suffers from the
same problem.
Without this change, a BUG_ON in __put_cred() or in put_cred_rcu() can be
tripped when it is noticed that the usage count is not zero as it ought to be,
for example:
kernel BUG at kernel/cred.c:168!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/kernel/mm/ksm/run
CPU 0
Pid: 2436, comm: master Not tainted 2.6.33.3-85.fc13.x86_64 #1 0HR330/OptiPlex
745
RIP: 0010:[<ffffffff81069881>] [<ffffffff81069881>] __put_cred+0xc/0x45
RSP: 0018:ffff88019e7e9eb8 EFLAGS: 00010202
RAX: 0000000000000001 RBX: ffff880161514480 RCX: 00000000ffffffff
RDX: 00000000ffffffff RSI: ffff880140c690c0 RDI: ffff880140c690c0
RBP: ffff88019e7e9eb8 R08: 00000000000000d0 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000040 R12: ffff880140c690c0
R13: ffff88019e77aea0 R14: 00007fff336b0a5c R15: 0000000000000001
FS: 00007f12f50d97c0(0000) GS:ffff880007400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8f461bc000 CR3: 00000001b26ce000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process master (pid: 2436, threadinfo ffff88019e7e8000, task ffff88019e77aea0)
Stack:
ffff88019e7e9ec8 ffffffff810698cd ffff88019e7e9ef8 ffffffff81069b45
<0> ffff880161514180 ffff880161514480 ffff880161514180 0000000000000000
<0> ffff88019e7e9f28 ffffffff8106aace 0000000000000001 0000000000000246
Call Trace:
[<ffffffff810698cd>] put_cred+0x13/0x15
[<ffffffff81069b45>] commit_creds+0x16b/0x175
[<ffffffff8106aace>] set_current_groups+0x47/0x4e
[<ffffffff8106ac89>] sys_setgroups+0xf6/0x105
[<ffffffff81009b02>] system_call_fastpath+0x16/0x1b
Code: 48 8d 71 ff e8 7e 4e 15 00 85 c0 78 0b 8b 75 ec 48 89 df e8 ef 4a 15 00
48 83 c4 18 5b c9 c3 55 8b 07 8b 07 48 89 e5 85 c0 74 04 <0f> 0b eb fe 65 48 8b
04 25 00 cc 00 00 48 3b b8 58 04 00 00 75
RIP [<ffffffff81069881>] __put_cred+0xc/0x45
RSP <ffff88019e7e9eb8>
---[ end trace df391256a100ebdd ]---
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds the DMA context programming and userspace ABI for multichannel
reception, i.e. for listening on multiple channel numbers by means of a
single DMA context.
The use case is reception of more streams than there are IR DMA units
offered by the link layer. This is already implemented by the older
ohci1394 + ieee1394 + raw1394 stack. And as discussed recently on
linux1394-devel, this feature is occasionally used in practice.
The big drawbacks of this mode are that buffer layout and interrupt
generation necessarily differ from single-channel reception: Headers
and trailers are not stripped from packets, packets are not aligned with
buffer chunks, interrupts are per buffer chunk, not per packet.
These drawbacks also cause a rather hefty code footprint to support this
rarely used OHCI-1394 feature. (367 lines added, among them 94 lines of
added userspace ABI documentation.)
This implementation enforces that a multichannel reception context may
only listen to channels to which no single-channel context on the same
link layer is presently listening to. OHCI-1394 would allow to overlay
single-channel contexts by the multi-channel context, but this would be
a departure from the present first-come-first-served policy of IR
context creation.
The implementation is heavily based on an earlier one by Jay Fenlason.
Thanks Jay.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
firewire-ohci keeps book of which isochronous channels are occupied by
IR DMA contexts, so that there cannot be more than one context listening
to a certain channel.
If IR context creation failed due to an out-of-memory condition, this
bookkeeping leaked a channel.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
When we append to a DMA program, we need to ensure that the order in
which initialization of the new descriptors and update of the
branch_address of the old tail descriptor, as seen by the PCI device,
happen as intended.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
ASUS P5-V provides a SSID that unexpectedly matches with the value
compilant with Realtek's specification. Thus the driver interprets
it badly, resulting in non-working PC beep.
This patch adds a white-list for such a case; a white-list of known
devices with working PC beep.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The ioread/iowrite accessors also need barriers as they're used in
place of readl/writel et.al. in portable drivers. Create __iormb()
and __iowmb() which are conditionally defined to be barriers dependent
on ARM_DMA_MEM_BUFFERABLE, and always use these macros in the accessors.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When the coherent DMA buffers are mapped as Normal Non-cacheable
(ARM_DMA_MEM_BUFFERABLE enabled), buffer accesses are no longer ordered
with Device memory accesses causing failures in device drivers that do
not use the mandatory memory barriers before starting a DMA transfer.
LKML discussions led to the conclusion that such barriers have to be
added to the I/O accessors:
http://thread.gmane.org/gmane.linux.kernel/683509/focus=686153http://thread.gmane.org/gmane.linux.ide/46414http://thread.gmane.org/gmane.linux.kernel.cross-arch/5250
This patch introduces a wmb() barrier to the write*() I/O accessors to
handle the situations where Normal Non-cacheable writes are still in the
processor (or L2 cache controller) write buffer before a DMA transfer
command is issued. For the read*() accessors, a rmb() is introduced
after the I/O to avoid speculative loads where the driver polls for a
DMA transfer ready bit.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch is in preparation for a subsequent patch which adds barriers
to the I/O accessors. Since the mandatory barriers may do an L2 cache
sync, this patch avoids a recursive call into l2x0_cache_sync() via the
write*() accessors and wmb() and a call into l2x0_cache_sync() with the
l2x0_lock held.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch introduces readl*_relaxed()/write*_relaxed() as the main I/O
accessors (when __mem_pci is defined). The standard read*()/write*()
macros are now based on the relaxed accessors.
This patch is in preparation for a subsequent patch which adds barriers
to the I/O accessors.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Don't use writeb() in uncompress.h, to avoid the following build errors
when the "Add barriers to the I/O accessors" series is applied. Use
__raw_writeb() instead.
arch/arm/boot/compressed/misc.o: In function `putc':
arch/arm/mach-ux500/include/mach/uncompress.h:41:
undefined reference to `outer_cache'
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Update the compressed boot Makefile for ARM to
remove files during clean.
Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We're adjusting horizontal timings only here, moving vsync was just a
slavish translation of a typo in the X server.
Signed-off-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Fix incorrectly reporting 'default' power profile, when it is set to 'mid'.
Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] ibmvscsi: Fix oops when an interrupt is pending during probe
[SCSI] zfcp: Update status read mempool
[SCSI] zfcp: Do not wait for SBALs on stopped queue
[SCSI] zfcp: Fix check whether unchained ct_els is possible
[SCSI] ipr: fix resource path display and formatting
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6:
davinci: da850/omap-l138 evm: account for DEFDCDC{2,3} being tied high
regulator: tps6507x: allow driver to use DEFDCDC{2,3}_HIGH register
wm8350-regulator: fix wm8350_register_regulator error handling
ab3100: fix off-by-one value range checking for voltage selector
The function ecryptfs_uid_hash wrongly assumes that the
second parameter to hash_long() is the number of hash
buckets instead of the number of hash bits.
This patch fixes that and renames the variable
ecryptfs_hash_buckets to ecryptfs_hash_bits to make it
clearer.
Fixes: CVE-2010-2492
Signed-off-by: Andre Osterhues <aosterhues@escrypt.com>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
HW breakpoints events stopped working correctly with kgdb
as a result of commit: 018cbffe68
(Merge commit 'v2.6.33' into perf/core).
The regression occurred because the behavior changed for setting
NOTIFY_STOP as the return value to the die notifier if the breakpoint
was known to the HW breakpoint API. Because kgdb is using the HW
breakpoint API to register HW breakpoints slots, it must also now
implement the overflow_handler call back else kgdb does not get to see
the events from the die notifier.
The kgdb_ll_trap function will be changed to be general purpose code
which can allow an easy way to implement the hw_breakpoint API
overflow call back.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Dongdong Deng <dongdong.deng@windriver.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
If we don't need a huge amount of memory in ->readdir() then
we can use kmalloc rather than vmalloc to allocate it. This
should cut down on the greater overheads associated with
vmalloc for smaller directories.
We may be able to eliminate vmalloc entirely at some stage,
but this is easy to do right away.
Also using GFP_NOFS to avoid any issues wrt to deleting inodes
while under a glock, and suggestion from Linus to factor out
the alloc/dealloc.
I've given this a test with a variety of different sized
directories and it seems to work ok.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Enable PC-beep as default for hardwares that aren't compliant with the
SSID value Realtek requires. In such a case, better to enable the beep
to avoid a regression.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
We check now the availability of PC beep and skip the build of beep
mixers, but the driver still registers the input device. This should
be checked as well.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Per the da850/omap-l138 Beta EVM SOM schematic, the DEFDCDC2 and
DEFDCDC3 lines are tied high. This leads to a 3.3V IO and 1.2V CVDD
voltage.
Pass the right platform data to the TPS6507x driver so it can operate
on the DEFDCDC{2,3}_HIGH register to read and change voltage levels.
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>