The MUSB code clears TXCSR_DMAMODE incorrectly in several
places, either asserting that TXCSR_DMAENAB is clear (when
sometimes it isn't) or clearing both bits together. Recent
versions of the programmer's guide require DMAENAB to be
cleared first, although some older ones didn't.
Fix this and while at it:
- In musb_gadget::txstate(), stop clearing the AUTOSET
and DMAMODE bits for the CPPI case since they never
get set anyway (the former bit is reserved on DaVinci);
but do clear the DMAENAB bit on the DMA error path.
- In musb_host::musb_ep_program(), remove the duplicate
DMA controller specific code code clearing the TXCSR
previous state, add the code to clear TXCSR DMA bits
on the Inventra DMA error path, to replace such code
(executed late) on the PIO path.
- In musbhsdma::dma_channel_abort()/dma_controller_irq(),
add/use the 'offset' variable to avoid MUSB_EP_OFFSET()
invocations on every RXCSR/TXCSR access.
[dbrownell@users.sourceforge.net: don't introduce CamelCase,
shrink diff]
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We really want to use DMA mode 1 for all multi-packet transfers;
that's one IRQ on DMA completion, instead of one per packet.
There is an important issue with such transfers, especially on
the host side: when such transfers end with a full-size packet,
we must defer musb_dma_completion() calls until the FIFO empties.
Else we report URB completions too soon, and may clobber data in
the FIFO fifo when writing the next packet (losing data).
The Inventra DMA support uses DMA mode 1, but it ignores that
issue. The CPPI DMA support uses mode 0, but doesn't handle
its TXPKTRDY interrupts quite right either; it can get stale
"packet ready" interrupts, and report transfer completion too
early using slightly different code paths, also losing data.
So I'm solving it in a generic way -- by adding a sort of the
"interrupt filter" into musb_host_tx(), catching these cases
where a DMA completion IRQ doesn't suffice and removing some
needlessly controller-specific logic. When a TXDMA interrupt
happens and DMA request mode 1 is active, that filter resets
to mode 0 and defers URB completion processing until TXPKTRDY,
unless the FIFO is already empty. Related filtering logic in
Inventra and CPPI code gets removed.
Since it should be competely safe now to use the DMA request
mode 1 for host side transfers with the CPPI DMA controller,
set it in musb_h_tx_dma_start() ... now renamed (and shared).
[ dbrownell@users.sourceforge.net: don't introduce more
CamElCase; use more concise explanations ]
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Felipe Balbi <felipe.balbi@nokia.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The MUSB host side can't share generic TX FIFO flush logic
with EP0; the EP0 TX status register bits are different
from those for other entpoints.
Resolve this issue by providing a new EP0-specific routine
to flush and reset the FIFO, which pays careful attention to
restrictions listed in the latest programmer's guide. This
gets rid of an open issue whereby the usbtest control write
test (#14) failed.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1227) adds the MAX_SECTORS_64 flag to the unusual_devs
entry for the Simple Tech/Datafab controller. This fixes Bugzilla
#12882.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: binbin <binbinsh@gmail.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Someone noted that the enqueue path used an unlocked access
for usb_host_endpoint->hcpriv ... fix that, by being safe
and always accessing it under spinlock protection.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add a set of device IDs from the Windows drivers. These aren't complete
(there's a couple of cases where a QDL device is identified without the
associated modem being identified), but it's better than the current
situation.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch allows D-Link DWM-652 3.5G modem to work.
It is an express card but was only tested with the provided usb adapter as I
don't have machines with express card connector.
/dev/ttyUSB{0,1,2} get created, and using comgt on ttyUSB1 works fine :
[root@plop tmp]# comgt -d /dev/ttyUSB1 -e
Enter PIN number: XXXX
Waiting for Registration..(120 sec max).
Registered on Home network: "Orange France",2
Signal Quality: 15,99
From: Pascal Terjan <pterjan@mandriva.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The g_ether USB gadget driver currently decides whether or not there's a
link to report back for eth_get_link based on if the USB link speed is
set. The USB gadget speed is however often set even before the device is
enumerated. It seems more sensible to only report a "link" if we're
actually connected to a host that wants to talk to us. The patch below
does this for me - tested with the PXA27x UDC driver.
Signed-off-by: Jonathan McDowell <noodles@earth.li>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Avoid printing sched_group::__cpu_power for default case
tracing, sched: mark get_parent_ip() notrace
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
kernel/softirq.c: fix sparse warning
rcu: Make hierarchical RCU less IPI-happy
Fix sparse warning in kernel/softirq.c.
warning: do-while statement is not a compound statement
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
LKML-Reference: <BD79186B4FD85F4B8E60E381CAEE1909015F9033@mi8nycmail19.Mi8.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86/uv' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: UV BAU distribution and payload MMRs
x86: UV: BAU partition-relative distribution map
x86, uv: add Kconfig dependency on NUMA for UV systems
x86: prevent /sys/firmware/sgi_uv from being created on non-uv systems
x86, UV: Fix for nodes with memory and no cpus
x86, UV: system table in bios accessed after unmap
x86: UV BAU messaging timeouts
x86: UV BAU and nodes with no memory
Commit 46e0bb9c12 ("sched: Print sched_group::__cpu_power
in sched_domain_debug") produces a messy dmesg output while
attempting to print the sched_group::__cpu_power for each
group in the sched_domain hierarchy.
Fix this by avoid printing the __cpu_power for default cases.
(i.e, __cpu_power == SCHED_LOAD_SCALE).
[ Impact: reduce syslog clutter ]
Reported-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Fixed-by: Tony Luck <tony.luck@intel.com>
Cc: a.p.zijlstra@chello.nl
LKML-Reference: <20090414033936.GA534@in.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
ata: Report 16/32bit PIO as best we can
libata: use ATA_ID_CFA_*
pata_legacy: fix no device fail path
pata_hpt37x: fix HPT370 DMA timeouts
libata: handle SEMB signature better
Tetsuo Handa reports seeing the WARN_ON(current->mm == NULL) in
security_vm_enough_memory(), when do_execve() is touching the
target mm's stack, to set up its args and environment.
Yes, a UMH_NO_WAIT or UMH_WAIT_PROC call_usermodehelper() spawns
an mm-less kernel thread to do the exec. And in any case, that
vm_enough_memory check when growing stack ought to be done on the
target mm, not on the execer's mm (though apart from the warning,
it only makes a slight tweak to OVERCOMMIT_NEVER behaviour).
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit f520360d93.
Tetsuo Handa, running a kernel with CONFIG_DEBUG_PAGEALLOC=y and
CONFIG_UEVENT_HELPER_PATH=/sbin/hotplug, has been hitting RCU detected
CPU stalls: it's been spinning in the loop where do_execve() counts up
the args (but why wasn't fixup_exception working? dunno).
The recent change, switching kobject_uevent_env() from UMH_WAIT_EXEC
to UMH_NO_WAIT, is broken: the exec uses args on the local stack here,
and an env which is kfreed as soon as call_usermodehelper() returns.
It very much needs to wait for the exec to be done.
An alternative would be to keep the UMH_NO_WAIT, and complicate the code
to allocate and free these resources correctly? but no, as GregKH
pointed out when making the commit, CONFIG_UEVENT_HELPER_PATH="" is a
much better optimization - though some distros are still saying
/sbin/hotplug in their .config, yet with no such binary in their initrd
or their root.
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Will Newton <will.newton@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The legacy old IDE ioctl API for this is a bit primitive so we try
and map stuff sensibly onto it.
- Set PIO over DMA devices to report 32bit
- Add ability to change the PIO32 settings if the controller permits it
- Add that functionality into the sff drivers
- Add that functionality into the VLB legacy driver
- Turn on the 32bit PIO on the ninja32 and add support there
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Use ATA_ID_CFA_* constants for CFA specific identify data words 162 and 163.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
When pata_legacy can't detect any device, it unregisters the
platform_device and fails detection. However, it forgets to detach
ata host triggering weird failures as the host later gets freed by
devres while still attached. Fix it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The libata driver has copied the code from the IDE driver which caused a post
2.4.18 regression on many HPT370[A] chips -- DMA stopped to work completely,
only causing timeouts. Now remove hpt370_bmdma_start() for good...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
WDC WD1600JS-62MHB5 successfully hits the window between ATA/ATAPI-7
and Serial ATA II standards and reports 3c/c3 signature which now is
assigned to SEMB. Make ata_dev_classify() report ATA_DEV_SEMB on the
sig and let ata_dev_read_id() work around it by trying IDENTIFY once.
This fixes bko#11579.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: David Haun <drhaun88@gmail.com>
Reported-by: Lars Wirzenius <liw@liw.fi>
Reported-by: Juan Manuel <jmcarranza@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This patch correctly sets BAU memory mapped registers to point
to the sending activation descriptor table and target payload table.
The "Broadcast Assist Unit" is used for TLB shootdown in UV.
The memory mapped registers that point to sending and receiving
memory structures contain node numbers.
In one case the __pa() function did not provide the node id of
memory on blade zero in configurations where that id is nonzero.
In another case, it was assumed that memory was allocated on
the local node. That assumption is not true in a configuration
in which the node has no memory.
Tested on the UV hardware simulator.
[ Impact: fix possible runtime crash due to incorrect TLB logic ]
Signed-off-by: Cliff Wickman <cpw@sgi.com>
LKML-Reference: <E1LuR5Z-0007An-B8@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
block_write_full_page doesn't allow the caller to control what happens
when the IO is over. This adds a new call named block_write_full_page_endio
so the buffer head end_io handler can be provided by the caller.
This will be used by the ext3 data=guarded mode to do i_size updates in
a workqueue based end_io handler. end_buffer_async_write is also
exported so it can be called to do the dirty work of managing page
writeback for the higher level end_io handler.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Acked-by: Theodore Tso <tytso@mit.edu>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This wasn't exported before and is useful (used by the experimental ext3
data=guarded code)
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Acked-by: Theodore Tso <tytso@mit.edu>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (64 commits)
phylib: Fix delay argument of schedule_delayed_work
NET/ixgbe: Fix powering off during shutdown
NET/e1000e: Fix powering off during shutdown
NET/e1000: Fix powering off during shutdown
packet: avoid warnings when high-order page allocation fails
gianfar: stop send queue before resetting gianfar
myr10ge: again fix lro_gen_skb() alignment
declance: convert to net_device_ops
bfin_mac: convert to net_device_ops
au1000: convert to net_device_ops
atarilance: convert to net_device_ops
a2065: convert to net_device_ops
ixgbe: update real_num_tx_queues on changing num_rx_queues
ixgbe: fix tx queue index
Revert "rose: zero length frame filtering in af_rose.c"
sfc: Use correct macro to set event bitfield
sfc: Match calls to netif_napi_add() and netif_napi_del()
bonding: Remove debug printk
e1000/e1000: fix compile warning
ehea: Fix incomplete conversion to net_device_ops
...
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc: remove some pointless conditionals before kfree()
sbus: changed ioctls to unlocked
sparc: asm/atomic.h on 32bit should include asm/system.h for xchg
sparc64: Fix smp_callin() locking.
The commit a390d1f3 ("phylib: convert state_queue work to
delayed_work") missed converting 'expires' value to 'delay' value.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Acked-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prevent ixgbe from putting the adapter into D3 during shutdown except when
we're going to power off the system, since doing that may generally cause
problems with kexec to happen (such problems were observed for igb and
forcedeth). For this purpose seperate ixgbe_shutdown() from ixgbe_suspend()
and use the appropriate PCI PM callbacks in both of them.
Signed-off-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prevent e1000e from putting the adapter into D3 during shutdown except when
we're going to power off the system, since doing that may generally cause
problems with kexec to happen (such problems were observed for igb and
forcedeth). For this purpose seperate e1000e_shutdown() from e1000e_suspend()
and use the appropriate PCI PM callbacks in both of them.
Signed-off-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prevent e1000 from putting the adapter into D3 during shutdown except when
we're going to power off the system, since doing that may generally cause
problems with kexec to happen (such problems were observed for igb and
forcedeth). For this purpose seperate e1000_shutdown() from e1000_suspend()
and use the appropriate PCI PM callbacks in both of them.
Signed-off-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't try and predeclare inline funcs like this:
static inline void wait_migrated_callbacks(void)
...
static void _rcu_barrier(enum rcu_barrier type)
{
...
wait_migrated_callbacks();
}
...
static inline void wait_migrated_callbacks(void)
{
wait_event(rcu_migrate_wq, !atomic_read(&rcu_migrate_type_count));
}
as it upsets some versions of gcc under some circumstances:
kernel/rcupdate.c: In function `_rcu_barrier':
kernel/rcupdate.c:125: sorry, unimplemented: inlining failed in call to 'wait_migrated_callbacks': function body not available
kernel/rcupdate.c:152: sorry, unimplemented: called from here
This can be dealt with by simply putting the static variables (rcu_migrate_*)
at the top, and moving the implementation of the function up so that it
replaces its forward declaration.
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The default CONFIG_BUG=n version of BUG() should incorporate an empty a
do...while statement to avoid compilation weirdness.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stop gcc from generating uninitialised variable warnings after BUG(). The
problem is that MN10300's implementation of BUG() invokes system call 15 which
doesn't return - but there's no way to tell the compiler that and also emit the
bug table element with the correct file and line data.
So instead, we make the do...while wrapper in _debug_bug_trap() an endless loop
from which there's no escape.
Also, while we're at it, (1) get rid of _debug_bug_trap() and just implement
directly as BUG(), and (2) make the implementation of BUG() contingent on
CONFIG_BUG=y.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wire up missing system calls preadv() and pwritev().
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Discard duplicate PFN_xxx() macros from arch code as they're now in the
general headers.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
[S390] boot cputime accounting
[S390] add read_persistent_clock
[S390] cpu hotplug and accounting values
[S390] fix idle time accounting
[S390] smp: fix cpu_possible_map initialization
[S390] dasd: fix idaw boundary checking for track based ccw
[S390] dasd: Use the new async framework for autoonlining.
[S390] qdio: remove dead timeout handler
[S390] appldata: Use new mod_virt_timer_periodic() function.
[S390] extend virtual timer interface by mod_virt_timer_periodic
[S390] stp synchronization retry timer
[S390] call nmi_enter/nmi_exit on machine checks
[S390] wire up preadv/pwritev system calls
[S390] s390: move machine flags to lowcore
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda - Fix the cmd cache keys for amp verbs
ALSA: add missing definitions(letters) to HD-Audio.txt
ALSA: hda - Add quirk mask for Fujitsu Amilo laptops with ALC883
[ALSA] intel8x0: add one retry to the ac97_clock measurement routine
[ALSA] intel8x0: fix wrong conditions in ac97_clock measure routine
ALSA: hda - Avoid call of snd_jack_report at release
ALSA: add private_data to struct snd_jack
ALSA: snd-usb-caiaq: rename files to remove redundant information in file pathes
ALSA: snd-usb-caiaq: clean up header includes
ALSA: sound/pci: use memdup_user()
ALSA: sound/usb: use memdup_user()
ALSA: sound/isa: use memdup_user()
ALSA: sound/core: use memdup_user()
[ALSA] intel8x0: do not use zero value from PICB register
[ALSA] intel8x0: an attempt to make ac97_clock measurement more reliable
[ALSA] pcm-midlevel: Add more strict buffer position checks based on jiffies
[ALSA] hda_intel: fix unexpected ring buffer positions
ASoC: Disable S3C64xx support in Kconfig
ASoC: magician: remove un-necessary #include of pxa-regs.h and hardware.h
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (28 commits)
cfq-iosched: add close cooperator code
cfq-iosched: log responsible 'cfqq' in idle timer arm
cfq-iosched: tweak kick logic a bit more
cfq-iosched: no need to save interrupts in cfq_kick_queue()
brd: fix cacheflushing
brd: support barriers
swap: Remove code handling bio_alloc failure with __GFP_WAIT
gfs2: Remove code handling bio_alloc failure with __GFP_WAIT
ext4: Remove code handling bio_alloc failure with __GFP_WAIT
dio: Remove code handling bio_alloc failure with __GFP_WAIT
block: Remove code handling bio_alloc failure with __GFP_WAIT
bio: add documentation to bio_alloc()
splice: add helpers for locking pipe inode
splice: remove generic_file_splice_write_nolock()
ocfs2: fix i_mutex locking in ocfs2_splice_to_file()
splice: fix i_mutex locking in generic_splice_write()
splice: remove i_mutex locking in splice_from_pipe()
splice: split up __splice_from_pipe()
block: fix SG_IO to return a proper error value
cfq-iosched: don't delay queue kick for a merged request
...
Fix the key value generation for get/set amp verbs. The upper bits of
the parameter have to be combined with the verb value to be unique for
each direction/index of amp access.
This fixes the resume problem on some hardwares like Macbook after
the channel mode is changed.
Tested-by: Johannes Berg <johannes@sipsolutions.net>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc: pseries/dtl.c should include asm/firmware.h
powerpc: Fix data-corrupting bug in __futex_atomic_op
powerpc/pseries: Set error_state to pci_channel_io_normal in eeh_report_reset()
powerpc: Allow 256kB pages with SHMEM
powerpc: Document new FSL I2C bindings and cleanup
powerpc/mm: Fix compile warning
powerpc/85xx: TQM8548: update defconfig
powerpc/85xx: TQM8548: use proper phy-handles for enet2 and enet3
powerpc/85xx: TQM85xx: correct address of LM75 I2C device nodes
powerpc: Add support for early tlbilx opcode
powerpc: Fix tlbilx opcode
It turns out that 'smp_call_function_many()' doesn't work at all like
'smp_call_function_single()', and my change to Andrew's patch to use it
rather than a loop over all CPU's acpi-cpufreq doesn't work.
My bad.
'smp_call_function_many()' has two "features" (aka "documented bugs"):
(a) it needs to be called with preemption disabled, because it uses
smp_processor_id() without guarding the CPU lookup with 'get_cpu()'
and 'put_cpu()' like the 'single' variant does.
(b) even if the current CPU is part of the CPU mask, it won't do the
call on that CPU.
Still, we're better off trying to use 'smp_call_function_many()' than
looping over CPU's, since it at least in theory allows us to use a
broadcast IPI and do it all in parallel. So let's just work around the
silly semantic bugs in that function.
Reported-and-tested-by: Ali Gholami Rudi <ali@rudi.ir>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If we have processes that are working in close proximity to each
other on disk, we don't want to idle wait. Instead allow the close
process to issue a request, getting better aggregate bandwidth.
The anticipatory scheduler has similar checks, noop and deadline do
not need it since they don't care about process <-> io mappings.
The code for CFQ is a little more involved though, since we split
request queues into per-process contexts.
This fixes a performance problem with eg dump(8), since it uses
several processes in some silly attempt to speed IO up. Even if
dump(8) isn't really a valid case (it should be fixed by using
CLONE_IO), there are other cases where we see close processes
and where idling ends up hurting performance.
Credit goes to Jeff Moyer <jmoyer@redhat.com> for writing the
initial implementation.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
We only kick the dispatch for an idling queue, if we think it's a
(somewhat) fully merged request. Also allow a kick if we have other
busy queues in the system, since we don't want to risk waiting for
a potential merge in that case. It's better to get some work done and
proceed.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>