When do block-attach/block-detach test with below steps, umount hangs
in the guest. Furthermore shutdown ends up being stuck when umounting file-systems.
1. start guest.
2. attach new block device by xm block-attach in Dom0.
3. mount new disk in guest.
4. execute xm block-detach to detach the block device in dom0 until timeout
5. Any request to the disk will hung.
Root cause:
This issue is caused when setting backend device's state to
'XenbusStateClosing', which sends to the frontend the XenbusStateClosing
notification. When frontend receives the notification it tries to release
the disk in blkfront_closing(), but at that moment the disk is still in use
by guest, so frontend refuses to close. Specifically it sets the disk state to
XenbusStateClosing and sends the notification to backend - when backend receives the
event, it disconnects the vbd from real device, and sets the vbd device state to
XenbusStateClosing. The backend disconnects the real device/file, and any IO
requests to the disk in guest will end up in ether, leaving disk DEAD and set to
XenbusStateClosing. When the guest wants to disconnect the disk, umount will
hang on blkif_release()->xlvbd_release_gendisk() as it is unable to send any IO
to the disk, which prevents clean system shutdown.
Solution:
Don't disconnect backend until frontend state switched to XenbusStateClosed.
Signed-off-by: Joe Jin <joe.jin@oracle.com>
Cc: Daniel Stodden <daniel.stodden@citrix.com>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>
[v1: Modified description a bit]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
With the frontend having Xen but the backend not, it just looks odd:
<*> Xen virtual block device support
<*> Block-device backend driver
Fix it to have the 'Xen' in front of it.
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Avoid telling users to use xvde and onwards when using xvde.
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
These were intended to avoid the namespace clash when representing
emulated IDE and SCSI devices. However that seems to confuse users
more than expected (a disk defined as sda becomes xvde).
So for now go back to the scheme which does no adjustments. This
will break when mixing IDE and SCSI names in the configuration of
guests but should be by now expected.
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Add xen-backend:vbd module alias to the xen-blkback module. This allows
automatic loading of the module.
Signed-off-by: Bastian Blank <waldi@debian.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Running RING_FINAL_CHECK_FOR_REQUESTS from make_response is a bad
idea. It means that in-flight I/O is essentially blocking continued
batches. This essentially kills throughput on frontends which unplug
(or even just notify) early and rightfully assume addtional requests
will be picked up on time, not synchronously.
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
[v1: Rebased and fixed compile problems]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Second condition in OR always implies first condition is false
thus bytes_read in the second is not needed. The same goes to
bytes_written.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
POLLOUT should be returned only if bd->queued_cmds < bd->max_queue
so that bsg_alloc_command() can proceed.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
SLAB: Record actual last user of freed objects.
slub: always align cpu_slab to honor cmpxchg_double requirement
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: unwind canceled flock state
ceph: fix ENOENT logic in striped_read
ceph: fix short sync reads from the OSD
ceph: fix sync vs canceled write
ceph: use ihold when we already have an inode ref
Return error on out of range cpsdvsr value.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Virupax Sadashivpetimath <virupax.sadashivpetimath@stericsson.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: use join_transaction in btrfs_evict_inode()
Btrfs - use %pU to print fsid
Btrfs: fix extent state leak on failed nodatasum reads
btrfs: fix unlocked access of delalloc_inodes
Btrfs: avoid stack bloat in btrfs_ioctl_fs_info()
btrfs: remove 64bit alignment padding to allow extent_buffer to fit into one fewer cacheline
Btrfs: clear current->journal_info on async transaction commit
Btrfs: make sure to recheck for bitmaps in clusters
btrfs: remove unneeded includes from scrub.c
btrfs: reinitialize scrub workers
btrfs: scrub: errors in tree enumeration
Btrfs: don't map extent buffer if path->skip_locking is set
Btrfs: unlock the trans lock properly
Btrfs: don't map extent buffer if path->skip_locking is set
Btrfs: fix duplicate checking logic
Btrfs: fix the allocator loop logic
Btrfs: fix bitmap regression
Btrfs: don't commit the transaction if we dont have enough pinned bytes
Btrfs: noinline the cluster searching functions
Btrfs: cache bitmaps when searching for a cluster
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda: Fix inaudible internal speakers on CyberpowerPC Gamer Xplorer N57001 laptop
ALSA: Use %pV for snd_printk()
ALSA: hda - Fix initialization of hp pins with master_mute in Realtek
ALSA: hda - Fix invalid unsol tag for some alc262 model quirks
ASoC: SAMSUNG: Fix the incorrect referencing of I2SCON register
ASoC: snd_soc_new_{mixer,mux,pga} make sure to use right DAPM context
ASoC: fsl: fix initialization of DMA buffers
ASoC: WM8804 does not support sample rates below 32kHz
ASoC: Fix WM8962 headphone volume update for use of advanced caches
ASoC: Blackfin: bf5xx-ad1836: Fix codec device name
ALSA: hda: Fix quirk for Dell Inspiron 910
ASoC: AD1836: Fix setting the PCM format
ASoC: Check for NULL register bank in snd_soc_get_cache_val()
ASoC: Add missing break in WM8915 FLL source selection
ASoC: Only update SYSCLK_ENA when pausing WM8915 SYSCLK
ASoC: atmel_ssc: Don't try to free ssc if request failed
* 'gpio/merge' of git://git.secretlab.ca/git/linux-2.6:
gpio/basic_mmio: add missing include of spinlock_types.h
gpio/nomadik: fix sleepmode for elder Nomadik
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
AppArmor: Fix sleep in invalid context from task_setrlimit
We leak the memory allocated to 'phi' when the variable goes out of scope
in hfcsusb_ph_info().
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a dev_put(ndev) missing on an error path. This was
introduced in 0c1ad04aec "netpoll: prevent netpoll setup on slave
devices".
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King said:
>
> So, to summarize what its doing:
>
> 1. It allocates buffers for rx and tx.
> 2. It maps them with dma_map_single().
> This transfers ownership of the buffer to the DMA device.
> 3. In ep93xx_xmit,
> 3a. It copies the data into the buffer with skb_copy_and_csum_dev()
> This violates the DMA buffer ownership rules - the CPU should
> not be writing to this buffer while it is (in principle) owned
> by the DMA device.
> 3b. It then calls dma_sync_single_for_cpu() for the buffer.
> This transfers ownership of the buffer to the CPU, which surely
> is the wrong direction.
> 4. In ep93xx_rx,
> 4a. It calls dma_sync_single_for_cpu() for the buffer.
> This at least transfers the DMA buffer ownership to the CPU
> before the CPU reads the buffer
> 4b. It then uses skb_copy_to_linear_data() to copy the data out.
> At no point does it transfer ownership back to the DMA device.
> 5. When the driver is removed, it dma_unmap_single()'s the buffer.
> This transfers ownership of the buffer to the CPU.
> 6. It frees the buffer.
>
> While it may work on ep93xx, it's not respecting the DMA API rules,
> and with DMA debugging enabled it will probably encounter quite a few
> warnings.
This patch fixes these violations.
Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Tested-by: Petr Stetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit a197b59ae6 (mm: fail GFP_DMA allocations when ZONE_DMA is not
configured) made page allocator to return NULL if GFP_DMA is set but
CONFIG_ZONE_DMA is disabled.
This causes ep93xx_eth to fail:
WARNING: at mm/page_alloc.c:2251 __alloc_pages_nodemask+0x11c/0x638()
Modules linked in:
[<c0035498>] (unwind_backtrace+0x0/0xf4) from [<c0043da4>] (warn_slowpath_common+0x48/0x60)
[<c0043da4>] (warn_slowpath_common+0x48/0x60) from [<c0043dd8>] (warn_slowpath_null+0x1c/0x24)
[<c0043dd8>] (warn_slowpath_null+0x1c/0x24) from [<c0083b6c>] (__alloc_pages_nodemask+0x11c/0x638)
[<c0083b6c>] (__alloc_pages_nodemask+0x11c/0x638) from [<c00366fc>] (__dma_alloc+0x8c/0x3ec)
[<c00366fc>] (__dma_alloc+0x8c/0x3ec) from [<c0036adc>] (dma_alloc_coherent+0x54/0x60)
[<c0036adc>] (dma_alloc_coherent+0x54/0x60) from [<c0227808>] (ep93xx_open+0x20/0x864)
[<c0227808>] (ep93xx_open+0x20/0x864) from [<c0283144>] (__dev_open+0xb8/0x108)
[<c0283144>] (__dev_open+0xb8/0x108) from [<c0280528>] (__dev_change_flags+0x70/0x128)
[<c0280528>] (__dev_change_flags+0x70/0x128) from [<c0283054>] (dev_change_flags+0x10/0x48)
[<c0283054>] (dev_change_flags+0x10/0x48) from [<c001a720>] (ip_auto_config+0x190/0xf68)
[<c001a720>] (ip_auto_config+0x190/0xf68) from [<c00233b0>] (do_one_initcall+0x34/0x18c)
[<c00233b0>] (do_one_initcall+0x34/0x18c) from [<c0008400>] (kernel_init+0x94/0x134)
[<c0008400>] (kernel_init+0x94/0x134) from [<c0030858>] (kernel_thread_exit+0x0/0x8)
Since there is no restrictions for DMA on ep93xx, we can fix this by just
removing the GFP_DMA flag from the call.
Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Tested-by: Petr Stetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can use simply kmalloc() to allocate the buffers. This also simplifies the
code and allows us to perform DMA sync operations more easily.
Memory is allocated with only GFP_KERNEL since there are no DMA allocation
restrictions on this platform.
Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Tested-by: Petr Stetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
We shouldn't use NULL for any DMA API functions, unless we are dealing with
ISA or EISA device. So pass correct struct dev pointer to these functions.
Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the driver uses the DMA API, we should pass it valid DMA masks.
Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Tested-by: Petr Stetiar <ynezz@true.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Testing of VLAN_FLAG_REORDER_HDR does not belong in vlan_untag
but rather in vlan_do_receive. Otherwise the vlan header
will not be properly put on the packet in the case of
vlan header accelleration.
As we remove the check from vlan_check_reorder_header
rename it vlan_reorder_header to keep the naming clean.
Fix up the skb->pkt_type early so we don't look at the packet
after adding the vlan tag, which guarantees we don't goof
and look at the wrong field.
Use a simple if statement instead of a complicated switch
statement to decided that we need to increment rx_stats
for a multicast packet.
Hopefully at somepoint we will just declare the case where
VLAN_FLAG_REORDER_HDR is cleared as unsupported and remove
the code. Until then this keeps it working correctly.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Acked-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix:
/tmp/ccvoZ6h8.s: Assembler messages:
/tmp/ccvoZ6h8.s:284: Warning: register range not in ascending order
/tmp/ccvoZ6h8.s:881: Warning: register range not in ascending order
/tmp/ccvoZ6h8.s:1087: Warning: register range not in ascending order
by ensuring that we have temporary variables placed into specific
registers. Reorder the code a bit to allow the resulting assembly
to be slightly more optimal.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
We were clearing out the multicast filter whenever the interface was
upped, and not setting the mode bits correctly. This can cause
problems if there are any multicast addresses already set at this
point, or if ALLMULTI was set.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Without this the compiler can (and does) optimize register reads away
from within loops, and other such optimizations.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
One of the legit warnings 'make W=3 drivers/ide/ide-cd.c'
generates is:
drivers/ide/ide-cd.c: In function ide_cd_do_request
drivers/ide/ide-cd.c:828:2: warning: conversion to int from \
unsigned int may change the sign of the result
drivers/ide/ide-cd.c:833:2: warning: conversion to int from \
unsigned int may change the sign of the result
nsectors is declared int, should be unsigned int.
blk_rq_sectors() returns unsigned int, and ide_complete_rq
expects unsigned int as well. Fixes both warnings.
Signed-off-by: Connor Hansen <cmdkhh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It uses cpu_relax(), and so needs <asm/processor.h>
Without this patch, I see:
CC arch/mn10300/kernel/asm-offsets.s
In file included from include/linux/time.h:8,
from include/linux/timex.h:56,
from include/linux/sched.h:57,
from arch/mn10300/kernel/asm-offsets.c:7:
include/linux/seqlock.h: In function 'read_seqbegin':
include/linux/seqlock.h:91: error: implicit declaration of function 'cpu_relax'
whilst building asb2364_defconfig on MN10300.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The WARN_ON() in start_transaction() was triggered while balancing.
The cause is btrfs_relocate_chunk() started a transaction and
then called iput() on the inode that stores free space cache,
and iput() called btrfs_start_transaction() again.
Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Checkpoint generation interval of nilfs goes wrong after user has
changed the interval parameter with nilfs-tune tool.
segctord starting. Construction interval = 5 seconds,
CP frequency < 30 seconds
segctord starting. Construction interval = 0 seconds,
CP frequency < 30 seconds
This turned out to be caused by a trivial bug in initialization code
of log writer. This will fix it.
Reported-by: Andrea Gelmini <andrea.gelmini@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
nilfs_btree_delete function does not terminate part of virtual block
addresses when shrinking the last remaining child node into the root
node. The missing address termination causes that dead btree node
blocks persist and chip away free disk space.
This fixes the leak bug on the btree node deletion.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
nilfs_btree_delete function wrongly terminates virtual block address
of the btree node held by its parent at index 0. When concatenating
the index-0 node with its right sibling node, nilfs_btree_delete
terminates the block address of index-0 node instead of the right
sibling node which should be deleted.
This bug not only wears disk space in the long run, but also causes
file system corruption. This will fix it.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Get rid of FIXME comment. Uuids from dmesg are now the same as uuids
given by btrfs-progs.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
When encountering an EIO while reading from a nodatasum extent, we
insert an error record into the inode's failure tree.
btrfs_readpage_end_io_hook returns early for nodatasum inodes. We'd
better clear the failure tree in that case, otherwise the kernel
complains about
BUG extent_state: Objects remaining on kmem_cache_close()
on rmmod.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
list_splice_init will make delalloc_inodes empty, but without a spinlock
around, this may produce corrupted list head, accessed in many placess,
The race window is very tight and nobody seems to have hit it so far.
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
The size of struct btrfs_ioctl_fs_info_args is as big as 1KB, so
don't declare the variable on stack.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Reviewed-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Reorder extent_buffer to remove 8 bytes of alignment padding on 64 bit
builds. This shrinks its size to 128 bytes allowing it to fit into one
fewer cache lines and allows more objects per slab in its kmem_cache.
slabinfo extent_buffer reports :-
before:-
Sizes (bytes) Slabs
----------------------------------
Object : 136 Total : 123
SlabObj: 136 Full : 121
SlabSiz: 4096 Partial: 0
Loss : 0 CpuSlab: 2
Align : 8 Objects: 30
after :-
Object : 128 Total : 4
SlabObj: 128 Full : 2
SlabSiz: 4096 Partial: 0
Loss : 0 CpuSlab: 2
Align : 8 Objects: 32
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Normally current->jouranl_info is cleared by commit_transaction. For an
async snap or subvol creation, though, it runs in a work queue. Clear
it in btrfs_commit_transaction_async() to avoid leaking a non-NULL
journal_info when we return to userspace. When the actual commit runs in
the other thread it won't care that it's current->journal_info is already
NULL.
Signed-off-by: Sage Weil <sage@newdream.net>
Tested-by: Jim Schutt <jaschut@sandia.gov>
Signed-off-by: Chris Mason <chris.mason@oracle.com>