The open_cnt check is an open-coded D_ASSERT() check.
In case the data.work queue is not empty, it does not really help to
know which drbd_work elements remained on that list: they will be freed
immediately afterwards, anyway.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Remove the file name and line number from the syslog messages generated:
we have no duplicate function names, and no function contains the same
assertion more than once.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
We do not do collision detection for read requests, but we still need to
look up the request objects when we receive a package over the network.
Using the same data structure for read and write requests results in
simpler code once the tl_hash and app_reads_hash tables are removed.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Unify the ar_id_to_req() and ack_id_to_req() functions: make both fail
if the consistency check fails. Move the request lookup code now
duplicated in both functions into its own function.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Move _ar_id_to_req() to drbd_receiver.c and mark it non-inline. Remove
the leading underscores from _ar_id_to_req() and _ack_id_to_req(). Mark
ar_hash_slot() inline.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
The ID_VACANT definition has become entirely irrelevant by now.
The is_syncer_block_id() macro does not improve the code, so eliminated
it.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
DRBD_MAGIC has nothing to do with block ids and the funny values
computed were not actually used, anyway.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Automatic partition scanning can be requested individually per loop
device during its setup by setting LO_FLAGS_PARTSCAN. By default, no
partition tables are scanned.
Userspace can now always add and remove partitions from all loop
devices, regardless if the in-kernel partition scanner is enabled or
not.
The needed partition minor numbers are allocated from the extended
minors space, the main loop device numbers will continue to match the
loop minors, regardless of the number of partitions used.
# grep . /sys/class/block/loop1/loop/*
/sys/block/loop1/loop/autoclear:0
/sys/block/loop1/loop/backing_file:/home/kay/data/stuff/part.img
/sys/block/loop1/loop/offset:0
/sys/block/loop1/loop/partscan:1
/sys/block/loop1/loop/sizelimit:0
# ls -l /dev/loop*
brw-rw---- 1 root disk 7, 0 Aug 14 20:22 /dev/loop0
brw-rw---- 1 root disk 7, 1 Aug 14 20:23 /dev/loop1
brw-rw---- 1 root disk 259, 0 Aug 14 20:23 /dev/loop1p1
brw-rw---- 1 root disk 259, 1 Aug 14 20:23 /dev/loop1p2
brw-rw---- 1 root disk 7, 99 Aug 14 20:23 /dev/loop99
brw-rw---- 1 root disk 259, 2 Aug 14 20:23 /dev/loop99p1
brw-rw---- 1 root disk 259, 3 Aug 14 20:23 /dev/loop99p2
crw------T 1 root root 10, 237 Aug 14 20:22 /dev/loop-control
Cc: Karel Zak <kzak@redhat.com>
Cc: Davidlohr Bueso <dave@gnu.org>
Acked-By: Tejun Heo <tj@kernel.org>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
When do block-attach/block-detach test with below steps, umount hangs
in the guest. Furthermore shutdown ends up being stuck when umounting file-systems.
1. start guest.
2. attach new block device by xm block-attach in Dom0.
3. mount new disk in guest.
4. execute xm block-detach to detach the block device in dom0 until timeout
5. Any request to the disk will hung.
Root cause:
This issue is caused when setting backend device's state to
'XenbusStateClosing', which sends to the frontend the XenbusStateClosing
notification. When frontend receives the notification it tries to release
the disk in blkfront_closing(), but at that moment the disk is still in use
by guest, so frontend refuses to close. Specifically it sets the disk state to
XenbusStateClosing and sends the notification to backend - when backend receives the
event, it disconnects the vbd from real device, and sets the vbd device state to
XenbusStateClosing. The backend disconnects the real device/file, and any IO
requests to the disk in guest will end up in ether, leaving disk DEAD and set to
XenbusStateClosing. When the guest wants to disconnect the disk, umount will
hang on blkif_release()->xlvbd_release_gendisk() as it is unable to send any IO
to the disk, which prevents clean system shutdown.
Solution:
Don't disconnect backend until frontend state switched to XenbusStateClosed.
Signed-off-by: Joe Jin <joe.jin@oracle.com>
Cc: Daniel Stodden <daniel.stodden@citrix.com>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>
[v1: Modified description a bit]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
This commit adds discard support for loop devices. Discard is usually
supported by SSD and thinly provisioned devices as a method for
reclaiming unused space. This is no different than trying to reclaim
back space which is not used by the file system on the image, but it
still occupies space on the host file system.
We can do the reclamation on file system which does support hole
punching. So when discard request gets to the loop driver we can
translate that to punch a hole to the underlying file, hence reclaim
the free space.
This is very useful for trimming down the size of the image to only what
is really used by the file system on that image. Fstrim may be used for
that purpose.
It has been tested on ext4, xfs and btrfs with the image file systems
ext4, ext3, xfs and btrfs. ext4, or ext6 image on ext4 file system has
some problems but it seems that ext4 punch hole implementation is
somewhat flawed and it is unrelated to this commit.
Also this is a very good method of validating file systems punch hole
implementation.
Note that when encryption is used, discard support is disabled, because
using it might leak some information useful for possible attacker.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
ERROR: code indent should use tabs where possible
#30: FILE: drivers/block/nbd.c:578:
+^I dev_info(disk_to_dev(lo->disk), "NBD_DISCONNECT\n");$
total: 1 errors, 0 warnings, 35 lines checked
NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
scripts/cleanfile
./patches/nbd-replace-some-printk-with-dev_warn-and-dev_info.patch has style problems, please review.
If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.
Please run checkpatch prior to sending patches
Cc: Paul Clements <Paul.Clements@steeleye.com>
Cc: WANG Cong <amwang@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
With the frontend having Xen but the backend not, it just looks odd:
<*> Xen virtual block device support
<*> Block-device backend driver
Fix it to have the 'Xen' in front of it.
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
of_device_id structures need a NULL terminating entry, add it.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
LOOP_CLR_FD takes lo->lo_ctl_mutex and tries to remove the loop sysfs
files. Sysfs calls show() and waits for lo->lo_ctl_mutex. LOOP_CLR_FD
waits for show() to finish to remove the sysfs file.
cat /sys/class/block/loop0/loop/backing_file
mutex_lock_nested+0x176/0x350
? loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
? loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
loop_attr_do_show_backing_file+0x2f/0xd0 [loop]
dev_attr_show+0x1b/0x60
? sysfs_read_file+0x86/0x1a0
? __get_free_pages+0x12/0x50
sysfs_read_file+0xaf/0x1a0
ioctl(LOOP_CLR_FD):
wait_for_common+0x12c/0x180
? try_to_wake_up+0x2a0/0x2a0
wait_for_completion+0x18/0x20
sysfs_deactivate+0x178/0x180
? sysfs_addrm_finish+0x43/0x70
? sysfs_addrm_start+0x1d/0x20
sysfs_addrm_finish+0x43/0x70
sysfs_hash_and_remove+0x85/0xa0
sysfs_remove_group+0x59/0x100
loop_clr_fd+0x1dc/0x3f0 [loop]
lo_ioctl+0x223/0x7a0 [loop]
Instead of taking the lo_ctl_mutex from sysfs code, take the inner
lo->lo_lock, to protect the access to the backing_file data.
Thanks to Tejun for help debugging and finding a solution.
Cc: Milan Broz <mbroz@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Instead of unconditionally creating a fixed number of dead loop
devices which need to be investigated by storage handling services,
even when they are never used, we allow distros start with 0
loop devices and have losetup(8) and similar switch to the dynamic
/dev/loop-control interface instead of searching /dev/loop%i for free
devices.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Loop devices today have a fixed pre-allocated number of usually 8.
The number can only be changed at module init time. To find a free
device to use, /dev/loop%i needs to be scanned, and all devices need
to be opened until a free one is possibly found.
This adds a new /dev/loop-control device node, that allows to
dynamically find or allocate a free device, and to add and remove loop
devices from the running system:
LOOP_CTL_ADD adds a specific device. Arg is the number
of the device. It returns the device i or a negative
error code.
LOOP_CTL_REMOVE removes a specific device, Arg is the
number the device. It returns the device i or a negative
error code.
LOOP_CTL_GET_FREE finds the next unbound device or allocates
a new one. No arg is given. It returns the device i or a
negative error code.
The loop kernel module gets automatically loaded when
/dev/loop-control is accessed the first time. The alias
specified in the module, instructs udev to create this
'dead' device node, even when the module is not loaded.
Example:
cfd = open("/dev/loop-control", O_RDWR);
# add a new specific loop device
err = ioctl(cfd, LOOP_CTL_ADD, devnr);
# remove a specific loop device
err = ioctl(cfd, LOOP_CTL_REMOVE, devnr);
# find or allocate a free loop device to use
devnr = ioctl(cfd, LOOP_CTL_GET_FREE);
sprintf(loopname, "/dev/loop%i", devnr);
ffd = open("backing-file", O_RDWR);
lfd = open(loopname, O_RDWR);
err = ioctl(lfd, LOOP_SET_FD, ffd);
Cc: Tejun Heo <tj@kernel.org>
Cc: Karel Zak <kzak@redhat.com>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Replace the linked list, that keeps track of allocated devices, with an
idr index to allow a more efficient lookup of devices.
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (23 commits)
ceph: document unlocked d_parent accesses
ceph: explicitly reference rename old_dentry parent dir in request
ceph: document locking for ceph_set_dentry_offset
ceph: avoid d_parent in ceph_dentry_hash; fix ceph_encode_fh() hashing bug
ceph: protect d_parent access in ceph_d_revalidate
ceph: protect access to d_parent
ceph: handle racing calls to ceph_init_dentry
ceph: set dir complete frag after adding capability
rbd: set blk_queue request sizes to object size
ceph: set up readahead size when rsize is not passed
rbd: cancel watch request when releasing the device
ceph: ignore lease mask
ceph: fix ceph_lookup_open intent usage
ceph: only link open operations to directory unsafe list if O_CREAT|O_TRUNC
ceph: fix bad parent_inode calc in ceph_lookup_open
ceph: avoid carrying Fw cap during write into page cache
libceph: don't time out osd requests that haven't been received
ceph: report f_bfree based on kb_avail rather than diffing.
ceph: only queue capsnap if caps are dirty
ceph: fix snap writeback when racing with writes
...
We were missing this cleanup, so when a device was released
the osd didn't clean up its watchers list, so following notifications
could be slow as osd needed to timeout on the client.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
* 'for-3.1/drivers' of git://git.kernel.dk/linux-block:
cciss: do not attempt to read from a write-only register
xen/blkback: Add module alias for autoloading
xen/blkback: Don't let in-flight requests defer pending ones.
bsg: fix address space warning from sparse
bsg: remove unnecessary conditional expressions
bsg: fix bsg_poll() to return POLLOUT properly