When Block Device Layer is disabled, BLK_MAX_SEGMENT_SIZE is undefined.
The rtrs is a transport library and should compile independently of the
block layer. The desired max segment size should be passed down by the
user.
Introduce max_segment_size parameter for the rtrs_clt_open() call.
Fixes: f7a7a5c228 ("block/rnbd: client: main functionality")
Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Fixes: cb80329c94 ("RDMA/rtrs: client: private header with client structs and functions")
Fixes: b5c27cdb09 ("RDMA/rtrs: public interface header to establish RDMA connections")
Link: https://lore.kernel.org/r/20200519111419.924170-1-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Support REQ_OP_ZONE_APPEND requests for null_blk devices with zoned
mode enabled. Use the internally tracked zone write pointer position
as the actual write position and return it using the command request
__sector field in the case of an mq device and using the command BIO
sector in the case of a BIO device.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Modify the interface of blk_revalidate_disk_zones() to add an optional
driver callback function that a driver can use to extend processing
done during zone revalidation. The callback, if defined, is executed
with the device request queue frozen, after all zones have been
inspected.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
UBSAN: array-index-out-of-bounds in drivers/block/floppy.c:1521:45
index 16 is out of range for type 'unsigned char [16]'
Call Trace:
...
setup_rw_floppy+0x5c3/0x7f0
floppy_ready+0x2be/0x13b0
process_one_work+0x2c1/0x5d0
worker_thread+0x56/0x5e0
kthread+0x122/0x170
ret_from_fork+0x35/0x40
From include/uapi/linux/fd.h:
struct floppy_raw_cmd {
...
unsigned char cmd_count;
unsigned char cmd[16];
unsigned char reply_count;
unsigned char reply[16];
...
}
This out-of-bounds access is intentional. The command in struct
floppy_raw_cmd may take up the space initially intended for the reply
and the reply count. It is needed for long 82078 commands such as
RESTORE, which takes 17 command bytes. Initial cmd size is not enough
and since struct setup_rw_floppy is a part of uapi we check that
cmd_count is in [0:16+1+16] in raw_cmd_copyin().
The patch adds union with original cmd,reply_count,reply fields and
fullcmd field of equivalent size. The cmd accesses are turned to
fullcmd where appropriate to suppress UBSAN warning.
Link: https://lore.kernel.org/r/20200501134416.72248-5-efremov@linux.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Denis Efremov <efremov@linux.com>
Use FD_RAW_CMD_SIZE, FD_RAW_REPLY_SIZE defines instead of magic numbers
for cmd & reply buffers of struct floppy_raw_cmd. Remove local to
floppy.c MAX_REPLIES define, as it is now FD_RAW_REPLY_SIZE.
FD_RAW_CMD_FULLSIZE added as we allow command to also fill reply_count
and reply fields.
Link: https://lore.kernel.org/r/20200501134416.72248-4-efremov@linux.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Denis Efremov <efremov@linux.com>
When called with a negative drive value, set_fdc() would stick to the
current fdc (which was assumed to reflect the current_drive's FDC). We
do not need this anymore as the last call place with a negative value
was just addressed. Let's make this function always set both current_fdc
and current_drive so that there's no more ambiguity. A few comments
stating this were added to a few non-obvious places.
Link: https://lore.kernel.org/r/20200410101904.14652-3-w@1wt.eu
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Denis Efremov <efremov@linux.com>
This macro equals -1 and is used as an alternative for current_drive when
calling reschedule_timeout(), which in turn needs to remap it. This only
adds obfuscation, let's simply use current_drive.
Link: https://lore.kernel.org/r/20200410101904.14652-2-w@1wt.eu
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Denis Efremov <efremov@linux.com>
In floppy_resume() we don't properly reinitialize all FDCs, instead
we reinitialize the current FDC once per available FDC because value
-1 is passed to user_reset_fdc(). Let's simply save the current drive
and properly reinitialize each FDC.
Link: https://lore.kernel.org/r/20200410101904.14652-1-w@1wt.eu
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Denis Efremov <efremov@linux.com>
There's no need to iterate on current_fdc in do_floppy_init() anymore,
in the first case it's only used as an array index to access fdc_state[],
so let's get rid of this confusing assignment. The second case is a bit
trickier because user_reset_fdc() needs to already know current_fdc when
called with drive==-1 due to this call chain:
user_reset_fdc()
lock_fdc()
set_fdc()
drive<0 ==> new_fdc = current_fdc
Note that current_drive is not used in this code part and may even not
match a unit belonging to current_fdc. Instead of passing -1 we can
simply pass the first drive of the FDC being initialized, which is even
cleaner as it will allow the function chain above to consistently assign
both variables.
Link: https://lore.kernel.org/r/20200410093023.14499-1-w@1wt.eu
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Denis Efremov <efremov@linux.com>
The locking in the driver is far from being obvious, with unlocking
automatically happening at end of operations scheduled by interrupt,
especially for the error paths where one does not necessarily expect
that such an interrupt may be triggered. Let's add a few comments
about what to expect at certain places to avoid misdetecting bugs
which are not.
Link: https://lore.kernel.org/r/20200331094054.24441-24-w@1wt.eu
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Denis Efremov <efremov@linux.com>
Both floppy_grab_irq_and_dma() and floppy_release_irq_and_dma() used to
iterate on the global variable while setting up or freeing resources.
Now that they exclusively rely on functions which take the fdc as an
argument, so let's not touch the global one anymore.
Link: https://lore.kernel.org/r/20200331094054.24441-23-w@1wt.eu
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Denis Efremov <efremov@linux.com>
Now the fdc is passed in argument so that the function does not
use current_fdc anymore.
It's worth noting that there's still a single raw_cmd pointer
specific to the current fdc. It may make sense to have one per
fdc in the future. In addition, cont->done() still relies on the
current drive and current raw_cmd.
Link: https://lore.kernel.org/r/20200331094054.24441-17-w@1wt.eu
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Denis Efremov <efremov@linux.com>
Now the fdc is passed in argument so that the function does not
use current_fdc anymore.
It's worth noting that there's still a single reply_buffer[] which
will store the result for the current fdc. It may or may not make
sense to implement one buffer per fdc in the future.
Link: https://lore.kernel.org/r/20200331094054.24441-15-w@1wt.eu
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Denis Efremov <efremov@linux.com>
Currently we have architecture-specific fd_inb() and fd_outb() functions
or macros, taking just a port which is in fact made of a base address and
a register. The base address is FDC-specific and derived from the local or
global "fdc" variable through the FD_IOPORT macro used in the base address
calculation.
This change splits this by explicitly passing the FDC's base address and
the register separately to fd_outb() and fd_inb(). It affects the
following archs:
- x86, alpha, mips, powerpc, parisc, arm, m68k:
simple remap of port -> base+reg
- sparc32: use of reg only, since the base address was already masked
out and the FDC controller is known from a static struct.
- sparc64: like x86 for PCI, like sparc32 for 82077
Some archs use inline functions and others macros. This was not
unified in order to minimize the number of changes to review. For the
same reason checkpatch still spews a few warnings about things that
were already there before.
The parisc still uses hard-coded register values and could be cleaned up
by taking the register definitions.
The sparc per-controller inb/outb functions could further be refined
to explicitly take an FDC register instead of a port in argument but it
was not needed yet and may be cleaned later.
Link: https://lore.kernel.org/r/20200331094054.24441-2-w@1wt.eu
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Ian Molton <spyro@f2s.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: x86@kernel.org
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Denis Efremov <efremov@linux.com>
Pull in block-5.7 fixes for 5.8. Mostly to resolve a conflict with
the blk-iocost changes, but we also need the base of the bdi
use-after-free as well as we build on top of it.
* block-5.7:
nvme: fix possible hang when ns scanning fails during error recovery
nvme-pci: fix "slimmer CQ head update"
bdi: add a ->dev_name field to struct backing_dev_info
bdi: use bdi_dev_name() to get device name
bdi: move bdi_dev_name out of line
vboxsf: don't use the source name in the bdi name
iocost: protect iocg->abs_vdebt with iocg->waitq.lock
block: remove the bd_openers checks in blk_drop_partitions
nvme: prevent double free in nvme_alloc_ns() error handling
null_blk: Cleanup zoned device initialization
null_blk: Fix zoned command handling
block: remove unused header
blk-iocost: Fix error on iocost_ioc_vrate_adj
bdev: Reduce time holding bd_mutex in sync in blkdev_close()
buffer: remove useless comment and WB_REASON_FREE_MORE_MEM, reason.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add a pointer to the CDROM information structure to struct gendisk.
This will allow various removable media file systems to call directly
into the CDROM layer instead of abusing ioctls with kernel pointers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Remove the check because DAX now has it's own read/write methods and
file systems which support DAX check IS_DAX() prior to IOCB_DIRECT on
their own. Therefore, it does not matter if the file state is DAX when
the iocb flags are created.
Also remove io_is_direct() as it is just a simple flag check.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
A userspace process holding a file descriptor to a virtio_blk device can
still invoke block_device_operations after hot unplug. This leads to a
use-after-free accessing vblk->vdev in virtblk_getgeo() when
ioctl(HDIO_GETGEO) is invoked:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
IP: [<ffffffffc00e5450>] virtio_check_driver_offered_feature+0x10/0x90 [virtio]
PGD 800000003a92f067 PUD 3a930067 PMD 0
Oops: 0000 [#1] SMP
CPU: 0 PID: 1310 Comm: hdio-getgeo Tainted: G OE ------------ 3.10.0-1062.el7.x86_64 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
task: ffff9be5fbfb8000 ti: ffff9be5fa890000 task.ti: ffff9be5fa890000
RIP: 0010:[<ffffffffc00e5450>] [<ffffffffc00e5450>] virtio_check_driver_offered_feature+0x10/0x90 [virtio]
RSP: 0018:ffff9be5fa893dc8 EFLAGS: 00010246
RAX: ffff9be5fc3f3400 RBX: ffff9be5fa893e30 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff9be5fbc10b40
RBP: ffff9be5fa893dc8 R08: 0000000000000301 R09: 0000000000000301
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9be5fdc24680
R13: ffff9be5fbc10b40 R14: ffff9be5fbc10480 R15: 0000000000000000
FS: 00007f1bfb968740(0000) GS:ffff9be5ffc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000090 CR3: 000000003a894000 CR4: 0000000000360ff0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
[<ffffffffc016ac37>] virtblk_getgeo+0x47/0x110 [virtio_blk]
[<ffffffff8d3f200d>] ? handle_mm_fault+0x39d/0x9b0
[<ffffffff8d561265>] blkdev_ioctl+0x1f5/0xa20
[<ffffffff8d488771>] block_ioctl+0x41/0x50
[<ffffffff8d45d9e0>] do_vfs_ioctl+0x3a0/0x5a0
[<ffffffff8d45dc81>] SyS_ioctl+0xa1/0xc0
A related problem is that virtblk_remove() leaks the vd_index_ida index
when something still holds a reference to vblk->disk during hot unplug.
This causes virtio-blk device names to be lost (vda, vdb, etc).
Fix these issues by protecting vblk->vdev with a mutex and reference
counting vblk so the vd_index_ida index can be removed in all cases.
Fixes: 48e4043d45 ("virtio: add virtio disk geometry feature")
Reported-by: Lance Digby <ldigby@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200430140442.171016-1-stefanha@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Pull block fixes from Jens Axboe:
"A few fixes/changes that should go into this release:
- null_blk zoned fixes (Damien)
- blkdev_close() sync improvement (Douglas)
- Fix regression in blk-iocost that impacted (at least) systemtap
(Waiman)
- Comment fix, header removal (Zhiqiang, Jianpeng)"
* tag 'block-5.7-2020-04-24' of git://git.kernel.dk/linux-block:
null_blk: Cleanup zoned device initialization
null_blk: Fix zoned command handling
block: remove unused header
blk-iocost: Fix error on iocost_ioc_vrate_adj
bdev: Reduce time holding bd_mutex in sync in blkdev_close()
buffer: remove useless comment and WB_REASON_FREE_MORE_MEM, reason.
Move all zoned mode related code from null_blk_main.c to
null_blk_zoned.c, avoiding an ugly #ifdef in the process.
Rename null_zone_init() into null_init_zoned_dev(), null_zone_exit()
into null_free_zoned_dev() and add the new function
null_register_zoned_dev() to finalize the zoned dev setup before
add_disk().
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
For write operations issued to a null_blk device with zoned mode
enabled, the state and write pointer position of the zone targeted by
the command should be checked before badblocks and memory backing
are handled as the write may be first failed due to, for instance, a
sector position not aligned with the zone write pointer. This order of
checking for errors reflects more accuratly the behavior of physical
zoned devices.
Furthermore, the write pointer position of the target zone should be
incremented only and only if no errors are reported by badblocks and
memory backing handling.
To fix this, introduce the small helper function null_process_cmd()
which execute null_handle_badblocks() and null_handle_memory_backed()
and use this function in null_zone_write() to correctly handle write
requests to zoned null devices depending on the type and state of the
write target zone. Also call this function in null_handle_zoned() to
process read requests to zoned null devices.
null_process_cmd() is called directly from null_handle_cmd() for
regular null devices, resulting in no functional change for these type
of devices. To have symmetric names, the function null_handle_zoned()
is renamed to null_process_zoned_cmd().
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull virtio fixes and cleanups from Michael Tsirkin:
- Some bug fixes
- Cleanup a couple of issues that surfaced meanwhile
- Disable vhost on ARM with OABI for now - to be fixed fully later in
the cycle or in the next release.
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (24 commits)
vhost: disable for OABI
virtio: drop vringh.h dependency
virtio_blk: add a missing include
virtio-balloon: Avoid using the word 'report' when referring to free page hinting
virtio-balloon: make virtballoon_free_page_report() static
vdpa: fix comment of vdpa_register_device()
vdpa: make vhost, virtio depend on menu
vdpa: allow a 32 bit vq alignment
drm/virtio: fix up for include file changes
remoteproc: pull in slab.h
rpmsg: pull in slab.h
virtio_input: pull in slab.h
remoteproc: pull in slab.h
virtio-rng: pull in slab.h
virtgpu: pull in uaccess.h
tools/virtio: make asm/barrier.h self contained
tools/virtio: define aligned attribute
virtio/test: fix up after IOTLB changes
vhost: Create accessors for virtqueues private_data
vdpasim: Return status in vdpasim_get_status
...