linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-29 22:31:32 +00:00

Author	SHA1	Message	Date
Christoph Hellwig	651bbba57a	bcache: remove the extra cflags for request.o There is no block directory this file needs includes from. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-13 15:42:50 -07:00
Coly Li	9fcc34b1a6	bcache: at least try to shrink 1 node in bch_mca_scan() In bch_mca_scan(), the number of shrinking btree node is calculated by code like this, unsigned long nr = sc->nr_to_scan; nr /= c->btree_pages; nr = min_t(unsigned long, nr, mca_can_free(c)); variable sc->nr_to_scan is number of objects (here is bcache B+tree nodes' number) to shrink, and pointer variable sc is sent from memory management code as parametr of a callback. If sc->nr_to_scan is smaller than c->btree_pages, after the above calculation, variable 'nr' will be 0 and nothing will be shrunk. It is frequeently observed that only 1 or 2 is set to sc->nr_to_scan and make nr to be zero. Then bch_mca_scan() will do nothing more then acquiring and releasing mutex c->bucket_lock. This patch checkes whether nr is 0 after the above calculation, if 0 is the result then set 1 to variable 'n'. Then at least bch_mca_scan() will try to shrink a single B+tree node. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-13 15:42:50 -07:00
Coly Li	c5fcdedcee	bcache: add idle_max_writeback_rate sysfs interface For writeback mode, if there is no regular I/O request for a while, the writeback rate will be set to the maximum value (1TB/s for now). This is good for most of the storage workload, but there are still people don't what the maximum writeback rate in I/O idle time. This patch adds a sysfs interface file idle_max_writeback_rate to permit people to disable maximum writeback rate. Then the minimum writeback rate can be advised by writeback_rate_minimum in the bcache device's sysfs interface. Reported-by: Christian Balzer <chibi@gol.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-13 15:42:50 -07:00
Coly Li	5dccefd3ea	bcache: add code comments in bch_btree_leaf_dirty() This patch adds code comments in bch_btree_leaf_dirty() to explain why w->journal should always reference the eldest journal pin of all the writing bkeys in the btree node. To make the bcache journal code to be easier to be understood. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-13 15:42:50 -07:00
Andrea Righi	84c529aea1	bcache: fix deadlock in bcache_allocator bcache_allocator can call the following: bch_allocator_thread() -> bch_prio_write() -> bch_bucket_alloc() -> wait on &ca->set->bucket_wait But the wake up event on bucket_wait is supposed to come from bch_allocator_thread() itself => deadlock: [ 1158.490744] INFO: task bcache_allocato:15861 blocked for more than 10 seconds. [ 1158.495929] Not tainted 5.3.0-050300rc3-generic #201908042232 [ 1158.500653] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1158.504413] bcache_allocato D 0 15861 2 0x80004000 [ 1158.504419] Call Trace: [ 1158.504429] __schedule+0x2a8/0x670 [ 1158.504432] schedule+0x2d/0x90 [ 1158.504448] bch_bucket_alloc+0xe5/0x370 [bcache] [ 1158.504453] ? wait_woken+0x80/0x80 [ 1158.504466] bch_prio_write+0x1dc/0x390 [bcache] [ 1158.504476] bch_allocator_thread+0x233/0x490 [bcache] [ 1158.504491] kthread+0x121/0x140 [ 1158.504503] ? invalidate_buckets+0x890/0x890 [bcache] [ 1158.504506] ? kthread_park+0xb0/0xb0 [ 1158.504510] ret_from_fork+0x35/0x40 Fix by making the call to bch_prio_write() non-blocking, so that bch_allocator_thread() never waits on itself. Moreover, make sure to wake up the garbage collector thread when bch_prio_write() is failing to allocate buckets. BugLink: https://bugs.launchpad.net/bugs/1784665 BugLink: https://bugs.launchpad.net/bugs/1796292 Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-13 15:42:50 -07:00
Coly Li	06c1526da9	bcache: add code comment bch_keylist_pop() and bch_keylist_pop_front() This patch adds simple code comments for bch_keylist_pop() and bch_keylist_pop_front() in bset.c, to make the code more easier to be understand. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-13 15:42:50 -07:00
Coly Li	41fa4deef9	bcache: deleted code comments for dead code in bch_data_insert_keys() In request.c:bch_data_insert_keys(), there is code comment for a piece of dead code. This patch deletes the dead code and its code comment since they are useless in practice. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-13 15:42:50 -07:00
Coly Li	aaf8dbeab5	bcache: add more accurate error messages in read_super() Previous code only returns "Not a bcache superblock" for both bcache super block offset and magic error. This patch addss more accurate error messages, - for super block unmatched offset: "Not a bcache superblock (bad offset)" - for super block unmatched magic number: "Not a bcache superblock (bad magic)" Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-13 15:42:50 -07:00
Coly Li	2d8869518a	bcache: fix static checker warning in bcache_device_free() Commit `cafe563591` ("bcache: A block layer cache") leads to the following static checker warning: ./drivers/md/bcache/super.c:770 bcache_device_free() warn: variable dereferenced before check 'd->disk' (see line 766) drivers/md/bcache/super.c 762 static void bcache_device_free(struct bcache_device *d) 763 { 764 lockdep_assert_held(&bch_register_lock); 765 766 pr_info("%s stopped", d->disk->disk_name); ^^^^^^^^^ Unchecked dereference. 767 768 if (d->c) 769 bcache_device_detach(d); 770 if (d->disk && d->disk->flags & GENHD_FL_UP) ^^^^^^^ Check too late. 771 del_gendisk(d->disk); 772 if (d->disk && d->disk->queue) 773 blk_cleanup_queue(d->disk->queue); 774 if (d->disk) { 775 ida_simple_remove(&bcache_device_idx, 776 first_minor_to_idx(d->disk->first_minor)); 777 put_disk(d->disk); 778 } 779 It is not 100% sure that the gendisk struct of bcache device will always be there, the warning makes sense when there is problem in block core. This patch tries to remove the static checking warning by checking d->disk to avoid NULL pointer deferences. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-13 15:42:50 -07:00
Guoju Fang	34cf78bf34	bcache: fix a lost wake-up problem caused by mca_cannibalize_lock This patch fix a lost wake-up problem caused by the race between mca_cannibalize_lock and bch_cannibalize_unlock. Consider two processes, A and B. Process A is executing mca_cannibalize_lock, while process B takes c->btree_cache_alloc_lock and is executing bch_cannibalize_unlock. The problem happens that after process A executes cmpxchg and will execute prepare_to_wait. In this timeslice process B executes wake_up, but after that process A executes prepare_to_wait and set the state to TASK_INTERRUPTIBLE. Then process A goes to sleep but no one will wake up it. This problem may cause bcache device to dead. Signed-off-by: Guoju Fang <fangguoju@gmail.com> Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-13 15:42:50 -07:00
Coly Li	c0e0954e90	bcache: fix fifo index swapping condition in journal_pin_cmp() Fifo structure journal.pin is implemented by a cycle buffer, if the back index reaches highest location of the cycle buffer, it will be swapped to 0. Once the swapping happens, it means a smaller fifo index might be associated to a newer journal entry. So the btree node with oldest journal entry won't be selected in bch_btree_leaf_dirty() to reference the dirty B+tree leaf node. This problem may cause bcache journal won't protect unflushed oldest B+tree dirty leaf node in power failure, and this B+tree leaf node is possible to beinconsistent after reboot from power failure. This patch fixes the fifo index comparing logic in journal_pin_cmp(), to avoid potential corrupted B+tree leaf node when the back index of journal pin is swapped. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-13 15:42:49 -07:00
Jens Axboe	e2a7b9f4a1	Merge branch 'md-next' of git://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.5/drivers Pull MD changes from Song. * 'md-next' of git://git.kernel.org/pub/scm/linux/kernel/git/song/md: md/raid10: prevent access of uninitialized resync_pages offset md: avoid invalid memory access for array sb->dev_roles md/raid1: avoid soft lockup under high load	2019-11-13 15:41:21 -07:00
John Pittman	45422b704d	md/raid10: prevent access of uninitialized resync_pages offset Due to unneeded multiplication in the out_free_pages portion of r10buf_pool_alloc(), when using a 3-copy raid10 layout, it is possible to access a resync_pages offset that has not been initialized. This access translates into a crash of the system within resync_free_pages() while passing a bad pointer to put_page(). Remove the multiplication, preventing access to the uninitialized area. Fixes: `f025061836` ("md: raid10: don't use bio's vec table to manage resync pages") Cc: stable@vger.kernel.org # 4.12+ Signed-off-by: John Pittman <jpittman@redhat.com> Suggested-by: David Jeffery <djeffery@redhat.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com>	2019-11-11 16:47:39 -08:00
Yufen Yu	228fc7d76d	md: avoid invalid memory access for array sb->dev_roles we need to gurantee 'desc_nr' valid before access array of sb->dev_roles. In addition, we should avoid .load_super always return '0' when level is LEVEL_MULTIPATH, which is not expected. Reported-by: coverity-bot <keescook+coverity-bot@chromium.org> Addresses-Coverity-ID: 1487373 ("Memory - illegal accesses") Fixes: `6a5cb53aaa` ("md: no longer compare spare disk superblock events in super_load") Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Song Liu <songliubraving@fb.com>	2019-11-11 16:32:22 -08:00
Hannes Reinecke	5fa4f8bac9	md/raid1: avoid soft lockup under high load As all I/O is being pushed through a kernel thread the softlockup watchdog might be triggered under high load. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Song Liu <songliubraving@fb.com>	2019-11-11 16:32:22 -08:00
Ajay Joshi	da644b2cc1	null_blk: add zone open, close, and finish support Implement REQ_OP_ZONE_OPEN, REQ_OP_ZONE_CLOSE and REQ_OP_ZONE_FINISH support to allow explicit control of zone states. Contains contributions from Matias Bjorling, Hans Holmberg, Keith Busch and Damien Le Moal. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com> Signed-off-by: Matias Bjorling <matias.bjorling@wdc.com> Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-07 06:37:40 -07:00
Ajay Joshi	2e2d6f7e44	dm: add zone open, close and finish support Implement REQ_OP_ZONE_OPEN, REQ_OP_ZONE_CLOSE and REQ_OP_ZONE_FINISH support to allow explicit control of zone states. Contains contributions from Matias Bjorling, Hans Holmberg and Damien Le Moal. Acked-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com> Signed-off-by: Matias Bjorling <matias.bjorling@wdc.com> Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-07 06:36:08 -07:00
Jens Axboe	439b84fa17	Merge branch 'for-5.5/block' into for-5.5/drivers Pull in dependencies for the new zoned open/close/finish support. * for-5.5/block: (32 commits) block: add zone open, close and finish ioctl support block: add zone open, close and finish operations block: Simplify REQ_OP_ZONE_RESET_ALL handling block: Remove REQ_OP_ZONE_RESET plugging block: Warn if elevator= parameter is used block: avoid blk_bio_segment_split for small I/O operations blk-mq: make sure that line break can be printed block: sed-opal: Introduce Opal Datastore UID block: sed-opal: Add support to read/write opal tables generically block: sed-opal: Generalizing write data to any opal table bdev: Refresh bdev size for disks without partitioning bdev: Factor out bdev revalidation into a common helper blk-mq: avoid sysfs buffer overflow with too many CPU cores blk-mq: Make blk_mq_run_hw_queue() return void fcntl: fix typo in RWH_WRITE_LIFE_NOT_SET r/w hint name blk-mq: fill header with kernel-doc blk-mq: remove needless goto from blk_mq_get_driver_tag block: reorder bio::__bi_remaining for better packing block: Reduce the amount of memory used for tag sets block: Reduce the amount of memory required per request queue ...	2019-11-07 06:33:05 -07:00
Ajay Joshi	e876df1fe0	block: add zone open, close and finish ioctl support Introduce three new ioctl commands BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE to allow applications to control the condition of zones on a zoned block device through the execution of the REQ_OP_ZONE_OPEN, REQ_OP_ZONE_CLOSE and REQ_OP_ZONE_FINISH operations. Contains contributions from Matias Bjorling, Hans Holmberg, Dmitry Fomichev, Keith Busch, Damien Le Moal and Christoph Hellwig. Reviewed-by: Javier González <javier@javigon.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com> Signed-off-by: Matias Bjorling <matias.bjorling@wdc.com> Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-07 06:31:50 -07:00
Ajay Joshi	6c1b1da58f	block: add zone open, close and finish operations Zoned block devices (ZBC and ZAC devices) allow an explicit control over the condition (state) of zones. The operations allowed are: * Open a zone: Transition to open condition to indicate that a zone will actively be written * Close a zone: Transition to closed condition to release the drive resources used for writing to a zone * Finish a zone: Transition an open or closed zone to the full condition to prevent write operations To enable this control for in-kernel zoned block device users, define the new request operations REQ_OP_ZONE_OPEN, REQ_OP_ZONE_CLOSE and REQ_OP_ZONE_FINISH as well as the generic function blkdev_zone_mgmt() for submitting these operations on a range of zones. This results in blkdev_reset_zones() removal and replacement with this new zone magement function. Users of blkdev_reset_zones() (f2fs and dm-zoned) are updated accordingly. Contains contributions from Matias Bjorling, Hans Holmberg, Dmitry Fomichev, Keith Busch, Damien Le Moal and Christoph Hellwig. Reviewed-by: Javier González <javier@javigon.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com> Signed-off-by: Matias Bjorling <matias.bjorling@wdc.com> Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-07 06:31:48 -07:00
Damien Le Moal	c7a1d926dc	block: Simplify REQ_OP_ZONE_RESET_ALL handling There is no need for the function __blkdev_reset_all_zones() as REQ_OP_ZONE_RESET_ALL can be handled directly in blkdev_reset_zones() bio loop with an early break from the loop. This patch removes this function and modifies blkdev_reset_zones(), simplifying the code. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-07 06:30:18 -07:00
Damien Le Moal	a84324d2ed	block: Remove REQ_OP_ZONE_RESET plugging REQ_OP_ZONE_RESET operations cannot be merged as these bios and requests do not have a size and are never sequential due to the zone start sector position required for their execution. As a result, there is no point in using a plug around blkdev_reset_zones() bio issuing loop. This patch removes this unnecessary plugging. Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Javier González <javier@javigon.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-07 06:30:16 -07:00
Jan Kara	f8db383507	block: Warn if elevator= parameter is used With transition to blk-mq, the elevator= kernel argument was removed as it makes less and less sense with the current variety of devices. Since this may surprise some users and there are advices on the Internet that still suggest to use it, let's at least warn if the parameter is used. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-06 07:16:07 -07:00
Christoph Hellwig	fa53228721	block: avoid blk_bio_segment_split for small I/O operations __blk_queue_split() adds significant overhead for small I/O operations. Add a shortcut to avoid it for cases where we know we never need to split. Based on a patch from Ming Lei. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 17:13:54 -07:00
Prabhath Sajeepa	64fab7290d	nvme: Fix parsing of ANA log page Check validity of offset into ANA log buffer before accessing nvme_ana_group_desc. This check ensures the size of ANA log buffer >= offset + sizeof(nvme_ana_group_desc) Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Prabhath Sajeepa <psajeepa@purestorage.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:42 -07:00
Christoph Hellwig	716fd9c119	nvmet: stop using bio_set_op_attrs bio_set_op_attrs has been long deprecated, replace it with a direct assignment of the flags to bio->bi_opf. Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:42 -07:00
Christoph Hellwig	9dea0c81ee	nvmet: add plugging for read/write when ns is bdev With reference to the following issue reported on the mailing list :- http://lists.infradead.org/pipermail/linux-nvme/2019-October/027604.html this patch adds plugging for the bdev-ns under nvmet_bdev_execute_rw(). We can see the following performance improvement in random write workload I/Os with the setup described in the link when device_path configured as /dev/md0. Without this patch :- write: IOPS=40.8k, BW=159MiB/s (167MB/s)(4777MiB/30002msec) write: IOPS=41.2k, BW=161MiB/s (169MB/s)(4831MiB/30011msec) slat (usec): min=8, max=10823, avg=15.64, stdev=16.85 slat (usec): min=8, max=401, avg=15.40, stdev= 9.56 clat (usec): min=54, max=2492, avg=759.07, stdev=172.62 clat (usec): min=56, max=1997, avg=768.06, stdev=178.72 With this patch :- write: IOPS=123k, BW=480MiB/s (504MB/s)(14.1GiB/30011msec) write: IOPS=123k, BW=481MiB/s (504MB/s)(14.1GiB/30002msec) slat (usec): min=8, max=9941, avg=13.31, stdev= 8.04 slat (usec): min=8, max=289, avg=13.31, stdev= 3.37 clat (usec): min=43, max=17635, avg=245.46, stdev=171.23 clat (usec): min=44, max=17751, avg=245.25, stdev=183.14 Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:42 -07:00
Christoph Hellwig	d84dd8cde6	nvmet: clean up command parsing a bit Move the special cases for fabrics commands and the discovery controller to nvmet_parse_admin_cmd in preparation for adding passthrough support. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:42 -07:00
Geert Uytterhoeven	05d3046ff7	nvme-pci: Spelling s/resdicovered/rediscovered/ Fix misspelling of "rediscovered". Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:42 -07:00
Sagi Grimberg	d4b3a17411	nvmet: fill discovery controller sn, fr and mn correctly Discovery controllers need this information as well. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:42 -07:00
Christoph Hellwig	be3f3114dd	nvmet: Open code nvmet_req_execute() Now that nvmet_req_execute does nothing, open code it. Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> [split patch, update changelog] Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:42 -07:00
Christoph Hellwig	e9061c3978	nvmet: Remove the data_len field from the nvmet_req struct Instead of storing the expected length and checking it when it's executed, just check the length inside the command themselves. A new helper, nvmet_check_data_len() is created to help with this check. Signed-off-by: Christoph Hellwig <hch@lst.de> [split patch, udpate changelog] Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:42 -07:00
Christoph Hellwig	59ef0eaa77	nvmet: Introduce nvmet_dsm_len() helper Similar to the nvmet_rw_len helper. Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> [split patch, update changelog] Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:42 -07:00
Christoph Hellwig	6f86f2c9d9	nvmet: Cleanup discovery execute handlers Push the lid and cns check into their respective handlers and, while we're at it, rename the functions to be consistent with other discovery handlers. Signed-off-by: Christoph Hellwig <hch@lst.de> [split patch, update changelog] Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:41 -07:00
Christoph Hellwig	2cb6963a16	nvmet: Introduce common execute function for get_log_page and identify Instead of picking the sub-command handler to execute in a nested switch statement introduce a landing functions that calls out to the appropriate sub-command handler. This will allow us to have a common place in the handler to check the transfer length in a future patch. Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> [split patch, update change log] Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:41 -07:00
Logan Gunthorpe	c73eebc07a	nvmet-tcp: Don't set the request's data_len It's not apprporiate for the transports to set the data_len field of the request which is only used by the core. In this case, just use a variable on the stack to store the length of the sgl for comparison. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:41 -07:00
Logan Gunthorpe	e0bace7177	nvmet-tcp: Don't check data_len in nvmet_tcp_map_data() None of the other transports check data_len which is verified in core code. The function should instead check that the sgl length is non-zero. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:41 -07:00
Damien Le Moal	e08f2ae850	nvme: Introduce nvme_lba_to_sect() Introduce the new helper function nvme_lba_to_sect() to convert a device logical block number to a 512B sector number. Use this new helper in obvious places, cleaning up the code. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:41 -07:00
Damien Le Moal	314d48dd22	nvme: Cleanup and rename nvme_block_nr() Rename nvme_block_nr() to nvme_sect_to_lba() and use SECTOR_SHIFT instead of its hard coded value 9. Also add a comment to decribe this helper. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:41 -07:00
Revanth Rajashekar	48c9e85b23	nvme: resync include/linux/nvme.h with nvmecli Update enumerations and structures in include/linux/nvme.h to resync with the nvmecli. All the updates are mentioned in the ratified NVMe 1.4 spec https://nvmexpress.org/wp-content/uploads/NVM-Express-1_4-2019.06.10-Ratified.pdf Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Revanth Rajashekar <revanth.rajashekar@intel.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:41 -07:00
Max Gurtovoy	16686f3a6c	nvme: move common call to nvme_cleanup_cmd to core layer nvme_cleanup_cmd should be called for each call to nvme_setup_cmd (symmetrical functions). Move the call for nvme_cleanup_cmd to the common core layer and call it during nvme_complete_rq for the good flow. For error flow, each transport will call nvme_cleanup_cmd independently. Also take care of a special case of path failure, where we call nvme_complete_rq without doing nvme_setup_cmd. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:41 -07:00
Max Gurtovoy	2dc3947b53	nvme: introduce "Command Aborted By host" status code Fix the status code of canceled requests initiated by the host according to TP4028 (Status Code 0x371): "Command Aborted By host: The command was aborted as a result of host action (e.g., the host disconnected the Fabric connection)." Also in a multipath environment, unless otherwise specified, errors of this type (path related) should be retried using a different path, if one is available. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:41 -07:00
Israel Rukshin	59534b9d60	nvmet-rdma: add unlikely check at nvmet_rdma_map_sgl_keyed The calls to nvmet_req_alloc_sgl and rdma_rw_ctx_init should usually succeed, so add this simple optimization to the fast path. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:41 -07:00
Israel Rukshin	e522f44602	nvmet: add unlikely check at nvmet_req_alloc_sgl The call to sgl_alloc shouldn't fail so add this simple optimization to the fast path. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:40 -07:00
Israel Rukshin	4d764bb9a9	nvmet: use bio_io_error instead of duplicating it This commit doesn't change any logic. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:40 -07:00
Israel Rukshin	58a8df67e0	nvme: introduce nvme_is_aen_req function This function improves code readability and reduces code duplication. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:40 -07:00
James Smart	bcde5f0fc7	nvme-fc: ensure association_id is cleared regardless of a Disconnect LS Code today only clears the association_id if a Disconnect LS is transmit. Remove ambiguity and unconditionally clear the association_id if the association has been terminated. Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:40 -07:00
James Smart	7db394848e	nvme-fc: clarify error messages Change wording on a couple of messages to clarify what happened. Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:40 -07:00
James Smart	44fbf3bb1a	nvme-fc: Set new cmd set indicator in nvme-fc cmnd iu Set the new category field in the FC-NVME CMND_IU based on queue number. Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:40 -07:00
James Smart	53b2b2f599	nvme-fc and nvmet-fc: sync with FC-NVME-2 header changes Sync sources with revised structure and field names to correspond with FC-NVME-2 header sync-up. Tested interoperability with success: - prior initiator with new target - prior target with new initiator - new on new Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-11-04 10:56:40 -07:00

1 2 3 4 5 ...

871328 Commits