Commit Graph

855718 Commits

Author SHA1 Message Date
Jackie Liu
a982eeb09b io_uring: fix an issue when IOSQE_IO_LINK is inserted into defer list
This patch may fix two issues:

First, when IOSQE_IO_DRAIN set, the next IOs need to be inserted into
defer list to delay execution, but link io will be actively scheduled to
run by calling io_queue_sqe.

Second, when multiple LINK_IOs are inserted together with defer_list,
the LINK_IO is no longer keep order.

   |-------------|
   |   LINK_IO   |      ----> insert to defer_list  -----------
   |-------------|                                            |
   |   LINK_IO   |      ----> insert to defer_list  ----------|
   |-------------|                                            |
   |   LINK_IO   |      ----> insert to defer_list  ----------|
   |-------------|                                            |
   |   NORMAL_IO |      ----> insert to defer_list  ----------|
   |-------------|                                            |
                                                              |
                              queue_work at same time   <-----|

Fixes: 9e645e1105 ("io_uring: add support for sqe links")
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-15 11:21:39 -06:00
Jens Axboe
7b6620d7db block: remove REQ_NOWAIT_INLINE
We had a few issues with this code, and there's still a problem around
how we deal with error handling for chained/split bios. For now, just
revert the code and we'll try again with a thoroug solution. This
reverts commits:

e15c2ffa10 ("block: fix O_DIRECT error handling for bio fragments")
0eb6ddfb86 ("block: Fix __blkdev_direct_IO() for bio fragments")
6a43074e2f ("block: properly handle IOCB_NOWAIT for async O_DIRECT IO")
893a1c9720 ("blk-mq: allow REQ_NOWAIT to return an error inline")

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-15 11:09:16 -06:00
Aleix Roca Nonell
99c79f6692 io_uring: fix manual setup of iov_iter for fixed buffers
Commit bd11b3a391 ("io_uring: don't use iov_iter_advance() for fixed
buffers") introduced an optimization to avoid using the slow
iov_iter_advance by manually populating the iov_iter iterator in some
cases.

However, the computation of the iterator count field was erroneous: The
first bvec was always accounted for an extent of page size even if the
bvec length was smaller.

In consequence, some I/O operations on fixed buffers were unable to
operate on the full extent of the buffer, consistently skipping some
bytes at the end of it.

Fixes: bd11b3a391 ("io_uring: don't use iov_iter_advance() for fixed buffers")
Cc: stable@vger.kernel.org
Signed-off-by: Aleix Roca Nonell <aleix.rocanonell@bsc.es>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-15 11:03:38 -06:00
Wenwen Wang
ae78ca3cf3 xen/blkback: fix memory leaks
In read_per_ring_refs(), after 'req' and related memory regions are
allocated, xen_blkif_map() is invoked to map the shared frame, irq, and
etc. However, if this mapping process fails, no cleanup is performed,
leading to memory leaks. To fix this issue, invoke the cleanup before
returning the error.

Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-12 08:18:37 -06:00
zhengbin
e26cc08265 blk-mq: move cancel of requeue_work to the front of blk_exit_queue
blk_exit_queue will free elevator_data, while blk_mq_requeue_work
will access it. Move cancel of requeue_work to the front of
blk_exit_queue to avoid use-after-free.

blk_exit_queue                blk_mq_requeue_work
  __elevator_exit               blk_mq_run_hw_queues
    blk_mq_exit_sched             blk_mq_run_hw_queue
      dd_exit_queue                 blk_mq_hctx_has_pending
        kfree(elevator_data)          blk_mq_sched_has_work
                                        dd_has_work

Fixes: fbc2a15e34 ("blk-mq: move cancel of requeue_work into blk_mq_release")
Cc: stable@vger.kernel.org
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-12 08:14:11 -06:00
Jens Axboe
0c9c83043d Merge branch 'nvme-5.3-rc' of git://git.infradead.org/nvme into for-linus
Pull NVMe fixes from Sagi:

"Few nvme fixes for the next rc round.
- detect capacity changes on the mpath disk from Anthony
- probe/remove fix from Keith
- various fixes to pass blktests from Logan
- deadlock in reset/scan race fix
- nvme-rdma use-after-free fix
- deadlock fix when passthru commands race mpath disk info update"

* 'nvme-5.3-rc' of git://git.infradead.org/nvme:
  nvme-pci: Fix async probe remove race
  nvme: fix controller removal race with scan work
  nvme-rdma: fix possible use-after-free in connect error flow
  nvme: fix a possible deadlock when passthru commands sent to a multipath device
  nvme-core: Fix extra device_put() call on error path
  nvmet-file: fix nvmet_file_flush() always returning an error
  nvmet-loop: Flush nvme_delete_wq when removing the port
  nvmet: Fix use-after-free bug when a port is removed
  nvme-multipath: revalidate nvme_ns_head gendisk in nvme_validate_ns
2019-08-10 21:41:41 -06:00
Coly Li
20621fedb2 bcache: Revert "bcache: use sysfs_match_string() instead of __sysfs_match_string()"
This reverts commit 89e0341af0.

In drivers/md/bcache/sysfs.c:bch_snprint_string_list(), NULL pointer at
the end of list is necessary. Remove the NULL from last element of each
lists will cause the following panic,

[ 4340.455652] bcache: register_cache() registered cache device nvme0n1
[ 4340.464603] bcache: register_bdev() registered backing device sdk
[ 4421.587335] bcache: bch_cached_dev_run() cached dev sdk is running already
[ 4421.587348] bcache: bch_cached_dev_attach() Caching sdk as bcache0 on set 354e1d46-d99f-4d8b-870b-078b80dc88a6
[ 5139.247950] general protection fault: 0000 [#1] SMP NOPTI
[ 5139.247970] CPU: 9 PID: 5896 Comm: cat Not tainted 4.12.14-95.29-default #1 SLE12-SP4
[ 5139.247988] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 04/18/2019
[ 5139.248006] task: ffff888fb25c0b00 task.stack: ffff9bbacc704000
[ 5139.248021] RIP: 0010:string+0x21/0x70
[ 5139.248030] RSP: 0018:ffff9bbacc707bf0 EFLAGS: 00010286
[ 5139.248043] RAX: ffffffffa7e432e3 RBX: ffff8881c20da02a RCX: ffff0a00ffffff04
[ 5139.248058] RDX: 3f00656863616362 RSI: ffff8881c20db000 RDI: ffffffffffffffff
[ 5139.248075] RBP: ffff8881c20db000 R08: 0000000000000000 R09: ffff8881c20da02a
[ 5139.248090] R10: 0000000000000004 R11: 0000000000000000 R12: ffff9bbacc707c48
[ 5139.248104] R13: 0000000000000fd6 R14: ffffffffc0665855 R15: ffffffffc0665855
[ 5139.248119] FS:  00007faf253b8700(0000) GS:ffff88903f840000(0000) knlGS:0000000000000000
[ 5139.248137] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5139.248149] CR2: 00007faf25395008 CR3: 0000000f72150006 CR4: 00000000007606e0
[ 5139.248164] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5139.248179] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 5139.248193] PKRU: 55555554
[ 5139.248200] Call Trace:
[ 5139.248210]  vsnprintf+0x1fb/0x510
[ 5139.248221]  snprintf+0x39/0x40
[ 5139.248238]  bch_snprint_string_list.constprop.15+0x5b/0x90 [bcache]
[ 5139.248256]  __bch_cached_dev_show+0x44d/0x5f0 [bcache]
[ 5139.248270]  ? __alloc_pages_nodemask+0xb2/0x210
[ 5139.248284]  bch_cached_dev_show+0x2c/0x50 [bcache]
[ 5139.248297]  sysfs_kf_seq_show+0xbb/0x190
[ 5139.248308]  seq_read+0xfc/0x3c0
[ 5139.248317]  __vfs_read+0x26/0x140
[ 5139.248327]  vfs_read+0x87/0x130
[ 5139.248336]  SyS_read+0x42/0x90
[ 5139.248346]  do_syscall_64+0x74/0x160
[ 5139.248358]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 5139.248370] RIP: 0033:0x7faf24eea370
[ 5139.248379] RSP: 002b:00007fff82d03f38 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 5139.248395] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007faf24eea370
[ 5139.248411] RDX: 0000000000020000 RSI: 00007faf25396000 RDI: 0000000000000003
[ 5139.248426] RBP: 00007faf25396000 R08: 00000000ffffffff R09: 0000000000000000
[ 5139.248441] R10: 000000007c9d4d41 R11: 0000000000000246 R12: 00007faf25396000
[ 5139.248456] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000fff
[ 5139.248892] Code: ff ff ff 0f 1f 80 00 00 00 00 49 89 f9 48 89 cf 48 c7 c0 e3 32 e4 a7 48 c1 ff 30 48 81 fa ff 0f 00 00 48 0f 46 d0 48 85 ff 74 45 <44> 0f b6 02 48 8d 42 01 45 84 c0 74 38 48 01 fa 4c 89 cf eb 0e

The simplest way to fix is to revert commit 89e0341af0 ("bcache: use
sysfs_match_string() instead of __sysfs_match_string()").

This bug was introduced in Linux v5.2, so this fix only applies to
Linux v5.2 is enough for stable tree maintainer.

Fixes: 89e0341af0 ("bcache: use sysfs_match_string() instead of __sysfs_match_string()")
Cc: stable@vger.kernel.org
Cc: Alexandru Ardelean <alexandru.ardelean@analog.com>
Reported-by: Peifeng Lin <pflin@suse.com>
Acked-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-09 07:37:33 -06:00
Mikulas Patocka
d0a255e795 loop: set PF_MEMALLOC_NOIO for the worker thread
A deadlock with this stacktrace was observed.

The loop thread does a GFP_KERNEL allocation, it calls into dm-bufio
shrinker and the shrinker depends on I/O completion in the dm-bufio
subsystem.

In order to fix the deadlock (and other similar ones), we set the flag
PF_MEMALLOC_NOIO at loop thread entry.

PID: 474    TASK: ffff8813e11f4600  CPU: 10  COMMAND: "kswapd0"
   #0 [ffff8813dedfb938] __schedule at ffffffff8173f405
   #1 [ffff8813dedfb990] schedule at ffffffff8173fa27
   #2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec
   #3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186
   #4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f
   #5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8
   #6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81
   #7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio]
   #8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio]
   #9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio]
  #10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce
  #11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778
  #12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f
  #13 [ffff8813dedfbec0] kthread at ffffffff810a8428
  #14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242

  PID: 14127  TASK: ffff881455749c00  CPU: 11  COMMAND: "loop1"
   #0 [ffff88272f5af228] __schedule at ffffffff8173f405
   #1 [ffff88272f5af280] schedule at ffffffff8173fa27
   #2 [ffff88272f5af2a0] schedule_preempt_disabled at ffffffff8173fd5e
   #3 [ffff88272f5af2b0] __mutex_lock_slowpath at ffffffff81741fb5
   #4 [ffff88272f5af330] mutex_lock at ffffffff81742133
   #5 [ffff88272f5af350] dm_bufio_shrink_count at ffffffffa03865f9 [dm_bufio]
   #6 [ffff88272f5af380] shrink_slab at ffffffff811a86bd
   #7 [ffff88272f5af470] shrink_zone at ffffffff811ad778
   #8 [ffff88272f5af500] do_try_to_free_pages at ffffffff811adb34
   #9 [ffff88272f5af590] try_to_free_pages at ffffffff811adef8
  #10 [ffff88272f5af610] __alloc_pages_nodemask at ffffffff811a09c3
  #11 [ffff88272f5af710] alloc_pages_current at ffffffff811e8b71
  #12 [ffff88272f5af760] new_slab at ffffffff811f4523
  #13 [ffff88272f5af7b0] __slab_alloc at ffffffff8173a1b5
  #14 [ffff88272f5af880] kmem_cache_alloc at ffffffff811f484b
  #15 [ffff88272f5af8d0] do_blockdev_direct_IO at ffffffff812535b3
  #16 [ffff88272f5afb00] __blockdev_direct_IO at ffffffff81255dc3
  #17 [ffff88272f5afb30] xfs_vm_direct_IO at ffffffffa01fe3fc [xfs]
  #18 [ffff88272f5afb90] generic_file_read_iter at ffffffff81198994
  #19 [ffff88272f5afc50] __dta_xfs_file_read_iter_2398 at ffffffffa020c970 [xfs]
  #20 [ffff88272f5afcc0] lo_rw_aio at ffffffffa0377042 [loop]
  #21 [ffff88272f5afd70] loop_queue_work at ffffffffa0377c3b [loop]
  #22 [ffff88272f5afe60] kthread_worker_fn at ffffffff810a8a0c
  #23 [ffff88272f5afec0] kthread at ffffffff810a8428
  #24 [ffff88272f5aff50] ret_from_fork at ffffffff81745242

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-08 10:12:21 -06:00
Jan Kara
e91455bad5 bdev: Fixup error handling in blkdev_get()
Commit 89e524c04f ("loop: Fix mount(2) failure due to race with
LOOP_SET_FD") converted blkdev_get() to use the new helpers for
finishing claiming of a block device. However the conversion botched the
error handling in blkdev_get() and thus the bdev has been marked as held
even in case __blkdev_get() returned error. This led to occasional
warnings with block/001 test from blktests like:

kernel: WARNING: CPU: 5 PID: 907 at fs/block_dev.c:1899 __blkdev_put+0x396/0x3a0

Correct the error handling.

CC: stable@vger.kernel.org
Fixes: 89e524c04f ("loop: Fix mount(2) failure due to race with LOOP_SET_FD")
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-08 07:37:03 -06:00
Paolo Valente
fd03177c33 block, bfq: handle NULL return value by bfq_init_rq()
As reported in [1], the call bfq_init_rq(rq) may return NULL in case
of OOM (in particular, if rq->elv.icq is NULL because memory
allocation failed in failed in ioc_create_icq()).

This commit handles this circumstance.

[1] https://lkml.org/lkml/2019/7/22/824

Cc: Hsin-Yi Wang <hsinyi@google.com>
Cc: Nicolas Boichat <drinkcat@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Reported-by: Hsin-Yi Wang <hsinyi@google.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-08 07:31:50 -06:00
Paolo Valente
3f758e844a block, bfq: move update of waker and woken list to queue freeing
Since commit 13a857a4c4 ("block, bfq: detect wakers and
unconditionally inject their I/O"), every bfq_queue has a pointer to a
waker bfq_queue and a list of the bfq_queues it may wake. In this
respect, when a bfq_queue, say Q, remains with no I/O source attached
to it, Q cannot be woken by any other bfq_queue, and cannot wake any
other bfq_queue. Then Q must be removed from the woken list of its
possible waker bfq_queue, and all bfq_queues in the woken list of Q
must stop having a waker bfq_queue.

Q remains with no I/O source in two cases: when the last process
associated with Q exits or when such a process gets associated with a
different bfq_queue. Unfortunately, commit 13a857a4c4 ("block, bfq:
detect wakers and unconditionally inject their I/O") performed the
above updates only in the first case.

This commit fixes this bug by moving these updates to when Q gets
freed. This is a simple and safe way to handle all cases, as both the
above events, process exit and re-association, lead to Q being freed
soon, and because dangling references would come out only after Q gets
freed (if no update were performed).

Fixes: 13a857a4c4 ("block, bfq: detect wakers and unconditionally inject their I/O")
Reported-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-08 07:30:52 -06:00
Paolo Valente
08d383a749 block, bfq: reset last_completed_rq_bfqq if the pointed queue is freed
Since commit 13a857a4c4 ("block, bfq: detect wakers and
unconditionally inject their I/O"), BFQ stores, in a per-device
pointer last_completed_rq_bfqq, the last bfq_queue that had an I/O
request completed. If some bfq_queue receives new I/O right after the
last request of last_completed_rq_bfqq has been completed, then
last_completed_rq_bfqq may be a waker bfq_queue.

But if the bfq_queue last_completed_rq_bfqq points to is freed, then
last_completed_rq_bfqq becomes a dangling reference. This commit
resets last_completed_rq_bfqq if the pointed bfq_queue is freed.

Fixes: 13a857a4c4 ("block, bfq: detect wakers and unconditionally inject their I/O")
Reported-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-08 07:30:50 -06:00
He Zhe
430380b463 block: aoe: Fix kernel crash due to atomic sleep when exiting
Since commit 3582dd2917 ("aoe: convert aoeblk to blk-mq"), aoedev_downdev
has had the possibility of sleeping and causing the following crash.

BUG: scheduling while atomic: rmmod/2242/0x00000003
Modules linked in: aoe
Preemption disabled at:
[<ffffffffc01d95e5>] flush+0x95/0x4a0 [aoe]
CPU: 7 PID: 2242 Comm: rmmod Tainted: G          I       5.2.3 #1
Hardware name: Intel Corporation S5520HC/S5520HC, BIOS S5500.86B.01.10.0025.030220091519 03/02/2009
Call Trace:
 dump_stack+0x4f/0x6a
 ? flush+0x95/0x4a0 [aoe]
 __schedule_bug.cold+0x44/0x54
 __schedule+0x44f/0x680
 schedule+0x44/0xd0
 blk_mq_freeze_queue_wait+0x46/0xb0
 ? wait_woken+0x80/0x80
 blk_mq_freeze_queue+0x1b/0x20
 aoedev_downdev+0x111/0x160 [aoe]
 flush+0xff/0x4a0 [aoe]
 aoedev_exit+0x23/0x30 [aoe]
 aoe_exit+0x35/0x948 [aoe]
 __se_sys_delete_module+0x183/0x210
 __x64_sys_delete_module+0x16/0x20
 do_syscall_64+0x4d/0x130
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f24e0043b07
Code: 73 01 c3 48 8b 0d 89 73 0b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f
1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff
ff 73 01 c3 48 8b 0d 59 73 0b 00 f7 d8 64 89 01 48
RSP: 002b:00007ffe18f7f1e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f24e0043b07
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000555c3ecf87c8
RBP: 00007ffe18f7f1f0 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f24e00b4ac0 R11: 0000000000000206 R12: 00007ffe18f7f238
R13: 00007ffe18f7f410 R14: 00007ffe18f80e73 R15: 0000555c3ecf8760

This patch, handling in the same way of pass two, unlocks the locks and
restart pass one after aoedev_downdev is done.

Fixes: 3582dd2917 ("aoe: convert aoeblk to blk-mq")
Signed-off-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-08 07:29:02 -06:00
Jens Axboe
752ead4449 libata: add SG safety checks in SFF pio transfers
Abort processing of a command if we run out of mapped data in the
SG list. This should never happen, but a previous bug caused it to
be possible. Play it safe and attempt to abort nicely if we don't
have more SG segments left.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-07 12:23:57 -06:00
Jens Axboe
2d72715017 libata: have ata_scsi_rw_xlat() fail invalid passthrough requests
For passthrough requests, libata-scsi takes what the user passes in
as gospel. This can be problematic if the user fills in the CDB
incorrectly. One example of that is in request sizes. For read/write
commands, the CDB contains fields describing the transfer length of
the request. These should match with the SG_IO header fields, but
libata-scsi currently does no validation of that.

Check that the number of blocks in the CDB for passthrough requests
matches what was mapped into the request. If the CDB asks for more
data then the validated SG_IO header fields, error it.

Reported-by: Krishna Ram Prakash R <krp@gtux.in>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-07 12:20:52 -06:00
Jens Axboe
e15c2ffa10 block: fix O_DIRECT error handling for bio fragments
0eb6ddfb86 tried to fix this up, but introduced a use-after-free
of dio. Additionally, we still had an issue with error handling,
as reported by Darrick:

"I noticed a regression in xfs/747 (an unreleased xfstest for the
xfs_scrub media scanning feature) on 5.3-rc3.  I'll condense that down
to a simpler reproducer:

error-test: 0 209 linear 8:48 0
error-test: 209 1 error
error-test: 210 6446894 linear 8:48 210

Basically we have a ~3G /dev/sdd and we set up device mapper to fail IO
for sector 209 and to pass the io to the scsi device everywhere else.

On 5.3-rc3, performing a directio pread of this range with a < 1M buffer
(in other words, a request for fewer than MAX_BIO_PAGES bytes) yields
EIO like you'd expect:

pread64(3, 0x7f880e1c7000, 1048576, 0)  = -1 EIO (Input/output error)
pread: Input/output error
+++ exited with 0 +++

But doing it with a larger buffer succeeds(!):

pread64(3, "XFSB\0\0\20\0\0\0\0\0\0\fL\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1146880, 0) = 1146880
read 1146880/1146880 bytes at offset 0
1 MiB, 1 ops; 0.0009 sec (1.124 GiB/sec and 1052.6316 ops/sec)
+++ exited with 0 +++

(Note that the part of the buffer corresponding to the dm-error area is
uninitialized)

On 5.3-rc2, both commands would fail with EIO like you'd expect.  The
only change between rc2 and rc3 is commit 0eb6ddfb86 ("block: Fix
__blkdev_direct_IO() for bio fragments").

AFAICT we end up in __blkdev_direct_IO with a 1120K buffer, which gets
split into two bios: one for the first BIO_MAX_PAGES worth of data (1MB)
and a second one for the 96k after that."

Fix this by noting that it's always safe to dereference dio if we get
BLK_QC_T_EAGAIN returned, as end_io hasn't been run for that case. So
we can safely increment the dio size before calling submit_bio(), and
then decrement it on failure (not that it really matters, as the bio
and dio are going away).

For error handling, return to the original method of just using 'ret'
for tracking the error, and the size tracking in dio->size.

Fixes: 0eb6ddfb86 ("block: Fix __blkdev_direct_IO() for bio fragments")
Fixes: 6a43074e2f ("block: properly handle IOCB_NOWAIT for async O_DIRECT IO")
Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-07 12:19:43 -06:00
Gustavo A. R. Silva
db341a049e ata: rb532_cf: Fix unused variable warning in rb532_pata_driver_probe
Fix the following warning (Building: rb532_defconfig mips):

drivers/ata/pata_rb532_cf.c: In function ‘rb532_pata_driver_remove’:
drivers/ata/pata_rb532_cf.c:161:24: warning: unused variable ‘info’ [-Wunused-variable]
  struct rb532_cf_info *info = ah->private_data;
                        ^~~~

Fixes: cd56f35e52 ("ata: rb532_cf: Convert to use GPIO descriptors")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-06 07:44:59 -06:00
Stefan Haberland
41995342b4 s390/dasd: fix endless loop after read unit address configuration
After getting a storage server event that causes the DASD device driver
to update its unit address configuration during a device shutdown there is
the possibility of an endless loop in the device driver.

In the system log there will be ongoing DASD error messages with RC: -19.

The reason is that the loop starting the ruac request only terminates when
the retry counter is decreased to 0. But in the sleep_on function there are
early exit paths that do not decrease the retry counter.

Prevent an endless loop by handling those cases separately.

Remove the unnecessary do..while loop since the sleep_on function takes
care of retries by itself.

Fixes: 8e09f21574 ("[S390] dasd: add hyper PAV support to DASD device driver, part 1")
Cc: stable@vger.kernel.org # 2.6.25+
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-01 20:46:14 -06:00
Damien Le Moal
0eb6ddfb86 block: Fix __blkdev_direct_IO() for bio fragments
The recent fix to properly handle IOCB_NOWAIT for async O_DIRECT IO
(patch 6a43074e2f) introduced two problems with BIO fragment handling
for direct IOs:
1) The dio size processed is calculated by incrementing the ret variable
by the size of the bio fragment issued for the dio. However, this size
is obtained directly from bio->bi_iter.bi_size AFTER the bio submission
which may result in referencing the bi_size value after the bio
completed, resulting in an incorrect value use.
2) The ret variable is not incremented by the size of the last bio
fragment issued for the bio, leading to an invalid IO size being
returned to the user.

Fix both problem by using dio->size (which is incremented before the bio
submission) to update the value of ret after bio submissions, including
for the last bio fragment issued.

Fixes: 6a43074e2f ("block: properly handle IOCB_NOWAIT for async O_DIRECT IO")
Reported-by: Masato Suzuki <masato.suzuki@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-01 13:51:18 -06:00
Keith Busch
bd46a90634 nvme-pci: Fix async probe remove race
Ensure the controller is not in the NEW state when nvme_probe() exits.
This will always allow a subsequent nvme_remove() to set the state to
DELETING, fixing a potential race between the initial asynchronous probe
and device removal.

Reported-by: Li Zhong <lizhongfs@gmail.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-07-31 18:03:36 -07:00
Sagi Grimberg
0157ec8dad nvme: fix controller removal race with scan work
With multipath enabled, nvme_scan_work() can read from the device
(through nvme_mpath_add_disk()) and hang [1]. However, with fabrics,
once ctrl->state is set to NVME_CTRL_DELETING, the reads will hang
(see nvmf_check_ready()) and the mpath stack device make_request
will block if head->list is not empty. However, when the head->list
consistst of only DELETING/DEAD controllers, we should actually not
block, but rather fail immediately.

In addition, before we go ahead and remove the namespaces, make sure
to clear the current path and kick the requeue list so that the
request will fast fail upon requeuing.

[1]:
--
  INFO: task kworker/u4:3:166 blocked for more than 120 seconds.
        Not tainted 5.2.0-rc6-vmlocalyes-00005-g808c8c2dc0cf #316
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  kworker/u4:3    D    0   166      2 0x80004000
  Workqueue: nvme-wq nvme_scan_work
  Call Trace:
   __schedule+0x851/0x1400
   schedule+0x99/0x210
   io_schedule+0x21/0x70
   do_read_cache_page+0xa57/0x1330
   read_cache_page+0x4a/0x70
   read_dev_sector+0xbf/0x380
   amiga_partition+0xc4/0x1230
   check_partition+0x30f/0x630
   rescan_partitions+0x19a/0x980
   __blkdev_get+0x85a/0x12f0
   blkdev_get+0x2a5/0x790
   __device_add_disk+0xe25/0x1250
   device_add_disk+0x13/0x20
   nvme_mpath_set_live+0x172/0x2b0
   nvme_update_ns_ana_state+0x130/0x180
   nvme_set_ns_ana_state+0x9a/0xb0
   nvme_parse_ana_log+0x1c3/0x4a0
   nvme_mpath_add_disk+0x157/0x290
   nvme_validate_ns+0x1017/0x1bd0
   nvme_scan_work+0x44d/0x6a0
   process_one_work+0x7d7/0x1240
   worker_thread+0x8e/0xff0
   kthread+0x2c3/0x3b0
   ret_from_fork+0x35/0x40

   INFO: task kworker/u4:1:1034 blocked for more than 120 seconds.
        Not tainted 5.2.0-rc6-vmlocalyes-00005-g808c8c2dc0cf #316
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  kworker/u4:1    D    0  1034      2 0x80004000
  Workqueue: nvme-delete-wq nvme_delete_ctrl_work
  Call Trace:
   __schedule+0x851/0x1400
   schedule+0x99/0x210
   schedule_timeout+0x390/0x830
   wait_for_completion+0x1a7/0x310
   __flush_work+0x241/0x5d0
   flush_work+0x10/0x20
   nvme_remove_namespaces+0x85/0x3d0
   nvme_do_delete_ctrl+0xb4/0x1e0
   nvme_delete_ctrl_work+0x15/0x20
   process_one_work+0x7d7/0x1240
   worker_thread+0x8e/0xff0
   kthread+0x2c3/0x3b0
   ret_from_fork+0x35/0x40
--

Reported-by: Logan Gunthorpe <logang@deltatee.com>
Tested-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-07-31 18:01:56 -07:00
Sagi Grimberg
d94211b8ba nvme-rdma: fix possible use-after-free in connect error flow
When start_queue fails, we need to make sure to drain the
queue cq before freeing the rdma resources because we might
still race with the completion path. Have start_queue() error
path safely stop the queue.

--
[30371.808111] nvme nvme1: Failed reconnect attempt 11
[30371.808113] nvme nvme1: Reconnecting in 10 seconds...
[...]
[30382.069315] nvme nvme1: creating 4 I/O queues.
[30382.257058] nvme nvme1: Connect Invalid SQE Parameter, qid 4
[30382.257061] nvme nvme1: failed to connect queue: 4 ret=386
[30382.305001] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[30382.305022] IP: qedr_poll_cq+0x8a3/0x1170 [qedr]
[30382.305028] PGD 0 P4D 0
[30382.305037] Oops: 0000 [#1] SMP PTI
[...]
[30382.305153] Call Trace:
[30382.305166]  ? __switch_to_asm+0x34/0x70
[30382.305187]  __ib_process_cq+0x56/0xd0 [ib_core]
[30382.305201]  ib_poll_handler+0x26/0x70 [ib_core]
[30382.305213]  irq_poll_softirq+0x88/0x110
[30382.305223]  ? sort_range+0x20/0x20
[30382.305232]  __do_softirq+0xde/0x2c6
[30382.305241]  ? sort_range+0x20/0x20
[30382.305249]  run_ksoftirqd+0x1c/0x60
[30382.305258]  smpboot_thread_fn+0xef/0x160
[30382.305265]  kthread+0x113/0x130
[30382.305273]  ? kthread_create_worker_on_cpu+0x50/0x50
[30382.305281]  ret_from_fork+0x35/0x40
--

Reported-by: Nicolas Morey-Chaisemartin <NMoreyChaisemartin@suse.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-07-31 18:00:07 -07:00
Sagi Grimberg
b9156daeb1 nvme: fix a possible deadlock when passthru commands sent to a multipath device
When the user issues a command with side effects, we will end up freezing
the namespace request queue when updating disk info (and the same for
the corresponding mpath disk node).

However, we are not freezing the mpath node request queue,
which means that mpath I/O can still come in and block on blk_queue_enter
(called from nvme_ns_head_make_request -> direct_make_request).

This is a deadlock, because blk_queue_enter will block until the inner
namespace request queue is unfroze, but that process is blocked because
the namespace revalidation is trying to update the mpath disk info
and freeze its request queue (which will never complete because
of the I/O that is blocked on blk_queue_enter).

Fix this by freezing all the subsystem nsheads request queues before
executing the passthru command. Given that these commands are infrequent
we should not worry about this temporary I/O freeze to keep things sane.

Here is the matching hang traces:
--
[ 374.465002] INFO: task systemd-udevd:17994 blocked for more than 122 seconds.
[ 374.472975] Not tainted 5.2.0-rc3-mpdebug+ #42
[ 374.478522] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 374.487274] systemd-udevd D 0 17994 1 0x00000000
[ 374.493407] Call Trace:
[ 374.496145] __schedule+0x2ef/0x620
[ 374.500047] schedule+0x38/0xa0
[ 374.503569] blk_queue_enter+0x139/0x220
[ 374.507959] ? remove_wait_queue+0x60/0x60
[ 374.512540] direct_make_request+0x60/0x130
[ 374.517219] nvme_ns_head_make_request+0x11d/0x420 [nvme_core]
[ 374.523740] ? generic_make_request_checks+0x307/0x6f0
[ 374.529484] generic_make_request+0x10d/0x2e0
[ 374.534356] submit_bio+0x75/0x140
[ 374.538163] ? guard_bio_eod+0x32/0xe0
[ 374.542361] submit_bh_wbc+0x171/0x1b0
[ 374.546553] block_read_full_page+0x1ed/0x330
[ 374.551426] ? check_disk_change+0x70/0x70
[ 374.556008] ? scan_shadow_nodes+0x30/0x30
[ 374.560588] blkdev_readpage+0x18/0x20
[ 374.564783] do_read_cache_page+0x301/0x860
[ 374.569463] ? blkdev_writepages+0x10/0x10
[ 374.574037] ? prep_new_page+0x88/0x130
[ 374.578329] ? get_page_from_freelist+0xa2f/0x1280
[ 374.583688] ? __alloc_pages_nodemask+0x179/0x320
[ 374.588947] read_cache_page+0x12/0x20
[ 374.593142] read_dev_sector+0x2d/0xd0
[ 374.597337] read_lba+0x104/0x1f0
[ 374.601046] find_valid_gpt+0xfa/0x720
[ 374.605243] ? string_nocheck+0x58/0x70
[ 374.609534] ? find_valid_gpt+0x720/0x720
[ 374.614016] efi_partition+0x89/0x430
[ 374.618113] ? string+0x48/0x60
[ 374.621632] ? snprintf+0x49/0x70
[ 374.625339] ? find_valid_gpt+0x720/0x720
[ 374.629828] check_partition+0x116/0x210
[ 374.634214] rescan_partitions+0xb6/0x360
[ 374.638699] __blkdev_reread_part+0x64/0x70
[ 374.643377] blkdev_reread_part+0x23/0x40
[ 374.647860] blkdev_ioctl+0x48c/0x990
[ 374.651956] block_ioctl+0x41/0x50
[ 374.655766] do_vfs_ioctl+0xa7/0x600
[ 374.659766] ? locks_lock_inode_wait+0xb1/0x150
[ 374.664832] ksys_ioctl+0x67/0x90
[ 374.668539] __x64_sys_ioctl+0x1a/0x20
[ 374.672732] do_syscall_64+0x5a/0x1c0
[ 374.676828] entry_SYSCALL_64_after_hwframe+0x44/0xa9

[ 374.738474] INFO: task nvmeadm:49141 blocked for more than 123 seconds.
[ 374.745871] Not tainted 5.2.0-rc3-mpdebug+ #42
[ 374.751419] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 374.760170] nvmeadm D 0 49141 36333 0x00004080
[ 374.766301] Call Trace:
[ 374.769038] __schedule+0x2ef/0x620
[ 374.772939] schedule+0x38/0xa0
[ 374.776452] blk_mq_freeze_queue_wait+0x59/0x100
[ 374.781614] ? remove_wait_queue+0x60/0x60
[ 374.786192] blk_mq_freeze_queue+0x1a/0x20
[ 374.790773] nvme_update_disk_info.isra.57+0x5f/0x350 [nvme_core]
[ 374.797582] ? nvme_identify_ns.isra.50+0x71/0xc0 [nvme_core]
[ 374.804006] __nvme_revalidate_disk+0xe5/0x110 [nvme_core]
[ 374.810139] nvme_revalidate_disk+0xa6/0x120 [nvme_core]
[ 374.816078] ? nvme_submit_user_cmd+0x11e/0x320 [nvme_core]
[ 374.822299] nvme_user_cmd+0x264/0x370 [nvme_core]
[ 374.827661] nvme_dev_ioctl+0x112/0x1d0 [nvme_core]
[ 374.833114] do_vfs_ioctl+0xa7/0x600
[ 374.837117] ? __audit_syscall_entry+0xdd/0x130
[ 374.842184] ksys_ioctl+0x67/0x90
[ 374.845891] __x64_sys_ioctl+0x1a/0x20
[ 374.850082] do_syscall_64+0x5a/0x1c0
[ 374.854178] entry_SYSCALL_64_after_hwframe+0x44/0xa9
--

Reported-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
Tested-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-07-31 17:59:01 -07:00
Logan Gunthorpe
8c36e66fb4 nvme-core: Fix extra device_put() call on error path
In the error path for nvme_init_subsystem(), nvme_put_subsystem()
will call device_put(), but it will get called again after the
mutex_unlock().

The device_put() only needs to be called if device_add() fails.

This bug caused a KASAN use-after-free error when adding and
removing subsytems in a loop:

  BUG: KASAN: use-after-free in device_del+0x8d9/0x9a0
  Read of size 8 at addr ffff8883cdaf7120 by task multipathd/329

  CPU: 0 PID: 329 Comm: multipathd Not tainted 5.2.0-rc6-vmlocalyes-00019-g70a2b39005fd-dirty #314
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
  Call Trace:
   dump_stack+0x7b/0xb5
   print_address_description+0x6f/0x280
   ? device_del+0x8d9/0x9a0
   __kasan_report+0x148/0x199
   ? device_del+0x8d9/0x9a0
   ? class_release+0x100/0x130
   ? device_del+0x8d9/0x9a0
   kasan_report+0x12/0x20
   __asan_report_load8_noabort+0x14/0x20
   device_del+0x8d9/0x9a0
   ? device_platform_notify+0x70/0x70
   nvme_destroy_subsystem+0xf9/0x150
   nvme_free_ctrl+0x280/0x3a0
   device_release+0x72/0x1d0
   kobject_put+0x144/0x410
   put_device+0x13/0x20
   nvme_free_ns+0xc4/0x100
   nvme_release+0xb3/0xe0
   __blkdev_put+0x549/0x6e0
   ? kasan_check_write+0x14/0x20
   ? bd_set_size+0xb0/0xb0
   ? kasan_check_write+0x14/0x20
   ? mutex_lock+0x8f/0xe0
   ? __mutex_lock_slowpath+0x20/0x20
   ? locks_remove_file+0x239/0x370
   blkdev_put+0x72/0x2c0
   blkdev_close+0x8d/0xd0
   __fput+0x256/0x770
   ? _raw_read_lock_irq+0x40/0x40
   ____fput+0xe/0x10
   task_work_run+0x10c/0x180
   ? filp_close+0xf7/0x140
   exit_to_usermode_loop+0x151/0x170
   do_syscall_64+0x240/0x2e0
   ? prepare_exit_to_usermode+0xd5/0x190
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f5a79af05d7
  Code: 00 00 0f 05 48 3d 00 f0 ff ff 77 3f c3 66 0f 1f 44 00 00 53 89 fb 48 83 ec 10 e8 c4 fb ff ff 89 df 89 c2 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2b 89 d7 89 44 24 0c e8 06 fc ff ff 8b 44 24
  RSP: 002b:00007f5a7799c810 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
  RAX: 0000000000000000 RBX: 0000000000000008 RCX: 00007f5a79af05d7
  RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000008
  RBP: 00007f5a58000f98 R08: 0000000000000002 R09: 00007f5a7935ee80
  R10: 0000000000000000 R11: 0000000000000293 R12: 000055e432447240
  R13: 0000000000000000 R14: 0000000000000001 R15: 000055e4324a9cf0

  Allocated by task 1236:
   save_stack+0x21/0x80
   __kasan_kmalloc.constprop.6+0xab/0xe0
   kasan_kmalloc+0x9/0x10
   kmem_cache_alloc_trace+0x102/0x210
   nvme_init_identify+0x13c3/0x3820
   nvme_loop_configure_admin_queue+0x4fa/0x5e0
   nvme_loop_create_ctrl+0x469/0xf40
   nvmf_dev_write+0x19a3/0x21ab
   __vfs_write+0x66/0x120
   vfs_write+0x154/0x490
   ksys_write+0x104/0x240
   __x64_sys_write+0x73/0xb0
   do_syscall_64+0xa5/0x2e0
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

  Freed by task 329:
   save_stack+0x21/0x80
   __kasan_slab_free+0x129/0x190
   kasan_slab_free+0xe/0x10
   kfree+0xa7/0x200
   nvme_release_subsystem+0x49/0x60
   device_release+0x72/0x1d0
   kobject_put+0x144/0x410
   put_device+0x13/0x20
   klist_class_dev_put+0x31/0x40
   klist_put+0x8f/0xf0
   klist_del+0xe/0x10
   device_del+0x3a7/0x9a0
   nvme_destroy_subsystem+0xf9/0x150
   nvme_free_ctrl+0x280/0x3a0
   device_release+0x72/0x1d0
   kobject_put+0x144/0x410
   put_device+0x13/0x20
   nvme_free_ns+0xc4/0x100
   nvme_release+0xb3/0xe0
   __blkdev_put+0x549/0x6e0
   blkdev_put+0x72/0x2c0
   blkdev_close+0x8d/0xd0
   __fput+0x256/0x770
   ____fput+0xe/0x10
   task_work_run+0x10c/0x180
   exit_to_usermode_loop+0x151/0x170
   do_syscall_64+0x240/0x2e0
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 32fd90c407 ("nvme: change locking for the per-subsystem controller list")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by : Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-07-31 17:57:24 -07:00
Logan Gunthorpe
cfc1a1af56 nvmet-file: fix nvmet_file_flush() always returning an error
Presently, nvmet_file_flush() always returns a call to
errno_to_nvme_status() but that helper doesn't take into account the
case when errno=0. So nvmet_file_flush() always returns an error code.

All other callers of errno_to_nvme_status() check for success before
calling it.

To fix this, ensure errno_to_nvme_status() returns success if the
errno is zero. This should prevent future mistakes like this from
happening.

Fixes: c6aa3542e0 ("nvmet: add error log support for file backend")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-07-31 17:57:21 -07:00
Logan Gunthorpe
86b9a63e59 nvmet-loop: Flush nvme_delete_wq when removing the port
After calling nvme_loop_delete_ctrl(), the controllers will not
yet be deleted because nvme_delete_ctrl() only schedules work
to do the delete.

This means a race can occur if a port is removed but there
are still active controllers trying to access that memory.

To fix this, flush the nvme_delete_wq before returning from
nvme_loop_remove_port() so that any controllers that might
be in the process of being deleted won't access a freed port.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by : Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-07-31 17:57:17 -07:00
Logan Gunthorpe
3aed86731e nvmet: Fix use-after-free bug when a port is removed
When a port is removed through configfs, any connected controllers
are still active and can still send commands. This causes a
use-after-free bug which is detected by KASAN for any admin command
that dereferences req->port (like in nvmet_execute_identify_ctrl).

To fix this, disconnect all active controllers when a subsystem is
removed from a port. This ensures there are no active controllers
when the port is eventually removed.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by : Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-07-31 17:57:06 -07:00
Denis Efremov
3d0b63c5df MAINTAINERS: floppy: take over maintainership
I would like to maintain the floppy driver. After the recent fixes,
I think I know the code pretty well. Nowadays I've got 2 physical 3.5"
readers to test all the changes.

Signed-off-by: Denis Efremov <efremov@linux.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-31 09:29:43 -06:00
Munehisa Kamata
2b5c8f0063 nbd: replace kill_bdev() with __invalidate_device() again
Commit abbbdf1249 ("replace kill_bdev() with __invalidate_device()")
once did this, but 29eaadc036 ("nbd: stop using the bdev everywhere")
resurrected kill_bdev() and it has been there since then. So buffer_head
mappings still get killed on a server disconnection, and we can still
hit the BUG_ON on a filesystem on the top of the nbd device.

  EXT4-fs (nbd0): mounted filesystem with ordered data mode. Opts: (null)
  block nbd0: Receive control failed (result -32)
  block nbd0: shutting down sockets
  print_req_error: I/O error, dev nbd0, sector 66264 flags 3000
  EXT4-fs warning (device nbd0): htree_dirblock_to_tree:979: inode #2: lblock 0: comm ls: error -5 reading directory block
  print_req_error: I/O error, dev nbd0, sector 2264 flags 3000
  EXT4-fs error (device nbd0): __ext4_get_inode_loc:4690: inode #2: block 283: comm ls: unable to read itable block
  EXT4-fs error (device nbd0) in ext4_reserve_inode_write:5894: IO failure
  ------------[ cut here ]------------
  kernel BUG at fs/buffer.c:3057!
  invalid opcode: 0000 [#1] SMP PTI
  CPU: 7 PID: 40045 Comm: jbd2/nbd0-8 Not tainted 5.1.0-rc3+ #4
  Hardware name: Amazon EC2 m5.12xlarge/, BIOS 1.0 10/16/2017
  RIP: 0010:submit_bh_wbc+0x18b/0x190
  ...
  Call Trace:
   jbd2_write_superblock+0xf1/0x230 [jbd2]
   ? account_entity_enqueue+0xc5/0xf0
   jbd2_journal_update_sb_log_tail+0x94/0xe0 [jbd2]
   jbd2_journal_commit_transaction+0x12f/0x1d20 [jbd2]
   ? __switch_to_asm+0x40/0x70
   ...
   ? lock_timer_base+0x67/0x80
   kjournald2+0x121/0x360 [jbd2]
   ? remove_wait_queue+0x60/0x60
   kthread+0xf8/0x130
   ? commit_timeout+0x10/0x10 [jbd2]
   ? kthread_bind+0x10/0x10
   ret_from_fork+0x35/0x40

With __invalidate_device(), I no longer hit the BUG_ON with sync or
unmount on the disconnected device.

Fixes: 29eaadc036 ("nbd: stop using the bdev everywhere")
Cc: linux-block@vger.kernel.org
Cc: Ratna Manoj Bolla <manoj.br@gmail.com>
Cc: nbd@other.debian.org
Cc: stable@vger.kernel.org
Cc: David Woodhouse <dwmw@amazon.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Munehisa Kamata <kamatam@amazon.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-31 08:51:56 -06:00
Miquel Raynal
090bb80370 ata: libahci: do not complain in case of deferred probe
Retrieving PHYs can defer the probe, do not spawn an error when
-EPROBE_DEFER is returned, it is normal behavior.

Fixes: b1a9edbda0 ("ata: libahci: allow to use multiple PHYs")
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-31 08:51:17 -06:00
Jackie Liu
d0ee879187 io_uring: fix KASAN use after free in io_sq_wq_submit_work
[root@localhost ~]# ./liburing/test/link

QEMU Standard PC report that:

[   29.379892] CPU: 0 PID: 84 Comm: kworker/u2:2 Not tainted 5.3.0-rc2-00051-g4010b622f1d2-dirty #86
[   29.379902] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[   29.379913] Workqueue: io_ring-wq io_sq_wq_submit_work
[   29.379929] Call Trace:
[   29.379953]  dump_stack+0xa9/0x10e
[   29.379970]  ? io_sq_wq_submit_work+0xbf4/0xe90
[   29.379986]  print_address_description.cold.6+0x9/0x317
[   29.379999]  ? io_sq_wq_submit_work+0xbf4/0xe90
[   29.380010]  ? io_sq_wq_submit_work+0xbf4/0xe90
[   29.380026]  __kasan_report.cold.7+0x1a/0x34
[   29.380044]  ? io_sq_wq_submit_work+0xbf4/0xe90
[   29.380061]  kasan_report+0xe/0x12
[   29.380076]  io_sq_wq_submit_work+0xbf4/0xe90
[   29.380104]  ? io_sq_thread+0xaf0/0xaf0
[   29.380152]  process_one_work+0xb59/0x19e0
[   29.380184]  ? pwq_dec_nr_in_flight+0x2c0/0x2c0
[   29.380221]  worker_thread+0x8c/0xf40
[   29.380248]  ? __kthread_parkme+0xab/0x110
[   29.380265]  ? process_one_work+0x19e0/0x19e0
[   29.380278]  kthread+0x30b/0x3d0
[   29.380292]  ? kthread_create_on_node+0xe0/0xe0
[   29.380311]  ret_from_fork+0x3a/0x50

[   29.380635] Allocated by task 209:
[   29.381255]  save_stack+0x19/0x80
[   29.381268]  __kasan_kmalloc.constprop.6+0xc1/0xd0
[   29.381279]  kmem_cache_alloc+0xc0/0x240
[   29.381289]  io_submit_sqe+0x11bc/0x1c70
[   29.381300]  io_ring_submit+0x174/0x3c0
[   29.381311]  __x64_sys_io_uring_enter+0x601/0x780
[   29.381322]  do_syscall_64+0x9f/0x4d0
[   29.381336]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

[   29.381633] Freed by task 84:
[   29.382186]  save_stack+0x19/0x80
[   29.382198]  __kasan_slab_free+0x11d/0x160
[   29.382210]  kmem_cache_free+0x8c/0x2f0
[   29.382220]  io_put_req+0x22/0x30
[   29.382230]  io_sq_wq_submit_work+0x28b/0xe90
[   29.382241]  process_one_work+0xb59/0x19e0
[   29.382251]  worker_thread+0x8c/0xf40
[   29.382262]  kthread+0x30b/0x3d0
[   29.382272]  ret_from_fork+0x3a/0x50

[   29.382569] The buggy address belongs to the object at ffff888067172140
                which belongs to the cache io_kiocb of size 224
[   29.384692] The buggy address is located 120 bytes inside of
                224-byte region [ffff888067172140, ffff888067172220)
[   29.386723] The buggy address belongs to the page:
[   29.387575] page:ffffea00019c5c80 refcount:1 mapcount:0 mapping:ffff88806ace5180 index:0x0
[   29.387587] flags: 0x100000000000200(slab)
[   29.387603] raw: 0100000000000200 dead000000000100 dead000000000122 ffff88806ace5180
[   29.387617] raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
[   29.387624] page dumped because: kasan: bad access detected

[   29.387920] Memory state around the buggy address:
[   29.388771]  ffff888067172080: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
[   29.390062]  ffff888067172100: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
[   29.391325] >ffff888067172180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   29.392578]                                         ^
[   29.393480]  ffff888067172200: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc
[   29.394744]  ffff888067172280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   29.396003] ==================================================================
[   29.397260] Disabling lock debugging due to kernel taint

io_sq_wq_submit_work free and read req again.

Cc: Zhengyuan Liu <liuzhengyuan@kylinos.cn>
Cc: linux-block@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: f7b76ac9d1 ("io_uring: fix counter inc/dec mismatch in async_list")
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-31 08:45:10 -06:00
Jan Kara
89e524c04f loop: Fix mount(2) failure due to race with LOOP_SET_FD
Commit 33ec3e53e7 ("loop: Don't change loop device under exclusive
opener") made LOOP_SET_FD ioctl acquire exclusive block device reference
while it updates loop device binding. However this can make perfectly
valid mount(2) fail with EBUSY due to racing LOOP_SET_FD holding
temporarily the exclusive bdev reference in cases like this:

for i in {a..z}{a..z}; do
        dd if=/dev/zero of=$i.image bs=1k count=0 seek=1024
        mkfs.ext2 $i.image
        mkdir mnt$i
done

echo "Run"
for i in {a..z}{a..z}; do
        mount -o loop -t ext2 $i.image mnt$i &
done

Fix the problem by not getting full exclusive bdev reference in
LOOP_SET_FD but instead just mark the bdev as being claimed while we
update the binding information. This just blocks new exclusive openers
instead of failing them with EBUSY thus fixing the problem.

Fixes: 33ec3e53e7 ("loop: Don't change loop device under exclusive opener")
Cc: stable@vger.kernel.org
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-30 13:16:57 -06:00
Anthony Iliopoulos
fab7772bfb nvme-multipath: revalidate nvme_ns_head gendisk in nvme_validate_ns
When CONFIG_NVME_MULTIPATH is set, only the hidden gendisk associated
with the per-controller ns is run through revalidate_disk when a
rescan is triggered, while the visible blockdev never gets its size
(bdev->bd_inode->i_size) updated to reflect any capacity changes that
may have occurred.

This prevents online resizing of nvme block devices and in extension of
any filesystems atop that will are unable to expand while mounted, as
userspace relies on the blockdev size for obtaining the disk capacity
(via BLKGETSIZE/64 ioctls).

Fix this by explicitly revalidating the actual namespace gendisk in
addition to the per-controller gendisk, when multipath is enabled.

Signed-off-by: Anthony Iliopoulos <ailiopoulos@suse.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-07-30 13:29:20 +03:00
Kees Cook
71d6c505b4 libata: zpodd: Fix small read overflow in zpodd_get_mech_type()
Jeffrin reported a KASAN issue:

  BUG: KASAN: global-out-of-bounds in ata_exec_internal_sg+0x50f/0xc70
  Read of size 16 at addr ffffffff91f41f80 by task scsi_eh_1/149
  ...
  The buggy address belongs to the variable:
    cdb.48319+0x0/0x40

Much like commit 18c9a99bce ("libata: zpodd: small read overflow in
eject_tray()"), this fixes a cdb[] buffer length, this time in
zpodd_get_mech_type():

We read from the cdb[] buffer in ata_exec_internal_sg(). It has to be
ATAPI_CDB_LEN (16) bytes long, but this buffer is only 12 bytes.

Reported-by: Jeffrin Jose T <jeffrin@rajagiritech.edu.in>
Fixes: afe7595118 ("libata: identify and init ZPODD devices")
Link: https://lore.kernel.org/lkml/201907181423.E808958@keescook/
Tested-by: Jeffrin Jose T <jeffrin@rajagiritech.edu.in>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-29 16:00:14 -06:00
Gustavo A. R. Silva
7be21763f7 ataflop: Mark expected switch fall-through
Mark switch cases where we are expecting to fall through.

This patch fixes the following warning (Building: m68k):

drivers/block/ataflop.c: In function ‘fd_locked_ioctl’:
drivers/block/ataflop.c:1728:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
   set_capacity(floppy->disk, MAX_DISK_SIZE * 2);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/block/ataflop.c:1729:2: note: here
  case FDFMTEND:
  ^~~~

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-29 15:24:58 -06:00
Linus Torvalds
609488bc97 Linux 5.3-rc2 2019-07-28 12:47:02 -07:00
Linus Torvalds
c622fc5f54 meminit fix
- Disable gcc-based stack variable auto-init under KASAN (Arnd Bergmann)
 -----BEGIN PGP SIGNATURE-----
 Comment: Kees Cook <kees@outflux.net>
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl099MsWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJr6GD/0Xl/YxeXPnKIHoafoqMCBAY12f
 OnRZ2N6YCikYfLwgBnTAAyQi3P0qU8ffjt4LjoPxzByUPBmZ+VkUBXU1eNUuU0mT
 4CX+ZakeWp5atbg7Ja7DAThBrJS4DYRzXiGB1Is8IACD/zkkRDoGU1tN+3nubtlk
 F2SYtmJBz/6pje2ksLDmuSS1sapaom7Cs4khB/oDb8HOsqydS0CpzN7Oa/Di3HoZ
 yUbyM3bcgmYECasGt7zVOLzr/EcI4T7rtLhMTnFBMbfckQJBPc7UpaLTt9pxMVqO
 Vo7SH/q8atmp3aThT3XbEYbSvx4kUdHZYcuMogPe8T+3Bx4i9gWGnmpqF94P0Kl8
 SZgY92JEhF92PwVTi7ztAfAZQDunVm60c/Lp44r0q/lGQKZLXP8jQXd7KmL6dnPI
 gDnispJnNdNxVSVDx/r3yjSRh0VCA3yv01ed/pusCrxX48sEw7ExwswEJBy12O3s
 rUY7Xx/U+eIP+E+4B7ddlzTFy+0t6HQ0q0LLtbiim1ELF+8ZBnAvCMnm49SQbpEQ
 UMgO/bCAGkGu88uR3sclIwUbaR9oCCxkZO0YuLvAnGoMJ7JaYQlDmDqe/lWP7VjV
 HEmJxDpJE9SgmVtYkfz3aOEds5nSspRQOQfQpnq/JxjRQTSfriSpDpl72d5qk1CH
 WHAM8lviqVg/uT6r2Q==
 =z0XP
 -----END PGP SIGNATURE-----

Merge tag 'meminit-v5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull structleak fix from Kees Cook:
 "Disable gcc-based stack variable auto-init under KASAN (Arnd
  Bergmann).

  This fixes a bunch of build warnings under KASAN and the
  gcc-plugin-based stack auto-initialization features (which are
  arguably redundant, so better to let KASAN control this)"

* tag 'meminit-v5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  structleak: disable STRUCTLEAK_BYREF in combination with KASAN_STACK
2019-07-28 12:33:15 -07:00
Linus Torvalds
8e61ea11c2 Kbuild fixes for v5.3
- add compile_commands.json to .gitignore
 
  - fix false-positive warning from gen_compile_commands.py after
    allnoconfig build
 
  - remove unused code
 -----BEGIN PGP SIGNATURE-----
 
 iQJSBAABCgA8FiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl09uOgeHHlhbWFkYS5t
 YXNhaGlyb0Bzb2Npb25leHQuY29tAAoJED2LAQed4NsGoEIP/3+yy7XoxR1aHL2g
 qDUBLVhe00rMqjqDTePxZOnCMRmlv2m00pCZzfokqrakbwnY1DyoDR9KP+RRDtwr
 0A3FLNsJXpEAHakGygkL7IMen9q5RqPWoer+AMulnl+GaX0yCeii5OYXZLgmhh/L
 YKTCYiOK7MvCYhSQcUEjgPpm2fQttnp+umynDm4XCqIEnY7pr0edCgK6de4mJc+7
 NEf6TkNsVCe1STCAp0lAuW+vNCjNhJ4ADlD9bV4mwdl1D5eYKaGSLHCEtp3s6QJe
 DXmRGS1NFV49fmKv9YFLdB4ap3B8gW2uQugkAQbeURs813Y5arOZbJ3E6htxh2Wq
 5ATCiEZMCxMQG8HzS0SmvvuWicEAvxseyTWVhVkV2Z1jo1BT8h7W8fifIsKrp0Oo
 fKo0H+f5B6hNv+MlzBIk4yHwEG2I2YygBujfektOK0/IPyQK3qQNRSlXSMdmzFRH
 2Z3ZbYOIXrHg5TohqgIVPtKuKbpGnyIge9Gqq4RqeVXWTgfOzge3B1a5ok3SqoQg
 GCGu6z6w2ch53QmzEOcT30SOQQpAQUNbim80/BYGsANE3myUafpEplqmlpEOVT1g
 yCRDkv26yvi5fuofpI96Yu9hQE/AidyLJZLT32Ow0wMAzpzV778jm3gmNKgyUzlW
 rqKgzYj6kZmpWMiLjTjnIafvdL6R
 =J7xL
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-fixes-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - add compile_commands.json to .gitignore

 - fix false-positive warning from gen_compile_commands.py after
   allnoconfig build

 - remove unused code

* tag 'kbuild-fixes-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: remove unused single-used-m
  gen_compile_commands: lower the entry count threshold
  .gitignore: Add compilation database file
  kbuild: remove unused objectify macro
2019-07-28 10:35:04 -07:00
Linus Torvalds
04ce931889 Char/Misc driver fixes for 5.3-rc2
Here are some small char and misc driver fixes for 5.3-rc2 to resolve
 some reported issues.
 
 Nothing major at all, some binder bugfixes for issues found, some new
 mei device ids, firmware building warning fixes, habanalabs fixes, a few
 other build fixes, and a MAINTAINERS update.
 
 All of these have been in linux-next with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXT2L1g8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynVaQCfbYmj0QWJhL5rVvrUOcdXppa3qxgAn1Pi5+hf
 V3OH/1OmAnY8E07HA1Y3
 =qVrq
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are some small char and misc driver fixes for 5.3-rc2 to resolve
  some reported issues.

  Nothing major at all, some binder bugfixes for issues found, some new
  mei device ids, firmware building warning fixes, habanalabs fixes, a
  few other build fixes, and a MAINTAINERS update.

  All of these have been in linux-next with no reported issues"

* tag 'char-misc-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  test_firmware: fix a memory leak bug
  hpet: Fix division by zero in hpet_time_div()
  eeprom: make older eeprom drivers select NVMEM_SYSFS
  vmw_balloon: Remove Julien from the maintainers list
  fpga-manager: altera-ps-spi: Fix build error
  mei: me: add mule creek canyon (EHL) device ids
  binder: prevent transactions to context manager from its own process.
  binder: Set end of SG buffer area properly.
  firmware: Fix missing inline
  firmware: fix build errors in paged buffer handling code
  habanalabs: don't reset device when getting VRHOT
  habanalabs: use %pad for printing a dma_addr_t
2019-07-28 10:26:10 -07:00
Linus Torvalds
572782b213 TTY fixes for 5.3-rc2
Here are 2 tty/vt patches for 5.3-rc2
   - delete the netx-serial driver as the arch has been removed, no need
     to keep the serial driver for it around either.
   - vt console_lock fix to resolve a reported noisy warning at runtime
 
 Both of these have been in linux-next with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXT2MbA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yk0iACeOnBnhE1OKqIVsG11K619UtLb67sAoL9h6Rgk
 3PWLkB02N1ApgevMEOs5
 =V3Xf
 -----END PGP SIGNATURE-----

Merge tag 'tty-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty fixes from Greg KH:
 "Here are two tty/vt fixes:

   - delete the netx-serial driver as the arch has been removed, no need
     to keep the serial driver for it around either.

   - vt console_lock fix to resolve a reported noisy warning at runtime

  Both of these have been in linux-next with no reported issues"

* tag 'tty-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  vt: Grab console_lock around con_is_bound in show_bind
  tty: serial: netx: Delete driver
2019-07-28 10:18:33 -07:00
Linus Torvalds
ad28fd1cb2 SPDX fixes for 5.3-rc2
Here are some small SPDX fixes for 5.3-rc2 for things that came in
 during the 5.3-rc1 merge window that we previously missed.
 
 Only 3 small patches here:
 	- 2 uapi patches to resolve some SPDX tags that were not correct
 	- fix an invalid SPDX tag in the iomap Makefile file
 
 All have been properly reviewed on the public mailing lists.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXT2N9w8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylY9wCeJIYfs/eNf3tsjLQXxUBMYAJNqnsAn2IaMiTt
 cv2mck7JZm5KyHpP3f5N
 =RSZa
 -----END PGP SIGNATURE-----

Merge tag 'spdx-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx

Pull SPDX fixes from Greg KH:
 "Here are some small SPDX fixes for 5.3-rc2 for things that came in
  during the 5.3-rc1 merge window that we previously missed.

  Only three small patches here:

   - two uapi patches to resolve some SPDX tags that were not correct

   - fix an invalid SPDX tag in the iomap Makefile file

  All have been properly reviewed on the public mailing lists"

* tag 'spdx-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
  iomap: fix Invalid License ID
  treewide: remove SPDX "WITH Linux-syscall-note" from kernel-space headers again
  treewide: add "WITH Linux-syscall-note" to SPDX tag of uapi headers
2019-07-28 10:00:06 -07:00
Linus Torvalds
29af915cab USB fixes for 5.3-rc2
Here are some small fixes for 5.3-rc2.  All of these resolve some
 reported issues, some more than others :)
 
 Included in here is:
 	- xhci fix for an annoying issue with odd devices
 	- reversion of some usb251xb patches that should not have been
 	  merged
 	- usb pci quirk additions and fixups
 	- usb storage fix
 	- usb host controller error test fix
 
 All of these have been in linux-next with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXT2NGQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymsIACgjIehIXyxGQXJb2PzAyt5Omx6ZmoAnjfCZI8p
 zzsg1mZSdDsie3gILfQc
 =nzbN
 -----END PGP SIGNATURE-----

Merge tag 'usb-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some small fixes for 5.3-rc2. All of these resolve some
  reported issues, some more than others :)

  Included in here is:

   - xhci fix for an annoying issue with odd devices

   - reversion of some usb251xb patches that should not have been merged

   - usb pci quirk additions and fixups

   - usb storage fix

   - usb host controller error test fix

  All of these have been in linux-next with no reported issues"

* tag 'usb-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  xhci: Fix crash if scatter gather is used with Immediate Data Transfer (IDT).
  usb: usb251xb: Reallow swap-dx-lanes to apply to the upstream port
  Revert "usb: usb251xb: Add US port lanes inversion property"
  Revert "usb: usb251xb: Add US lanes inversion dts-bindings"
  usb: wusbcore: fix unbalanced get/put cluster_id
  usb/hcd: Fix a NULL vs IS_ERR() bug in usb_hcd_setup_local_mem()
  usb-storage: Add a limitation for blk_queue_max_hw_sectors()
  usb: pci-quirks: Minor cleanup for AMD PLL quirk
  usb: pci-quirks: Correct AMD PLL quirk detection
2019-07-28 09:52:35 -07:00
Linus Torvalds
5bb575bcc6 ARM: SoC fixes
Here's the first batch of fixes for this release cycle.
 
 Main diffstat here is the re-deletion of netx. I messed up and most
 likely didn't remove the files from the index when I test-merged this
 and saw conflicts, and from there on out 'git rerere' remembered the
 mistake and I missed checking it. Here it's done again as expected.
 
 Besides that:
 
  - A defconfig refresh + enabling of new drivers for u8500
 
  - i.MX fixlets for i2c/SAI/pinmux
 
  - sleep.S build fix for Davinci
 
  - Broadcom devicetree build/warning fix
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCAAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAl082HkPHG9sb2ZAbGl4
 b20ubmV0AAoJEIwa5zzehBx32UcP/jYuVG1UxV95/bO1tCjX0egg4RqJYm9UzI0e
 tVcge5P27AL8QeP7GnApBRDq4jxZhyVvHwX2Gff3HA01bcnJDctLUChZaUna31un
 PXWcx7jtuSZa243GC2k75nLBPxJVQKs0LnTRlpcyf17TDW6xy7nEYZJl8TPoOGBm
 PvS9rxoVB9PkOxK+QqzdxL35OVurEmv0KQPwRbPR0dcDMydIgB9yWFSb3J15xtLh
 t/7tdnj8EPPJaocMzR67EMkjDwcaIeGkdLdLReyuDr0t1g40gP+prSHJmgWcC0Ct
 BWD8aaw9xbDpXgNd2wqC8YCiaC7iGXBwUqU2z88CsXpAsmoJirBrwTZWFRqN6uOl
 8xFvwDy4DEhncRq6reHGUI0yV8ZLqyH/rImJK0ougLYnO+IbpyM4ElOPY9tV1Bvz
 FsvsCDfffN4AOyrGYblGDDnZwKYOs4woMmWNvhn9t3jx1ss7PSefS9P4tIse+QPm
 d0ojLjwDtsTjtSDahi7KKjW7PwFbzpLn/MpVelQUW+R6Rt9vLpT2XBWb/WpuAwaG
 EQu1pHdjrIgcYUNqmanWbPi5rOVyHI+JZepr1wQwpy34R/xDkLfqqYfXAflRzCah
 QeDeNpWv1M/v/Pg8mPvVhuZq91Fs5HOe2ZebGKyhcE4CAIc73AeDxApU/PAq9nAu
 0yFPUMdD
 =IReR
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Olof Johansson:
 "Here's the first batch of fixes for this release cycle.

  Main diffstat here is the re-deletion of netx. I messed up and most
  likely didn't remove the files from the index when I test-merged this
  and saw conflicts, and from there on out 'git rerere' remembered the
  mistake and I missed checking it. Here it's done again as expected.

  Besides that:

   - A defconfig refresh + enabling of new drivers for u8500

   - i.MX fixlets for i2c/SAI/pinmux

   - sleep.S build fix for Davinci

   - Broadcom devicetree build/warning fix"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: defconfig: u8500: Add new drivers
  ARM: defconfig: u8500: Refresh defconfig
  ARM: dts: bcm: bcm47094: add missing #cells for mdio-bus-mux
  ARM: davinci: fix sleep.S build error on ARMv4
  arm64: dts: imx8mq: fix SAI compatible
  arm64: dts: imx8mm: Correct SAI3 RXC/TXFS pin's mux option #1
  ARM: dts: imx6ul: fix clock frequency property name of I2C buses
  ARM: Delete netx a second time
  ARM: dts: imx7ulp: Fix usb-phy unit address format
2019-07-28 09:38:55 -07:00
Linus Torvalds
a9815a4fa2 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "A set of x86 fixes and functional updates:

   - Prevent stale huge I/O TLB mappings on 32bit. A long standing bug
     which got exposed by KPTI support for 32bit

   - Prevent bogus access_ok() warnings in arch_stack_walk_user()

   - Add display quirks for Lenovo devices which have height and width
     swapped

   - Add the missing CR2 fixup for 32 bit async pagefaults. Fallout of
     the CR2 bug fix series.

   - Unbreak handling of force enabled HPET by moving the 'is HPET
     counting' check back to the original place.

   - A more accurate check for running on a hypervisor platform in the
     MDS mitigation code. Not perfect, but more accurate than the
     previous one.

   - Update a stale and confusing comment vs. IRQ stacks"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation/mds: Apply more accurate check on hypervisor platform
  x86/hpet: Undo the early counter is counting check
  x86/entry/32: Pass cr2 to do_async_page_fault()
  x86/irq/64: Update stale comment
  x86/sysfb_efi: Add quirks for some devices with swapped width and height
  x86/stacktrace: Prevent access_ok() warnings in arch_stack_walk_user()
  mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
  x86/mm: Sync also unmappings in vmalloc_sync_all()
  x86/mm: Check for pfn instead of page in vmalloc_sync_one()
2019-07-27 21:46:43 -07:00
Linus Torvalds
e24ce84e85 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
 "Two fixes for the fair scheduling class:

   - Prevent freeing memory which is accessible by concurrent readers

   - Make the RCU annotations for numa groups consistent"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Use RCU accessors consistently for ->numa_group
  sched/fair: Don't free p->numa_faults with concurrent readers
2019-07-27 21:22:33 -07:00
Linus Torvalds
750991f9af Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 "A pile of perf related fixes:

  Kernel:
   - Fix SLOTS PEBS event constraints for Icelake CPUs

   - Add the missing mask bit to allow counting hardware generated
     prefetches on L3 for Icelake CPUs

   - Make the test for hypervisor platforms more accurate (as far as
     possible)

   - Handle PMUs correctly which override event->cpu

   - Yet another missing fallthrough annotation

  Tools:
     perf.data:
        - Fix loading of compressed data split across adjacent records
        - Fix buffer size setting for processing CPU topology perf.data
          header.

     perf stat:
        - Fix segfault for event group in repeat mode
        - Always separate "stalled cycles per insn" line, it was being
          appended to the "instructions" line.

     perf script:
        - Fix --max-blocks man page description.
        - Improve man page description of metrics.
        - Fix off by one in brstackinsn IPC computation.

     perf probe:
        - Avoid calling freeing routine multiple times for same pointer.

     perf build:
        - Do not use -Wshadow on gcc < 4.8, avoiding too strict warnings
          treated as errors, breaking the build"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Mark expected switch fall-throughs
  perf/core: Fix creating kernel counters for PMUs that override event->cpu
  perf/x86: Apply more accurate check on hypervisor platform
  perf/x86/intel: Fix invalid Bit 13 for Icelake MSR_OFFCORE_RSP_x register
  perf/x86/intel: Fix SLOTS PEBS event constraint
  perf build: Do not use -Wshadow on gcc < 4.8
  perf probe: Avoid calling freeing routine multiple times for same pointer
  perf probe: Set pev->nargs to zero after freeing pev->args entries
  perf session: Fix loading of compressed data split across adjacent records
  perf stat: Always separate stalled cycles per insn
  perf stat: Fix segfault for event group in repeat mode
  perf tools: Fix proper buffer size for feature processing
  perf script: Fix off by one in brstackinsn IPC computation
  perf script: Improve man page description of metrics
  perf script: Fix --max-blocks man page description
2019-07-27 21:17:56 -07:00
Linus Torvalds
431f288ed7 Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
 "A set of locking fixes:

   - Address the fallout of the rwsem rework. Missing ACQUIREs and a
     sanity check to prevent a use-after-free

   - Add missing checks for unitialized mutexes when mutex debugging is
     enabled.

   - Remove the bogus code in the generic SMP variant of
     arch_futex_atomic_op_inuser()

   - Fixup the #ifdeffery in lockdep to prevent compile warnings"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/mutex: Test for initialized mutex
  locking/lockdep: Clean up #ifdef checks
  locking/lockdep: Hide unused 'class' variable
  locking/rwsem: Add ACQUIRE comments
  tty/ldsem, locking/rwsem: Add missing ACQUIRE to read_failed sleep loop
  lcoking/rwsem: Add missing ACQUIRE to read_slowpath sleep loop
  locking/rwsem: Add missing ACQUIRE to read_slowpath exit when queue is empty
  locking/rwsem: Don't call owner_on_cpu() on read-owner
  futex: Cleanup generic SMP variant of arch_futex_atomic_op_inuser()
2019-07-27 21:10:26 -07:00
Linus Torvalds
13fbe991b5 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fix from Thomas Gleixner:
 "A single robustness fix for objtool to handle unbalanced CLAC
  invocations under all circumstances"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Improve UACCESS coverage
2019-07-27 20:49:43 -07:00
Linus Torvalds
88c5083442 Wimplicit-fallthrough patches for 5.3-rc2
Hi Linus,
 
 Please, pull the following patches that mark switch cases where we are
 expecting to fall through. These patches are part of the ongoing efforts
 to enable -Wimplicit-fallthrough. Most of them have been baking in linux-next
 for a whole development cycle.
 
 Also, pull the Makefile patch that globally enables the
 -Wimplicit-fallthrough option.
 
 Finally, some missing-break fixes that have been tagged for -stable:
 
  - drm/amdkfd: Fix missing break in switch statement
  - drm/amdgpu/gfx10: Fix missing break in switch statement
 
 Notice that with these changes, we completely get rid of all the
 fall-through warnings in the kernel.
 
 Thanks
 
 Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEkmRahXBSurMIg1YvRwW0y0cG2zEFAl06XP0ACgkQRwW0y0cG
 2zHPXhAAsatJGNIg7vuSicVipJBIlwgRLwtbE2rV+MCneUAZj37O9NtBgbxHNoJ+
 5TBB8sLpNCw3rEvPKwm6vicRgntMclEY1vaplPOHt3RiY4lqVSIkpvFSCw1hGur3
 +34+O++n/6HbtY96T+DN5WGMXrU3JSe46xnBLIt0BOoUwyKMaNdntiyd79GrrnyB
 BwPkHrmB+9FEVq+dHmPnRIhc9WUfIdily3+j1CIncaM4eXKWLjoUlOIw1VcSEc4X
 SSdFh2co+3nm/6O54ZxUEC1PyuQZWedsgFmeiTrOanG3AEWQ/jX7GwXvPlgraa2E
 F5MzGUOQwXYL/IzXl1vGjQc4+FV4nhIjEBTDdZseOp7FP5xkHyyOwzGDEmaMXpXT
 XThb+k7Q6EbiSPLFcz8zkCwNB8ngNJMNOsGP3JPFD06MfquqdzP7T1BsHcNtiMVo
 FNwQg9KtvbK9F7a9lXDqfcxfuEScbqSvnKWWwNLImCqIBATYirJzeTeLvTbVWpA2
 iyXF3ylIclsSMOvCtaz41iuoqNh12eqw2UIFPBmkU2wpVkTt+ZIn0Apgom90xe1K
 tSUqMBbBMoHVtsBsILdkdz/m6FJpK+YmmwpRrJSDvPxA5c9caD7cj+62amW/gP4z
 Hi6mC4hw1H5bKMsC2PP/QgJcmS/pgo1zwA9J2a2NNJbYPATHfKI=
 =J2sO
 -----END PGP SIGNATURE-----

Merge tag 'Wimplicit-fallthrough-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux

Pull Wimplicit-fallthrough enablement from Gustavo A. R. Silva:
 "This marks switch cases where we are expecting to fall through, and
  globally enables the -Wimplicit-fallthrough option in the main
  Makefile.

  Finally, some missing-break fixes that have been tagged for -stable:

   - drm/amdkfd: Fix missing break in switch statement

   - drm/amdgpu/gfx10: Fix missing break in switch statement

  With these changes, we completely get rid of all the fall-through
  warnings in the kernel"

* tag 'Wimplicit-fallthrough-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
  Makefile: Globally enable fall-through warning
  drm/i915: Mark expected switch fall-throughs
  drm/amd/display: Mark expected switch fall-throughs
  drm/amdkfd/kfd_mqd_manager_v10: Avoid fall-through warning
  drm/amdgpu/gfx10: Fix missing break in switch statement
  drm/amdkfd: Fix missing break in switch statement
  perf/x86/intel: Mark expected switch fall-throughs
  mtd: onenand_base: Mark expected switch fall-through
  afs: fsclient: Mark expected switch fall-throughs
  afs: yfsclient: Mark expected switch fall-throughs
  can: mark expected switch fall-throughs
  firewire: mark expected switch fall-throughs
2019-07-27 11:04:18 -07:00
Linus Torvalds
43e317c1bb s390 updates for 5.3-rc2
- Add ABI to kernel image file which allows e.g. the file utility to figure
    out the kernel version.
 
  - Wire up clone3 system call.
 
  - Add support for kasan bitops instrumentation.
 
  - uapi header cleanup: use __u{16,32,64} instead of uint{16,32,64}_t.
 
  - Provide proper ARCH_ZONE_DMA_BITS so the s390 DMA zone is correctly defined
    with 2 GB instead of the default value of 1 MB.
 
  - Farhan Ali leaves the group of vfio-ccw maintainers.
 
  - Various small vfio-ccw fixes.
 
  - Add missing locking for airq_areas array in virtio code.
 
  - Minor qdio improvements.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJdPBMNAAoJECIOw3kbKW7CNp0QAKcvg0eCNUfP9KuTGZgie4Gt
 MeRESA4SpF2nS9Wp/cxnENRaMDYfUGdoFwXHJEuTTrUvBhBykaRJ/U0qMU5v25kD
 nesmHYpuhRYxzHp+Cm5nzzicNID/9pI+Yd3E/eefpUsMt90sLXZ/1vAaPmPQJ/9S
 AsA8jyE35qKdZ8zeBCF7zGy0jsy3A/afi52C6HIZlhkoDaTHBMChndUORIQ1XIK1
 Ce5ssmiqlbFPC0JJ1ZbnSQB/kyRunaE7fF6tdFDOztSqdc19JZ5xxStn4bSZo508
 54mqoS3tHR7+fDmu2CH/tIHtPkinPFvQmxbTQemSnz3xRVpf3YPlnxEHmuQcAOp5
 edDwdlx28QELRi9OGAPR60t0rtT2iWO32xPeh2hYoakDDHamnDmk1BWKn1FPbBJr
 IZpvlfzn0t2Mlf96YiEzdRqEYIjBgtw0yl/ozRIRyEsGC4eLhvHOmrqgchCk/tD+
 Gy+BeDYCaHj4Lu4hhZdAmUcFAbLJu1dXX1GXijmAMvQ7xFhUMHaj+BxWz7Ou5+cg
 zpcSHUbRVXUwbLVdDttJ6VGDo9KriYg54aPLOdUZ1rAA3cWBZRjA/iDqT7NIWWe+
 HV9IKZ55+lPE/xuVvbbwKn7JucX6yBCzdCRADDmC+VKzwVS78mUdIppaLaq+KY9G
 9Zxgd3oVatQ5VMC1yXOk
 =J+rB
 -----END PGP SIGNATURE-----

Merge tag 's390-5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Heiko Carstens:

 - Add ABI to kernel image file which allows e.g. the file utility to
   figure out the kernel version.

 - Wire up clone3 system call.

 - Add support for kasan bitops instrumentation.

 - uapi header cleanup: use __u{16,32,64} instead of uint{16,32,64}_t.

 - Provide proper ARCH_ZONE_DMA_BITS so the s390 DMA zone is correctly
   defined with 2 GB instead of the default value of 1 MB.

 - Farhan Ali leaves the group of vfio-ccw maintainers.

 - Various small vfio-ccw fixes.

 - Add missing locking for airq_areas array in virtio code.

 - Minor qdio improvements.

* tag 's390-5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  MAINTAINERS: vfio-ccw: Remove myself as the maintainer
  s390/mm: use shared variables for sysctl range check
  virtio/s390: fix race on airq_areas[]
  s390/dma: provide proper ARCH_ZONE_DMA_BITS value
  s390/kasan: add bitops instrumentation
  s390/bitops: make test functions return bool
  s390: wire up clone3 system call
  kbuild: enable arch/s390/include/uapi/asm/zcrypt.h for uapi header test
  s390: use __u{16,32,64} instead of uint{16,32,64}_t in uapi header
  s390/hypfs: fix a typo in the name of a function
  s390/qdio: restrict QAOB usage to IQD unicast queues
  s390/qdio: add sanity checks to the fast-requeue path
  s390: enable detection of kernel version from bzImage
  Documentation: fix vfio-ccw doc
  vfio-ccw: Update documentation for csch/hsch
  vfio-ccw: Don't call cp_free if we are processing a channel program
  vfio-ccw: Set pa_nr to 0 if memory allocation fails for pa_iova_pfn
  vfio-ccw: Fix memory leak and don't call cp_free in cp_init
  vfio-ccw: Fix misleading comment when setting orb.cmd.c64
2019-07-27 08:58:04 -07:00