Commit Graph

70301 Commits

Author SHA1 Message Date
Pavel Begunkov
216e583596 io_uring: fix misaccounting fix buf pinned pages
As Andres reports "... io_sqe_buffer_register() doesn't initialize imu.
io_buffer_account_pin() does imu->acct_pages++, before calling
io_account_mem(ctx, imu->acct_pages).", leading to evevntual -ENOMEM.

Initialise the field.

Reported-by: Andres Freund <andres@anarazel.de>
Fixes: 41edf1a5ec ("io_uring: keep table of pointers to ubufs")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/438a6f46739ae5e05d9c75a0c8fa235320ff367c.1622285901.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-29 19:27:21 -06:00
Marco Elver
b16ef427ad io_uring: fix data race to avoid potential NULL-deref
Commit ba5ef6dc8a ("io_uring: fortify tctx/io_wq cleanup") introduced
setting tctx->io_wq to NULL a bit earlier. This has caused KCSAN to
detect a data race between accesses to tctx->io_wq:

  write to 0xffff88811d8df330 of 8 bytes by task 3709 on cpu 1:
   io_uring_clean_tctx                  fs/io_uring.c:9042 [inline]
   __io_uring_cancel                    fs/io_uring.c:9136
   io_uring_files_cancel                include/linux/io_uring.h:16 [inline]
   do_exit                              kernel/exit.c:781
   do_group_exit                        kernel/exit.c:923
   get_signal                           kernel/signal.c:2835
   arch_do_signal_or_restart            arch/x86/kernel/signal.c:789
   handle_signal_work                   kernel/entry/common.c:147 [inline]
   exit_to_user_mode_loop               kernel/entry/common.c:171 [inline]
   ...
  read to 0xffff88811d8df330 of 8 bytes by task 6412 on cpu 0:
   io_uring_try_cancel_iowq             fs/io_uring.c:8911 [inline]
   io_uring_try_cancel_requests         fs/io_uring.c:8933
   io_ring_exit_work                    fs/io_uring.c:8736
   process_one_work                     kernel/workqueue.c:2276
   ...

With the config used, KCSAN only reports data races with value changes:
this implies that in the case here we also know that tctx->io_wq was
non-NULL. Therefore, depending on interleaving, we may end up with:

              [CPU 0]                 |        [CPU 1]
  io_uring_try_cancel_iowq()          | io_uring_clean_tctx()
    if (!tctx->io_wq) // false        |   ...
    ...                               |   tctx->io_wq = NULL
    io_wq_cancel_cb(tctx->io_wq, ...) |   ...
      -> NULL-deref                   |

Note: It is likely that thus far we've gotten lucky and the compiler
optimizes the double-read into a single read into a register -- but this
is never guaranteed, and can easily change with a different config!

Fix the data race by restoring the previous behaviour, where both
setting io_wq to NULL and put of the wq are _serialized_ after
concurrent io_uring_try_cancel_iowq() via acquisition of the uring_lock
and removal of the node in io_uring_del_task_file().

Fixes: ba5ef6dc8a ("io_uring: fortify tctx/io_wq cleanup")
Suggested-by: Pavel Begunkov <asml.silence@gmail.com>
Reported-by: syzbot+bf2b3d0435b9b728946c@syzkaller.appspotmail.com
Signed-off-by: Marco Elver <elver@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20210527092547.2656514-1-elver@google.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-27 07:44:49 -06:00
Zqiang
3743c1723b io-wq: Fix UAF when wakeup wqe in hash waitqueue
BUG: KASAN: use-after-free in __wake_up_common+0x637/0x650
Read of size 8 at addr ffff8880304250d8 by task iou-wrk-28796/28802

Call Trace:
 __dump_stack [inline]
 dump_stack+0x141/0x1d7
 print_address_description.constprop.0.cold+0x5b/0x2c6
 __kasan_report [inline]
 kasan_report.cold+0x7c/0xd8
 __wake_up_common+0x637/0x650
 __wake_up_common_lock+0xd0/0x130
 io_worker_handle_work+0x9dd/0x1790
 io_wqe_worker+0xb2a/0xd40
 ret_from_fork+0x1f/0x30

Allocated by task 28798:
 kzalloc_node [inline]
 io_wq_create+0x3c4/0xdd0
 io_init_wq_offload [inline]
 io_uring_alloc_task_context+0x1bf/0x6b0
 __io_uring_add_task_file+0x29a/0x3c0
 io_uring_add_task_file [inline]
 io_uring_install_fd [inline]
 io_uring_create [inline]
 io_uring_setup+0x209a/0x2bd0
 do_syscall_64+0x3a/0xb0
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Freed by task 28798:
 kfree+0x106/0x2c0
 io_wq_destroy+0x182/0x380
 io_wq_put [inline]
 io_wq_put_and_exit+0x7a/0xa0
 io_uring_clean_tctx [inline]
 __io_uring_cancel+0x428/0x530
 io_uring_files_cancel
 do_exit+0x299/0x2a60
 do_group_exit+0x125/0x310
 get_signal+0x47f/0x2150
 arch_do_signal_or_restart+0x2a8/0x1eb0
 handle_signal_work[inline]
 exit_to_user_mode_loop [inline]
 exit_to_user_mode_prepare+0x171/0x280
 __syscall_exit_to_user_mode_work [inline]
 syscall_exit_to_user_mode+0x19/0x60
 do_syscall_64+0x47/0xb0
 entry_SYSCALL_64_after_hwframe

There are the following scenarios, hash waitqueue is shared by
io-wq1 and io-wq2. (note: wqe is worker)

io-wq1:worker2     | locks bit1
io-wq2:worker1     | waits bit1
io-wq1:worker3     | waits bit1

io-wq1:worker2     | completes all wqe bit1 work items
io-wq1:worker2     | drop bit1, exit

io-wq2:worker1     | locks bit1
io-wq1:worker3     | can not locks bit1, waits bit1 and exit
io-wq1             | exit and free io-wq1
io-wq2:worker1     | drops bit1
io-wq1:worker3     | be waked up, even though wqe is freed

After all iou-wrk belonging to io-wq1 have exited, remove wqe
form hash waitqueue, it is guaranteed that there will be no more
wqe belonging to io-wq1 in the hash waitqueue.

Reported-by: syzbot+6cb11ade52aa17095297@syzkaller.appspotmail.com
Signed-off-by: Zqiang <qiang.zhang@windriver.com>
Link: https://lore.kernel.org/r/20210526050826.30500-1-qiang.zhang@windriver.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-26 09:03:56 -06:00
Pavel Begunkov
17a91051fe io_uring/io-wq: close io-wq full-stop gap
There is an old problem with io-wq cancellation where requests should be
killed and are in io-wq but are not discoverable, e.g. in @next_hashed
or @linked vars of io_worker_handle_work(). It adds some unreliability
to individual request canellation, but also may potentially get
__io_uring_cancel() stuck. For instance:

1) An __io_uring_cancel()'s cancellation round have not found any
   request but there are some as desribed.
2) __io_uring_cancel() goes to sleep
3) Then workers wake up and try to execute those hidden requests
   that happen to be unbound.

As we already cancel all requests of io-wq there, set IO_WQ_BIT_EXIT
in advance, so preventing 3) from executing unbound requests. The
workers will initially break looping because of getting a signal as they
are threads of the dying/exec()'ing user task.

Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/abfcf8c54cb9e8f7bfbad7e9a0cc5433cc70bdc2.1621781238.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-25 19:39:58 -06:00
Pavel Begunkov
ba5ef6dc8a io_uring: fortify tctx/io_wq cleanup
We don't want anyone poking into tctx->io_wq awhile it's being destroyed
by io_wq_put_and_exit(), and even though it shouldn't even happen, if
buggy would be preferable to get a NULL-deref instead of subtle delayed
failure or UAF.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/827b021de17926fd807610b3e53a5a5fa8530856.1621513214.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-20 07:29:11 -06:00
Pavel Begunkov
7a27472770 io_uring: don't modify req->poll for rw
__io_queue_proc() is used by both poll and apoll, so we should not
access req->poll directly but selecting right struct io_poll_iocb
depending on use case.

Reported-and-tested-by: syzbot+a84b8783366ecb1c65d0@syzkaller.appspotmail.com
Fixes: ea6a693d86 ("io_uring: disable multishot poll for double poll add cases")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/4a6a1de31142d8e0250fe2dfd4c8923d82a5bbfc.1621251795.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-17 07:28:48 -06:00
Pavel Begunkov
489809e2e2 io_uring: increase max number of reg buffers
Since recent changes instead of storing a large array of struct
io_mapped_ubuf, we store pointers to them, that is 4 times slimmer and
we should not to so worry about restricting max number of registererd
buffer slots, increase the limit 4 times.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d3dee1da37f46da416aa96a16bf9e5094e10584d.1620990371.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-14 06:06:34 -06:00
Pavel Begunkov
2d74d0421e io_uring: further remove sqpoll limits on opcodes
There are three types of requests that left disabled for sqpoll, namely
epoll ctx, statx, and resources update. Since SQPOLL task is now closely
mimics a userspace thread, remove the restrictions.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/909b52d70c45636d8d7897582474ea5aab5eed34.1620990306.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-14 06:06:23 -06:00
Pavel Begunkov
447c19f3b5 io_uring: fix ltout double free on completion race
Always remove linked timeout on io_link_timeout_fn() from the master
request link list, otherwise we may get use-after-free when first
io_link_timeout_fn() puts linked timeout in the fail path, and then
will be found and put on master's free.

Cc: stable@vger.kernel.org # 5.10+
Fixes: 90cd7e4249 ("io_uring: track link timeout's master explicitly")
Reported-and-tested-by: syzbot+5a864149dd970b546223@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/69c46bf6ce37fec4fdcd98f0882e18eb07ce693a.1620990121.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-14 06:06:15 -06:00
Pavel Begunkov
a298232ee6 io_uring: fix link timeout refs
WARNING: CPU: 0 PID: 10242 at lib/refcount.c:28 refcount_warn_saturate+0x15b/0x1a0 lib/refcount.c:28
RIP: 0010:refcount_warn_saturate+0x15b/0x1a0 lib/refcount.c:28
Call Trace:
 __refcount_sub_and_test include/linux/refcount.h:283 [inline]
 __refcount_dec_and_test include/linux/refcount.h:315 [inline]
 refcount_dec_and_test include/linux/refcount.h:333 [inline]
 io_put_req fs/io_uring.c:2140 [inline]
 io_queue_linked_timeout fs/io_uring.c:6300 [inline]
 __io_queue_sqe+0xbef/0xec0 fs/io_uring.c:6354
 io_submit_sqe fs/io_uring.c:6534 [inline]
 io_submit_sqes+0x2bbd/0x7c50 fs/io_uring.c:6660
 __do_sys_io_uring_enter fs/io_uring.c:9240 [inline]
 __se_sys_io_uring_enter+0x256/0x1d60 fs/io_uring.c:9182

io_link_timeout_fn() should put only one reference of the linked timeout
request, however in case of racing with the master request's completion
first io_req_complete() puts one and then io_put_req_deferred() is
called.

Cc: stable@vger.kernel.org # 5.12+
Fixes: 9ae1f8dd37 ("io_uring: fix inconsistent lock state")
Reported-by: syzbot+a2910119328ce8e7996f@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/ff51018ff29de5ffa76f09273ef48cb24c720368.1620417627.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-08 22:11:49 -06:00
Thadeu Lima de Souza Cascardo
d1f8280887 io_uring: truncate lengths larger than MAX_RW_COUNT on provide buffers
Read and write operations are capped to MAX_RW_COUNT. Some read ops rely on
that limit, and that is not guaranteed by the IORING_OP_PROVIDE_BUFFERS.

Truncate those lengths when doing io_add_buffers, so buffer addresses still
use the uncapped length.

Also, take the chance and change struct io_buffer len member to __u32, so
it matches struct io_provide_buffer len member.

This fixes CVE-2021-3491, also reported as ZDI-CAN-13546.

Fixes: ddf0322db7 ("io_uring: add IORING_OP_PROVIDE_BUFFERS")
Reported-by: Billy Jheng Bing-Jhong (@st424204)
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-05 15:17:35 -06:00
Zqiang
bb6659cc0a io_uring: Fix memory leak in io_sqe_buffers_register()
unreferenced object 0xffff8881123bf0a0 (size 32):
comm "syz-executor557", pid 8384, jiffies 4294946143 (age 12.360s)
backtrace:
[<ffffffff81469b71>] kmalloc_node include/linux/slab.h:579 [inline]
[<ffffffff81469b71>] kvmalloc_node+0x61/0xf0 mm/util.c:587
[<ffffffff815f0b3f>] kvmalloc include/linux/mm.h:795 [inline]
[<ffffffff815f0b3f>] kvmalloc_array include/linux/mm.h:813 [inline]
[<ffffffff815f0b3f>] kvcalloc include/linux/mm.h:818 [inline]
[<ffffffff815f0b3f>] io_rsrc_data_alloc+0x4f/0xc0 fs/io_uring.c:7164
[<ffffffff815f26d8>] io_sqe_buffers_register+0x98/0x3d0 fs/io_uring.c:8383
[<ffffffff815f84a7>] __io_uring_register+0xf67/0x18c0 fs/io_uring.c:9986
[<ffffffff81609222>] __do_sys_io_uring_register fs/io_uring.c:10091 [inline]
[<ffffffff81609222>] __se_sys_io_uring_register fs/io_uring.c:10071 [inline]
[<ffffffff81609222>] __x64_sys_io_uring_register+0x112/0x230 fs/io_uring.c:10071
[<ffffffff842f616a>] do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47
[<ffffffff84400068>] entry_SYSCALL_64_after_hwframe+0x44/0xae

Fix data->tags memory leak, through io_rsrc_data_free() to release
data memory space.

Reported-by: syzbot+0f32d05d8b6cd8d7ea3e@syzkaller.appspotmail.com
Signed-off-by: Zqiang <qiang.zhang@windriver.com>
Link: https://lore.kernel.org/r/20210430082515.13886-1-qiang.zhang@windriver.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-30 06:44:22 -06:00
Colin Ian King
cf3770e784 io_uring: Fix premature return from loop and memory leak
Currently the -EINVAL error return path is leaking memory allocated
to data. Fix this by not returning immediately but instead setting
the error return variable to -EINVAL and breaking out of the loop.

Kudos to Pavel Begunkov for suggesting a correct fix.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/20210429104602.62676-1-colin.king@canonical.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-29 13:26:19 -06:00
Pavel Begunkov
47b228ce6f io_uring: fix unchecked error in switch_start()
io_rsrc_node_switch_start() can fail, don't forget to check returned
error code.

Reported-by: syzbot+a4715dd4b7c866136f79@syzkaller.appspotmail.com
Fixes: eae071c9b4 ("io_uring: prepare fixed rw for dynanic buffers")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c4c06e2f3f0c8e43bd8d0a266c79055bcc6b6e60.1619693112.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-29 13:26:19 -06:00
Pavel Begunkov
6224843d56 io_uring: allow empty slots for reg buffers
Allow empty reg buffer slots any request using which should fail. This
allows users to not register all buffers in advance, but do it lazily
and/or on demand via updates. That is achieved by setting iov_base and
iov_len to zero for registration and/or buffer updates. Empty buffer
can't have a non-zero tag.

Implementation details: to not add extra overhead to io_import_fixed(),
create a dummy buffer crafted to fail any request using it, and set it
to all empty buffer slots.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/7e95e4d700082baaf010c648c72ac764c9cc8826.1619611868.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-29 13:26:19 -06:00
Pavel Begunkov
b0d658ec88 io_uring: add more build check for uapi
Add a couple of BUILD_BUG_ON() checking some rsrc uapi structs and SQE
flags.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/ff960df4d5026b9fb5bfd80994b9d3667d3926da.1619536280.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-29 13:26:18 -06:00
Pavel Begunkov
dddca22636 io_uring: dont overlap internal and user req flags
CQE flags take one byte that we store in req->flags together with other
REQ_F_* internal flags. CQE flags are copied directly into req and then
verified that requires some handling on failures, e.g. to make sure that
that copy doesn't set some of the internal flags.

Move all internal flags to take bits after the first byte, so we don't
need extra handling and make it safer overall.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b8b5b02d1ab9d786fcc7db4a3fe86db6b70b8987.1619536280.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-29 13:26:18 -06:00
Pavel Begunkov
2840f710f2 io_uring: fix drain with rsrc CQEs
Resource emitted CQEs are not bound to requests, so fix up counters used
for DRAIN/defer logic.

Fixes: b60c8dce33 ("io_uring: preparation for rsrc tagging")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/2b32f5f0a40d5928c3466d028f936e167f0654be.1619536280.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-29 13:26:18 -06:00
Linus Torvalds
3644286f6c \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmCJUfIACgkQnJ2qBz9k
 QNkStAf8CA7beya7LZ/GGN7HzXhv2cs+IpUFhRkynLklEM0lxKsOEagLFSZxkoMD
 IBSRSo4odkkderqI9W/yp+9OYhOd9+BQCq4isg1Gh9Tf5xANJEpLvBAPnWVhooJs
 9CrYZQY9Bdf+fF/8GHbKlrMAYm56vBCmWqyWTEtWUyPBOA12in2ZHQJmCa+5+nge
 zTT/B5cvuhN5K7uYhGM4YfeCU5DBmmvD4sV6YBTkQOgCU0bEF0f9R3JjHDo34a1s
 yqna3ypqKNRhsJVs8F+aOGRieUYxFoRqtYNHZK3qI9i07v7ndoTm5jzGN6OFlKs3
 U3rF9/+cBgeESahWG6IjHIqhXGXNhg==
 =KjNm
 -----END PGP SIGNATURE-----

Merge tag 'fsnotify_for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fsnotify updates from Jan Kara:

 - support for limited fanotify functionality for unpriviledged users

 - faster merging of fanotify events

 - a few smaller fsnotify improvements

* tag 'fsnotify_for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  shmem: allow reporting fanotify events with file handles on tmpfs
  fs: introduce a wrapper uuid_to_fsid()
  fanotify_user: use upper_32_bits() to verify mask
  fanotify: support limited functionality for unprivileged users
  fanotify: configurable limits via sysfs
  fanotify: limit number of event merge attempts
  fsnotify: use hash table for faster events merge
  fanotify: mix event info and pid into merge key hash
  fanotify: reduce event objectid to 29-bit hash
  fsnotify: allow fsnotify_{peek,remove}_first_event with empty queue
2021-04-29 11:06:13 -07:00
Linus Torvalds
767fcbc80f \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmCJU1UACgkQnJ2qBz9k
 QNk62AgAgp05OIXU/AgObb7DvSyI3ycwCV8PeWBpwD8yoDAh5x0tmT7vnJu974p6
 yHdnF7rr69ZzvbNCHLJ5kRykRlUao9W7cO5fdOW1uTpL7Ic60QuJMks/NfgVTHp1
 2zIQmBDerfn1/LTK8r2pPGcvtcjRcr7Ep4beN0Duw57lfVMJhjsNRPnBbXGBcp0r
 QzKk4/8V3DCZvOw+XNC3nto7avjvf+nU9sJmuh83546eqh0atjWivvO5aAlDOe6W
 rhBiLlmP0in5u2n1fYqzI1OQvtgtleyEZT2G0CrbAZn0xjmV/if9wl+3K6TOwDvR
 778xDEX7sZCaO/xkB+WK3hrd15ftKg==
 =0kYE
 -----END PGP SIGNATURE-----

Merge tag 'for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull quota, ext2, reiserfs updates from Jan Kara:

 - support for path (instead of device) based quotactl syscall
   (quotactl_path(2))

 - ext2 conversion to kmap_local()

 - other minor cleanups & fixes

* tag 'for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fs/reiserfs/journal.c: delete useless variables
  fs/ext2: Replace kmap() with kmap_local_page()
  ext2: Match up ext2_put_page() with ext2_dotdot() and ext2_find_entry()
  fs/ext2/: fix misspellings using codespell tool
  quota: report warning limits for realtime space quotas
  quota: wire up quotactl_path
  quota: Add mountpath based quota support
2021-04-29 10:51:29 -07:00
Linus Torvalds
d2b6f8a179 New code for 5.13:
- Various minor fixes in online scrub.
 - Prevent metadata files from being automatically inactivated.
 - Validate btree heights by the computed per-btree limits.
 - Don't warn about remounting with deprecated mount options.
 - Initialize attr forks at create time if we suspect we're going to need
   to store them.
 - Reduce memory reallocation workouts in the logging code.
 - Fix some theoretical math calculation errors in logged buffers that
   span multiple discontig memory ranges but contiguous ondisk regions.
 - Speedups in dirty buffer bitmap handling.
 - Make type verifier functions more inline-happy to reduce overhead.
 - Reduce debug overhead in directory checking code.
 - Many many typo fixes.
 - Begin to handle the permanent loss of the very end of a filesystem.
 - Fold struct xfs_icdinode into xfs_inode.
 - Deprecate the long defunct BMV_IF_NO_DMAPI_READ from the bmapx ioctl.
 - Remove a broken directory block format check from online scrub.
 - Fix a bug where we could produce an unnecessarily tall data fork btree
   when creating an attr fork.
 - Fix scrub and readonly remounts racing.
 - Fix a writeback ioend log deadlock problem by dropping the behavior
   where we could preallocate a setfilesize transaction.
 - Fix some bugs in the new extent count checking code.
 - Fix some bugs in the attr fork preallocation code.
 - Refactor if_flags out of the incore inode fork data structure.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAmB6MFUACgkQ+H93GTRK
 tOvigBAAlpzBUXnZVo+U18u0tSHnq5c1zbXYcf5GPhQv9w3n3TlPi3YhK2vgEXlI
 TULwsdU+an30oqWkQiVrwQjKPVaTWeWE3K0sA2MlYX9L2CwPPde4x5hwhyppfQxq
 mQyu0suWp480ao7vToXAgZ751OdZRtGu8sRQ7rVQ/FVf9K4R8EqpZMEynNry25f+
 hpK235hpf4IUC9E1A4pE2hNBSr/LGPIyu1t5sZsfazcNmtpKcauy5R5b8Pdnzo2/
 WFa6PoeE8SRIp4OxZY/c/4QUI5cRubJGyoB+kbl0hg69uYIJO+pc+R69yrQPD9Z+
 JDW/FktH+Zz4pstFsC+qnSvhRaF2DvXpvXrIldonQ2Z2ByVqbs3r6HzKySlWQ+QE
 jU717HApWl/ADI/kVD2IuQnrbU+Q8Ue8thzgQeEpTRWsea2HzPMofNi5FImU2ulw
 g4V7PleQWJ6AsLhcpfA46Y+CUAtjTD1Tvj67JpXuWJ+MFTB4hRm3U7zgCtV/0c3T
 wBBUybQjDoVA6DDr6CP/9ki1k0BO3wKJGlZMR0bkEsuxXdFNTvHEz5lmueYT/Wxc
 D91+oRbna9NpEeIVFGo6lhMIu2t0iYssFdgQKyn1jXrpGXKvOklP8zDjRdPnnQmz
 plT2ajlXPIjc6KjOTP2mbVqKs059LuJoYV7gIWwM7CgtFsMIrd8=
 =oRKe
 -----END PGP SIGNATURE-----

Merge tag 'xfs-5.13-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs updates from Darrick Wong:
 "The notable user-visible addition this cycle is ability to remove
  space from the last AG in a filesystem. This is the first of many
  changes needed for full-fledged support for shrinking a filesystem.
  Still needed are (a) the ability to reorganize files and metadata away
  from the end of the fs; (b) the ability to remove entire allocation
  groups; (c) shrink support for realtime volumes; and (d) thorough
  testing of (a-c).

  There are a number of performance improvements in this code drop: Dave
  streamlined various parts of the buffer logging code and reduced the
  cost of various debugging checks, and added the ability to pre-create
  the xattr structures while creating files. Brian eliminated
  transaction reservations that were being held across writeback (thus
  reducing livelock potential.

  Other random pieces: Pavel fixed the repetitve warnings about
  deprecated mount options, I fixed online fsck to behave itself when a
  readonly remount comes in during scrub, and refactored various other
  parts of that code, Christoph contributed a lot of refactoring this
  cycle. The xfs_icdinode structure has been absorbed into the (incore)
  xfs_inode structure, and the format and flags handling around
  xfs_inode_fork structures has been simplified. Chandan provided a
  number of fixes for extent count overflow related problems that have
  been shaken out by debugging knobs added during 5.12.

  Summary:

   - Various minor fixes in online scrub.

   - Prevent metadata files from being automatically inactivated.

   - Validate btree heights by the computed per-btree limits.

   - Don't warn about remounting with deprecated mount options.

   - Initialize attr forks at create time if we suspect we're going to
     need to store them.

   - Reduce memory reallocation workouts in the logging code.

   - Fix some theoretical math calculation errors in logged buffers that
     span multiple discontig memory ranges but contiguous ondisk
     regions.

   - Speedups in dirty buffer bitmap handling.

   - Make type verifier functions more inline-happy to reduce overhead.

   - Reduce debug overhead in directory checking code.

   - Many many typo fixes.

   - Begin to handle the permanent loss of the very end of a filesystem.

   - Fold struct xfs_icdinode into xfs_inode.

   - Deprecate the long defunct BMV_IF_NO_DMAPI_READ from the bmapx
     ioctl.

   - Remove a broken directory block format check from online scrub.

   - Fix a bug where we could produce an unnecessarily tall data fork
     btree when creating an attr fork.

   - Fix scrub and readonly remounts racing.

   - Fix a writeback ioend log deadlock problem by dropping the behavior
     where we could preallocate a setfilesize transaction.

   - Fix some bugs in the new extent count checking code.

   - Fix some bugs in the attr fork preallocation code.

   - Refactor if_flags out of the incore inode fork data structure"

* tag 'xfs-5.13-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (77 commits)
  xfs: remove xfs_quiesce_attr declaration
  xfs: remove XFS_IFEXTENTS
  xfs: remove XFS_IFINLINE
  xfs: remove XFS_IFBROOT
  xfs: only look at the fork format in xfs_idestroy_fork
  xfs: simplify xfs_attr_remove_args
  xfs: rename and simplify xfs_bmap_one_block
  xfs: move the XFS_IFEXTENTS check into xfs_iread_extents
  xfs: drop unnecessary setfilesize helper
  xfs: drop unused ioend private merge and setfilesize code
  xfs: open code ioend needs workqueue helper
  xfs: drop submit side trans alloc for append ioends
  xfs: fix return of uninitialized value in variable error
  xfs: get rid of the ip parameter to xchk_setup_*
  xfs: fix scrub and remount-ro protection when running scrub
  xfs: move the check for post-EOF mappings into xfs_can_free_eofblocks
  xfs: move the xfs_can_free_eofblocks call under the IOLOCK
  xfs: precalculate default inode attribute offset
  xfs: default attr fork size does not handle device inodes
  xfs: inode fork allocation depends on XFS_IFEXTENT flag
  ...
2021-04-29 10:43:51 -07:00
Linus Torvalds
f2c80837e2 Changes in gfs2:
- Fix some compiler and kernel-doc warnings.
 
 - Various minor cleanups and optimizations.
 
 - Add a new sysfs gfs2 status file with some filesystem wide
   information.
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmCKiWIUHGFncnVlbmJh
 QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTqXOA//cgEMi+WZ0pQ1m4Z7Yk58ArAGXOW4
 L+efdMjk2zoqgixF502tQzaa2ctz6XpukF4oZbM+Jc+yZxrbZ5CLjUIOWc9RH+Id
 WQwj5+5GLbMAPnn5ksHUCTK9V+1oAlpgoY4fMtLdKq234Y6xqWj5qBjvtGUTLFAl
 ACvy8FUZplFOkaOSBqgh221LT4Oh0Wthe/Elq5qvqwBfdAiE/p1sHSi2FWxktIlU
 wV3PKL96rFsnWN8E6jqyJR1RNJ5d5MYA+PDkTHKcoqcXZrzw4mfu2tCh88Bh9wFb
 MEyjtLxE09G1+3Li/T/Tb7qbRKWvxEmkLZXaFAjRUp7zYPvM6twKSg8nihcBDtLi
 UgvTrc208CYvYj7QpRQ1dU9lEg47A46rB8dgLz+ymlpNNk/G0gqgvWLevMKnBfaX
 AkZviI6qm1iNCBd6wWWPUKqR0qrCWqoe9N8F7cWyZBki7dKkoj29Gt1X1SeIQMjd
 n8Mkv6Btd39kBt3DydXlCEaREMQYeDrxBJHxur234hEfFLFraFj5tYFjoeSODZdg
 Uxsn5X5dgLy/hHjps8YcuBoEgRMR/aKovK5G1FXDcQR5O6UByqtJuJTJBT8jxAld
 vLYHqO6vxdgGATaYAkuGLSLrJjwES+7tEXjtrdarswGo55dPitwtOB1NRWuOor/Z
 uTnzJbykMdIzEHs=
 =VblJ
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 updates from Andreas Gruenbacher:

 - Fix some compiler and kernel-doc warnings

 - Various minor cleanups and optimizations

 - Add a new sysfs gfs2 status file with some filesystem wide
   information

* tag 'gfs2-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: Fix fall-through warnings for Clang
  gfs2: Fix a number of kernel-doc warnings
  gfs2: Make gfs2_setattr_simple static
  gfs2: Add new sysfs file for gfs2 status
  gfs2: Silence possible null pointer dereference warning
  gfs2: Turn gfs2_meta_indirect_buffer into gfs2_meta_buffer
  gfs2: Replace gfs2_lblk_to_dblk with gfs2_get_extent
  gfs2: Turn gfs2_extent_map into gfs2_{get,alloc}_extent
  gfs2: Add new gfs2_iomap_get helper
  gfs2: Remove unused variable sb_format
  gfs2: Fix dir.c function parameter descriptions
  gfs2: Eliminate gh parameter from go_xmote_bh func
  gfs2: don't create empty buffers for NO_CREATE
2021-04-29 10:33:35 -07:00
Linus Torvalds
8ae8932c6a Description for this pull request:
- Improve write performance with dirsync mount option.
  - Improve lookup performance.
  - Add support for FITRIM ioctl.
  - Fix a bug with discard option.
 -----BEGIN PGP SIGNATURE-----
 
 iQJMBAABCgA2FiEE6NzKS6Uv/XAAGHgyZwv7A1FEIQgFAmCIADwYHG5hbWphZS5q
 ZW9uQHNhbXN1bmcuY29tAAoJEGcL+wNRRCEIalwP/jepL2Y7B8FOhMH1R4whbY6x
 5fFlG608tR7jxUEMpAfMTcFC2+5nKpHN1keoHV8SXdJz2esd1ZUknaok5CIVQFYX
 I7TTOWn/genJ6nVmHeTLZLAxAWznSlIqRdeuSnDMLMKSLDn7noW5l+vzgYy5dnaF
 SzICtNXJqA7D7PcMIwTtfszcSEwmt1aLiRZZabZ+y0xSF2ha0mRB6B3hCx9sDXh5
 iVVbSn10lk6ULENYcjaUZhF1Dt9Lv4pOZ9drr8bVfRmWBeWspB/X4/TgO+Mf/xP8
 5el7OL965ofUaaCmhyfDMrWCBffeFbf0K8sUM/psiAUuKTugKvlZniZ0RB8zLEvZ
 Bn2yvr+walpdkVNBbv3qi32/4xAx5ng90rSdqX3bvY3Zq1UDW49MIIZOh6uQCNnD
 9ravgGoU2ujVcUQUKEU01dj1ora83IKQ7WuxcGXQOt/xNFMRG1FmiKgS7aj2HgMM
 ax9pjZZY4H8ZUZOC2hpph+x6Kc+ZQDqzR6+0qbWM2/hlLgIURDXTqVe3m1DCuvDI
 c2rvtpVtpYGz+HTZW+krUzc4kOAfUsS98D+7bLS5X/f4x4e5FxQDGyXaCNUI0OCr
 ed8kHmHQsKvbIui4RHgZvZylj2xO34arSuMURL41ifuT9jFoQVxPcGluXNvbdcu8
 yOyqeaqwgh5AJzXPVsRv
 =igtQ
 -----END PGP SIGNATURE-----

Merge tag 'exfat-for-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat

Pull exfat updates from Namjae Jeon:

 - Improve write performance with dirsync mount option

 - Improve lookup performance

 - Add support for FITRIM ioctl

 - Fix a bug with discard option

* tag 'exfat-for-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: speed up iterate/lookup by fixing start point of traversing cluster chain
  exfat: improve write performance when dirsync enabled
  exfat: add support ioctl and FITRIM function
  exfat: introduce bitmap_lock for cluster bitmap access
  exfat: fix erroneous discard when clear cluster bit
2021-04-29 10:32:18 -07:00
Linus Torvalds
625434dafd for-5.13/io_uring-2021-04-27
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmCIRBUQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpjt5D/9de6zCaha6CyfIIPiU+crropQ2jPzO49cb
 WzcOCmdhSv0GtYlhdnIqCOo5p8mRDWJAEBU9upTDTCWOx9hwr5Ms0TCNQHxuQ/T0
 4Ll+/cMsOxeTypiykfMtOG9TEmYSria2vTJKLgpyaP4ohfJa3uT7r2NZ8NK/8T4t
 wwbJ+jCSKewelI1l0XD8k8LBU39FS/KRgLTdfYj/rCW3PWt/ZE2eSIYjZQvMCVOC
 3fIdgOOJAMQVQafz+YAeJd2E+/l5/8YcJVKpJMVtBNbqTHIjA4EsInZauy8TpBgW
 OzJ3I+XdF70qZM119tI/nXw3sb0e+UV0fRsIXLkOwTEBzowernrAtsEwAOP+qFKS
 2YnqSKOSjMO5d5Mpkz6T0MDMloU45jph88lUH0RoShVxGa7jv+TMOL6QU1oOyxc1
 +gPPbApQs9WtSZDHsTJ0xFLpol804UDQmwb38mHdzedDVSE7iip1jANkw6LEhKkJ
 Mlg60ZF1Z305G+cDhrbs02ZGVa+fzbrtXtLlTqZw8bNX9lBp0JLtDpzskjbnUmck
 6A04nfg+Eto5GvAn+FRBuOCPridLEk2K6ygko/gwQWsYCgqkCgRuqjlIQCSZy5iu
 jHEFixIXKn6eACf+YzLVxSLyEQrmFyDSypbN7LvzoKJYo/loy8Q1+42nGlrVC3zi
 +CB1NokPng==
 =ZJ8L
 -----END PGP SIGNATURE-----

Merge tag 'for-5.13/io_uring-2021-04-27' of git://git.kernel.dk/linux-block

Pull io_uring updates from Jens Axboe:

 - Support for multi-shot mode for POLL requests

 - More efficient reference counting. This is shamelessly stolen from
   the mm side. Even though referencing is mostly single/dual user, the
   128 count was retained to keep the code the same. Maybe this
   should/could be made generic at some point.

 - Removal of the need to have a manager thread for each ring. The
   manager threads only job was checking and creating new io-threads as
   needed, instead we handle this from the queue path.

 - Allow SQPOLL without CAP_SYS_ADMIN or CAP_SYS_NICE. Since 5.12, this
   thread is "just" a regular application thread, so no need to restrict
   use of it anymore.

 - Cleanup of how internal async poll data lifetime is managed.

 - Fix for syzbot reported crash on SQPOLL cancelation.

 - Make buffer registration more like file registrations, which includes
   flexibility in avoiding full set unregistration and re-registration.

 - Fix for io-wq affinity setting.

 - Be a bit more defensive in task->pf_io_worker setup.

 - Various SQPOLL fixes.

 - Cleanup of SQPOLL creds handling.

 - Improvements to in-flight request tracking.

 - File registration cleanups.

 - Tons of cleanups and little fixes

* tag 'for-5.13/io_uring-2021-04-27' of git://git.kernel.dk/linux-block: (156 commits)
  io_uring: maintain drain logic for multishot poll requests
  io_uring: Check current->io_uring in io_uring_cancel_sqpoll
  io_uring: fix NULL reg-buffer
  io_uring: simplify SQPOLL cancellations
  io_uring: fix work_exit sqpoll cancellations
  io_uring: Fix uninitialized variable up.resv
  io_uring: fix invalid error check after malloc
  io_uring: io_sq_thread() no longer needs to reset current->pf_io_worker
  kernel: always initialize task->pf_io_worker to NULL
  io_uring: update sq_thread_idle after ctx deleted
  io_uring: add full-fledged dynamic buffers support
  io_uring: implement fixed buffers registration similar to fixed files
  io_uring: prepare fixed rw for dynanic buffers
  io_uring: keep table of pointers to ubufs
  io_uring: add generic rsrc update with tags
  io_uring: add IORING_REGISTER_RSRC
  io_uring: enumerate dynamic resources
  io_uring: add generic path for rsrc update
  io_uring: preparation for rsrc tagging
  io_uring: decouple CQE filling from requests
  ...
2021-04-28 14:56:09 -07:00
Linus Torvalds
fc05860628 for-5.13/drivers-2021-04-27
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmCIJYcQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpieWD/92qbtWl/z+9oCY212xV+YMoMqj/vGROX+U
 9i/FQJ3AIC/AUoNjZeW3NIbiaNqde5mrLlUSCHgn6RLsHK7p0GQJ4ohpbIGFG5+i
 2+Efm+vjlCxLVGrkeZEwMtsht7w/NbOYDr1Rgv9b4lQ6iWI11Mg8E337Whl1me1k
 h6bEXaioK9yqxYtsLgcn9I1qQ2p7gok0HX7zFU/XxEUZylqH6E4vQhj2+NL8UUqE
 7siFHADZE99Z7LXtOkl8YyOlGU52RCUzqDHWydvkipKjgYBi95HLXGT64Z+WCEvz
 HI54oVDRWr+uWdqDFfy+ncHm8pNeP0GV9JPhDz4ELRTSndoxB2il7wRLvp6wxV9d
 8Y4j7vb30i+8GGbM0c79dnlG76D9r5ivbTKixcXFKB128NusQR6JymIv1pKlSKhk
 H871/iOarrepAAUwVR5CtldDDJCy/q1Hks+7UXbaM3F9iNitxsJNZryQq9xdTu/N
 ThFOTz+VECG4RJLxIwmsWGiLgwr52/ybAl2MBcn+s7uC4jM/TFKpdQBfQnOAiINb
 MLlfuYRRSMg1Osb2fYZneR2ifmSNOMRdDJb+tsZGz4xWmZcj0uL4QgqcsOvuiOEQ
 veF/Ky50qw57hWtiEhvqa7/WIxzNF3G3wejqqA8hpT9Qifu0QawYTnXGUttYNBB1
 mO9R3/ccaw==
 =c0x4
 -----END PGP SIGNATURE-----

Merge tag 'for-5.13/drivers-2021-04-27' of git://git.kernel.dk/linux-block

Pull block driver updates from Jens Axboe:

 - MD changes via Song:
        - raid5 POWER fix
        - raid1 failure fix
        - UAF fix for md cluster
        - mddev_find_or_alloc() clean up
        - Fix NULL pointer deref with external bitmap
        - Performance improvement for raid10 discard requests
        - Fix missing information of /proc/mdstat

 - rsxx const qualifier removal (Arnd)

 - Expose allocated brd pages (Calvin)

 - rnbd via Gioh Kim:
        - Change maintainer
        - Change domain address of maintainers' email
        - Add polling IO mode and document update
        - Fix memory leak and some bug detected by static code analysis
          tools
        - Code refactoring

 - Series of floppy cleanups/fixes (Denis)

 - s390 dasd fixes (Julian)

 - kerneldoc fixes (Lee)

 - null_blk double free (Lv)

 - null_blk virtual boundary addition (Max)

 - Remove xsysace driver (Michal)

 - umem driver removal (Davidlohr)

 - ataflop fixes (Dan)

 - Revalidate disk removal (Christoph)

 - Bounce buffer cleanups (Christoph)

 - Mark lightnvm as deprecated (Christoph)

 - mtip32xx init cleanups (Shixin)

 - Various fixes (Tian, Gustavo, Coly, Yang, Zhang, Zhiqiang)

* tag 'for-5.13/drivers-2021-04-27' of git://git.kernel.dk/linux-block: (143 commits)
  async_xor: increase src_offs when dropping destination page
  drivers/block/null_blk/main: Fix a double free in null_init.
  md/raid1: properly indicate failure when ending a failed write request
  md-cluster: fix use-after-free issue when removing rdev
  nvme: introduce generic per-namespace chardev
  nvme: cleanup nvme_configure_apst
  nvme: do not try to reconfigure APST when the controller is not live
  nvme: add 'kato' sysfs attribute
  nvme: sanitize KATO setting
  nvmet: avoid queuing keep-alive timer if it is disabled
  brd: expose number of allocated pages in debugfs
  ataflop: fix off by one in ataflop_probe()
  ataflop: potential out of bounds in do_format()
  drbd: Fix fall-through warnings for Clang
  block/rnbd: Use strscpy instead of strlcpy
  block/rnbd-clt-sysfs: Remove copy buffer overlap in rnbd_clt_get_path_name
  block/rnbd-clt: Remove max_segment_size
  block/rnbd-clt: Generate kobject_uevent when the rnbd device state changes
  block/rnbd-srv: Remove unused arguments of rnbd_srv_rdma_ev
  Documentation/ABI/rnbd-clt: Add description for nr_poll_queues
  ...
2021-04-28 14:39:37 -07:00
Linus Torvalds
6c00292113 for-5.13/block-2021-04-27
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmCIJW0QHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpr8sD/4qP+MsFTB1IFUu8fW7BjBPdduoK8Vq9o3S
 HB8iF/yhJZ73nLecMMdn/jTO8SCW0Iw+okywW3BugGnNPbwXo0UQ4jLhzbTts76P
 JvZaguZFhBsF3ceFOt3CRCQDOeoDfMp3sitLUVivkN+2vwMs9vJpVNaEeUjcCC1Z
 8QjlpqYSMuakTwEn7QhlnKxVWn1V2B6PDjZMcf48ONRZGsCkoOXH1SE4Ge8nxjqa
 KHKO5bvwgRzGhKpvdHEIl8dmFL9WEWElBVoY3vE2EHL0SPE32zHlxtYLS0NAhY2M
 aprkJ0QP0Rgl8HpYiCstwAnJGKDg4a0ArWhf/CJTuLAWmTNFR7v5n7vw2SilJHTG
 0FtiFiOnpvvBmUC0B1PUEQX8AiFcdXueLb6xboExcp2WtxIAe8wPoGFl6T1tobBY
 qsfWggGs/vD1RVrJISPC+20cJemcRyeakMV48w+n3Lt/ES3IEv/LXx6PO/PbXvOo
 B7HJXTofkoaX52A/1+NxraGapwzhYouhi6Sb6Fc++X59/a/oBuOUGuur0eZ+/oWA
 9787mUUDmW/sahfZUgZh5AxqKo2jJULjeggANCICW9/RN6duV8TBQVOLW1/0Wddp
 9lndiA9ZMveWF+J19+sjBoiYMYawLmURaOlDK77ctTCcR/ji3l4GZ+2KvBEMeIT8
 O1OYEnwaIQ==
 =oza6
 -----END PGP SIGNATURE-----

Merge tag 'for-5.13/block-2021-04-27' of git://git.kernel.dk/linux-block

Pull block updates from Jens Axboe:
 "Pretty quiet round this time, which is nice. In detail:

   - Series revamping bounce buffer support (Christoph)

   - Dead code removal (Christoph, Bart)

   - Partition iteration revamp, now using xarray (Christoph)

   - Passthrough request scheduler improvements (Lin)

   - Series of BFQ improvements (Paolo)

   - Fix ioprio task iteration (Peter)

   - Various little tweaks and fixes (Tejun, Saravanan, Bhaskar, Max,
     Nikolay)"

* tag 'for-5.13/block-2021-04-27' of git://git.kernel.dk/linux-block: (41 commits)
  blk-iocost: don't ignore vrate_min on QD contention
  blk-mq: Fix spurious debugfs directory creation during initialization
  bfq/mq-deadline: remove redundant check for passthrough request
  blk-mq: bypass IO scheduler's limit_depth for passthrough request
  block: Remove an obsolete comment from sg_io()
  block: move bio_list_copy_data to pktcdvd
  block: remove zero_fill_bio_iter
  block: add queue_to_disk() to get gendisk from request_queue
  block: remove an incorrect check from blk_rq_append_bio
  block: initialize ret in bdev_disk_changed
  block: Fix sys_ioprio_set(.which=IOPRIO_WHO_PGRP) task iteration
  block: remove disk_part_iter
  block: simplify diskstats_show
  block: simplify show_partition
  block: simplify printk_all_partitions
  block: simplify partition_overlaps
  block: simplify partition removal
  block: take bd_mutex around delete_partitions in del_gendisk
  block: refactor blk_drop_partitions
  block: move more syncing and invalidation to delete_partition
  ...
2021-04-28 14:27:12 -07:00
Linus Torvalds
16b3d0cf5b Scheduler updates for this cycle are:
- Clean up SCHED_DEBUG: move the decades old mess of sysctl, procfs and debugfs interfaces
    to a unified debugfs interface.
 
  - Signals: Allow caching one sigqueue object per task, to improve performance & latencies.
 
  - Improve newidle_balance() irq-off latencies on systems with a large number of CPU cgroups.
 
  - Improve energy-aware scheduling
 
  - Improve the PELT metrics for certain workloads
 
  - Reintroduce select_idle_smt() to improve load-balancing locality - but without the previous
    regressions
 
  - Add 'scheduler latency debugging': warn after long periods of pending need_resched. This
    is an opt-in feature that requires the enabling of the LATENCY_WARN scheduler feature,
    or the use of the resched_latency_warn_ms=xx boot parameter.
 
  - CPU hotplug fixes for HP-rollback, and for the 'fail' interface. Fix remaining
    balance_push() vs. hotplug holes/races
 
  - PSI fixes, plus allow /proc/pressure/ files to be written by CAP_SYS_RESOURCE tasks as well
 
  - Fix/improve various load-balancing corner cases vs. capacity margins
 
  - Fix sched topology on systems with NUMA diameter of 3 or above
 
  - Fix PF_KTHREAD vs to_kthread() race
 
  - Minor rseq optimizations
 
  - Misc cleanups, optimizations, fixes and smaller updates
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmCJInsRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1i5XxAArh0b+fwXlkVGzTUly7HQjhU7lFbChnmF
 h6ToyNLi6pXoZ14VC/WoRIME+RzK3gmw9cEFaSLVPxbkbekTcyWS78kqmcg1/j2v
 kO/20QhXobiIxVskYfoMmqSavZ5mKhMWBqtFXkCuYfxwGylas0VVdh3AZLJ7N21G
 WEoFh99pVULwWnPHxM2ZQ87Ex9BkGKbsBTswxWpprCfXLqD0N2hHlABpwJP78zRf
 VniWFOcC7lslILCFawb7CqGgAwbgV85nDRS4QCuCKisrkFywvjJrEeu/W+h1NfhF
 d6ves/osNdEAM1DSALoxwEA42An8l8xh8NyJnl8JZV00LW0DM108O5/7pf5Zcryc
 RHV3RxA7skgezBh5uThvo60QzNK+kVMatI4qpQEHxLE52CaDl/fBu1Cgb/VUxnIl
 AEBfyiFbk+skHpuMFKtl30Tx3M+yJKMTzFPd4kYjHYGEDwtAcXcB3dJQW48A79i3
 H3IWcDcXpk5Rjo2UZmaXdt/qlj7mP6U0xdOUq8ZK6JOC4uY9skszVGsfuNN9QQ5u
 2E2YKKVrGFoQydl4C8R6A7axL2VzIJszHFZNipd8E3YOyW7PWRAkr02tOOkBTj8N
 dLMcNM7aPJWqEYiEIjEzGQN20pweJ1dRA29LDuOswKh+7W2bWTQFh6F2Q8Haansc
 RVg5PDzl+Mc=
 =E7mz
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:

 - Clean up SCHED_DEBUG: move the decades old mess of sysctl, procfs and
   debugfs interfaces to a unified debugfs interface.

 - Signals: Allow caching one sigqueue object per task, to improve
   performance & latencies.

 - Improve newidle_balance() irq-off latencies on systems with a large
   number of CPU cgroups.

 - Improve energy-aware scheduling

 - Improve the PELT metrics for certain workloads

 - Reintroduce select_idle_smt() to improve load-balancing locality -
   but without the previous regressions

 - Add 'scheduler latency debugging': warn after long periods of pending
   need_resched. This is an opt-in feature that requires the enabling of
   the LATENCY_WARN scheduler feature, or the use of the
   resched_latency_warn_ms=xx boot parameter.

 - CPU hotplug fixes for HP-rollback, and for the 'fail' interface. Fix
   remaining balance_push() vs. hotplug holes/races

 - PSI fixes, plus allow /proc/pressure/ files to be written by
   CAP_SYS_RESOURCE tasks as well

 - Fix/improve various load-balancing corner cases vs. capacity margins

 - Fix sched topology on systems with NUMA diameter of 3 or above

 - Fix PF_KTHREAD vs to_kthread() race

 - Minor rseq optimizations

 - Misc cleanups, optimizations, fixes and smaller updates

* tag 'sched-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (61 commits)
  cpumask/hotplug: Fix cpu_dying() state tracking
  kthread: Fix PF_KTHREAD vs to_kthread() race
  sched/debug: Fix cgroup_path[] serialization
  sched,psi: Handle potential task count underflow bugs more gracefully
  sched: Warn on long periods of pending need_resched
  sched/fair: Move update_nohz_stats() to the CONFIG_NO_HZ_COMMON block to simplify the code & fix an unused function warning
  sched/debug: Rename the sched_debug parameter to sched_verbose
  sched,fair: Alternative sched_slice()
  sched: Move /proc/sched_debug to debugfs
  sched,debug: Convert sysctl sched_domains to debugfs
  debugfs: Implement debugfs_create_str()
  sched,preempt: Move preempt_dynamic to debug.c
  sched: Move SCHED_DEBUG sysctl to debugfs
  sched: Don't make LATENCYTOP select SCHED_DEBUG
  sched: Remove sched_schedstats sysctl out from under SCHED_DEBUG
  sched/numa: Allow runtime enabling/disabling of NUMA balance without SCHED_DEBUG
  sched: Use cpu_dying() to fix balance_push vs hotplug-rollback
  cpumask: Introduce DYING mask
  cpumask: Make cpu_{online,possible,present,active}() inline
  rseq: Optimise rseq_get_rseq_cs() and clear_rseq_cs()
  ...
2021-04-28 13:33:57 -07:00
Linus Torvalds
42dec9a936 Perf events changes in this cycle were:
- Improve Intel uncore PMU support:
 
      - Parse uncore 'discovery tables' - a new hardware capability enumeration method
        introduced on the latest Intel platforms. This table is in a well-defined PCI
        namespace location and is read via MMIO. It is organized in an rbtree.
 
        These uncore tables will allow the discovery of standard counter blocks, but
        fancier counters still need to be enumerated explicitly.
 
      - Add Alder Lake support
 
      - Improve IIO stacks to PMON mapping support on Skylake servers
 
  - Add Intel Alder Lake PMU support - which requires the introduction of 'hybrid' CPUs
    and PMUs. Alder Lake is a mix of Golden Cove ('big') and Gracemont ('small' - Atom derived)
    cores.
 
    The CPU-side feature set is entirely symmetrical - but on the PMU side there's
    core type dependent PMU functionality.
 
  - Reduce data loss with CPU level hardware tracing on Intel PT / AUX profiling, by
    fixing the AUX allocation watermark logic.
 
  - Improve ring buffer allocation on NUMA systems
 
  - Put 'struct perf_event' into their separate kmem_cache pool
 
  - Add support for synchronous signals for select perf events. The immediate motivation
    is to support low-overhead sampling-based race detection for user-space code. The
    feature consists of the following main changes:
 
     - Add thread-only event inheritance via perf_event_attr::inherit_thread, which limits
       inheritance of events to CLONE_THREAD.
 
     - Add the ability for events to not leak through exec(), via perf_event_attr::remove_on_exec.
 
     - Allow the generation of SIGTRAP via perf_event_attr::sigtrap, extend siginfo with an u64
       ::si_perf, and add the breakpoint information to ::si_addr and ::si_perf if the event is
       PERF_TYPE_BREAKPOINT.
 
    The siginfo support is adequate for breakpoints right now - but the new field can be used
    to introduce support for other types of metadata passed over siginfo as well.
 
  - Misc fixes, cleanups and smaller updates.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmCJGpERHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1j9zBAAuVbG2snV6SBSdXLhQcM66N3NckOXvSY5
 QjjhQcuwJQEK/NJB3266K5d8qSmdyRBsWf3GCsrmyBT67P1V28K44Pu7oCV0UDtf
 mpVRjEP0oR7hNsANSSgo8Fa4ZD7H5waX7dK7925Tvw8By3mMoZoddiD/84WJHhxO
 NDF+GRFaRj+/dpbhV8cdCoXTjYdkC36vYuZs3b9lu0tS9D/AJgsNy7TinLvO02Cs
 5peP+2y29dgvCXiGBiuJtEA6JyGnX3nUJCvfOZZ/DWDc3fdduARlRrc5Aiq4n/wY
 UdSkw1VTZBlZ1wMSdmHQVeC5RIH3uWUtRoNqy0Yc90lBm55AQ0EENwIfWDUDC5zy
 USdBqWTNWKMBxlEilUIyqKPQK8LW/31TRzqy8BWKPNcZt5yP5YS1SjAJRDDjSwL/
 I+OBw1vjLJamYh8oNiD5b+VLqNQba81jFASfv+HVWcULumnY6ImECCpkg289Fkpi
 BVR065boifJDlyENXFbvTxyMBXQsZfA+EhtxG7ju2Ni+TokBbogyCb3L2injPt9g
 7jjtTOqmfad4gX1WSc+215iYZMkgECcUd9E+BfOseEjBohqlo7yNKIfYnT8mE/Xq
 nb7eHjyvLiE8tRtZ+7SjsujOMHv9LhWFAbSaxU/kEVzpkp0zyd6mnnslDKaaHLhz
 goUMOL/D0lg=
 =NhQ7
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf event updates from Ingo Molnar:

 - Improve Intel uncore PMU support:

     - Parse uncore 'discovery tables' - a new hardware capability
       enumeration method introduced on the latest Intel platforms. This
       table is in a well-defined PCI namespace location and is read via
       MMIO. It is organized in an rbtree.

       These uncore tables will allow the discovery of standard counter
       blocks, but fancier counters still need to be enumerated
       explicitly.

     - Add Alder Lake support

     - Improve IIO stacks to PMON mapping support on Skylake servers

 - Add Intel Alder Lake PMU support - which requires the introduction of
   'hybrid' CPUs and PMUs. Alder Lake is a mix of Golden Cove ('big')
   and Gracemont ('small' - Atom derived) cores.

   The CPU-side feature set is entirely symmetrical - but on the PMU
   side there's core type dependent PMU functionality.

 - Reduce data loss with CPU level hardware tracing on Intel PT / AUX
   profiling, by fixing the AUX allocation watermark logic.

 - Improve ring buffer allocation on NUMA systems

 - Put 'struct perf_event' into their separate kmem_cache pool

 - Add support for synchronous signals for select perf events. The
   immediate motivation is to support low-overhead sampling-based race
   detection for user-space code. The feature consists of the following
   main changes:

     - Add thread-only event inheritance via
       perf_event_attr::inherit_thread, which limits inheritance of
       events to CLONE_THREAD.

     - Add the ability for events to not leak through exec(), via
       perf_event_attr::remove_on_exec.

     - Allow the generation of SIGTRAP via perf_event_attr::sigtrap,
       extend siginfo with an u64 ::si_perf, and add the breakpoint
       information to ::si_addr and ::si_perf if the event is
       PERF_TYPE_BREAKPOINT.

   The siginfo support is adequate for breakpoints right now - but the
   new field can be used to introduce support for other types of
   metadata passed over siginfo as well.

 - Misc fixes, cleanups and smaller updates.

* tag 'perf-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
  signal, perf: Add missing TRAP_PERF case in siginfo_layout()
  signal, perf: Fix siginfo_t by avoiding u64 on 32-bit architectures
  perf/x86: Allow for 8<num_fixed_counters<16
  perf/x86/rapl: Add support for Intel Alder Lake
  perf/x86/cstate: Add Alder Lake CPU support
  perf/x86/msr: Add Alder Lake CPU support
  perf/x86/intel/uncore: Add Alder Lake support
  perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE
  perf/x86/intel: Add Alder Lake Hybrid support
  perf/x86: Support filter_match callback
  perf/x86/intel: Add attr_update for Hybrid PMUs
  perf/x86: Add structures for the attributes of Hybrid PMUs
  perf/x86: Register hybrid PMUs
  perf/x86: Factor out x86_pmu_show_pmu_cap
  perf/x86: Remove temporary pmu assignment in event_init
  perf/x86/intel: Factor out intel_pmu_check_extra_regs
  perf/x86/intel: Factor out intel_pmu_check_event_constraints
  perf/x86/intel: Factor out intel_pmu_check_num_counters
  perf/x86: Hybrid PMU support for extra_regs
  perf/x86: Hybrid PMU support for event constraints
  ...
2021-04-28 13:03:44 -07:00
Linus Torvalds
7f3d08b255 printk changes for 5.13
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmCIBMIACgkQUqAMR0iA
 lPIt9w//bbHUN/JsNtLCs/849oExdUn/thVajrD5yELttYZXhdzbXncNdkGX9tlU
 4JmExmUoqKYdN6JhSnrcYvckHj7XXZM7pVh9IdzqRh10MEXIQ+7IUHjQc8034Zs/
 W4/oZmfMtBjszap+cJ9hvdp9qaJkPz/fRLGlrbjc1K4hhxDa1gGmeD35SKswGltm
 q6RzX3uRl5JbBrYsLoqb28MGYRHhjf2+Pvndoj+5Nn9FtwPSot6jAkyqY5Y6iJlS
 W2EsFqOt+Kv7/I93FyQlnXC6Nx7vntmow7knmmGPXDf2BqLb0J8Bxl3fwuzpQoao
 nZzL/p9GQ4ZXF6y8gRV8+RzPIcftBdayOswEDGH0LzlTkbAe/9Sq9Lo7a4Z8jxHW
 ro0P+PSRK5Ksm7jvpVmSTg+Nt+XqDA5zA1lAorX1UjsyeDDNF9ndQ4C+ZNhCKo54
 y+RDgtAArJMIvsHLQ53ReoOct5NnGVNb8G/r3bIAu+Dn6K3nesr6fP1XG8iduseL
 yFlLB7w214BQMr2B/C+8lQvj54wWE4lea2+LNvObxC5b8puYj0fEniUxTYP6bcB5
 QT+LfTToufYz4US7ggJy6hoEfohifGWVvDHbn9tXmyXotSTHH7pHdYypqY+UO+kl
 7BkwzNFCm4qCIKsg8nyJxT2hDOlpcCrQx1dBIjveMqJ0c5+ahXU=
 =ovSn
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Stop synchronizing kernel log buffer readers by logbuf_lock. As a
   result, the access to the buffer is fully lockless now.

   Note that printk() itself still uses locks because it tries to flush
   the messages to the console immediately. Also the per-CPU temporary
   buffers are still there because they prevent infinite recursion and
   serialize backtraces from NMI. All this is going to change in the
   future.

 - kmsg_dump API rework and cleanup as a side effect of the logbuf_lock
   removal.

 - Make bstr_printf() aware that %pf and %pF formats could deference the
   given pointer.

 - Show also page flags by %pGp format.

 - Clarify the documentation for plain pointer printing.

 - Do not show no_hash_pointers warning multiple times.

 - Update Senozhatsky email address.

 - Some clean up.

* tag 'printk-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (24 commits)
  lib/vsprintf.c: remove leftover 'f' and 'F' cases from bstr_printf()
  printk: clarify the documentation for plain pointer printing
  kernel/printk.c: Fixed mundane typos
  printk: rename vprintk_func to vprintk
  vsprintf: dump full information of page flags in pGp
  mm, slub: don't combine pr_err with INFO
  mm, slub: use pGp to print page flags
  MAINTAINERS: update Senozhatsky email address
  lib/vsprintf: do not show no_hash_pointers message multiple times
  printk: console: remove unnecessary safe buffer usage
  printk: kmsg_dump: remove _nolock() variants
  printk: remove logbuf_lock
  printk: introduce a kmsg_dump iterator
  printk: kmsg_dumper: remove @active field
  printk: add syslog_lock
  printk: use atomic64_t for devkmsg_user.seq
  printk: use seqcount_latch for clear_seq
  printk: introduce CONSOLE_LOG_MAX
  printk: consolidate kmsg_dump_get_buffer/syslog_print_all code
  printk: refactor kmsg_dump_get_buffer()
  ...
2021-04-27 18:09:44 -07:00
Linus Torvalds
f1c921fb70 selinux/stable-5.13 PR 20210426
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmCHM2sUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNfCg/9GmoCyCh+ZRj5RGQ6M+yJas1+yyJQ
 uEfTNde54yfATUTaaWYnZG59yqzM3I2uaV11U7tqg8ajiFPxJKqbs5R9jl3lnSjH
 0Dg22nXPSCOTKcU0x/DeLoKRr+M9jO1K/nQ8NEZvYX4nC/OgtCvJqb/oEQZIKAk5
 2a7OEmNNQyFGd274p9dELaDHxN9UIaJ2PzQFXtq7ROHgBXQO4ONb2ajOf6mDSFQb
 vP/CDHwaH+pcE28w44oRy0/YBkO1SrdqoFQchg5yFagM5tQRLGkXK4OFSs5KHi5Q
 YMtmaOzMPIv1e5eaC1HuuMJYA4pPb30T9hFHP7tmBVZfmZaFaDeUs+BhMm98WTiS
 o0iTP7tfs36/poOR1Q0/sB06uvF9hUAAX1ZuE95YySifbXU9hsUc9b0uQSwCdg9P
 /J9rcdHLTpWqjw9n02mezWmAvo5U8ZvbDs+0xPIwI+3RTUP5t6mp+Hd5Tc7bPTq1
 0rpWXx+FQoSytFap5qiUSiwBp+HF6HQnNIXB0Muf6wctChoTjvo7TwoxH//z4kEm
 +SddhOCNkB7VC/X7hOxhl0F/rdHuXvb1AFIWjpTLJH2CR1PvMtF+sGey+uPT6hKZ
 /gvhmQGjFdph99eGlfVbCNvx1pM61O25IscaYD1T2wGImw+z7dX4WkG3WoOdDSkR
 bRjrBkcHh0gLhWk=
 =HTEy
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:

 - Add support for measuring the SELinux state and policy capabilities
   using IMA.

 - A handful of SELinux/NFS patches to compare the SELinux state of one
   mount with a set of mount options. Olga goes into more detail in the
   patch descriptions, but this is important as it allows more
   flexibility when using NFS and SELinux context mounts.

 - Properly differentiate between the subjective and objective LSM
   credentials; including support for the SELinux and Smack. My clumsy
   attempt at a proper fix for AppArmor didn't quite pass muster so John
   is working on a proper AppArmor patch, in the meantime this set of
   patches shouldn't change the behavior of AppArmor in any way. This
   change explains the bulk of the diffstat beyond security/.

 - Fix a problem where we were not properly terminating the permission
   list for two SELinux object classes.

* tag 'selinux-pr-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: add proper NULL termination to the secclass_map permissions
  smack: differentiate between subjective and objective task credentials
  selinux: clarify task subjective and objective credentials
  lsm: separate security_task_getsecid() into subjective and objective variants
  nfs: account for selinux security context when deciding to share superblock
  nfs: remove unneeded null check in nfs_fill_super()
  lsm,selinux: add new hook to compare new mount to an existing mount
  selinux: fix misspellings using codespell tool
  selinux: fix misspellings using codespell tool
  selinux: measure state and policy capabilities
  selinux: Allow context mounts for unpriviliged overlayfs
2021-04-27 13:42:11 -07:00
Linus Torvalds
fafe1e39ed AFS: Use the new netfs lib
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAmCHJJAACgkQ+7dXa6fL
 C2uv0A//S/sJyToPtj3xbzmRVmSGGWFYNRMaxBD2gYAq7swbDNiX4ZbBCe8A4FBY
 zedeMfoNztHIRB2M9vvnhG4HJWXPKq2BaT0xzeteCcmZ65b5zBOrAXue0PQPqE20
 xmK1RDls/y5Y2FaF92Ay0VZzXW7+y/M+RRSo+FCFzrIgpJrPprTnlZigrECYauGJ
 Qdsv26rQ0flK6tyi6GVuWZIMvpINCt3WwpwQTkAUewz2VewA1tZ1xFe70sP0vF7R
 MJNaS6A4uJmvoJJzb8rqdnBGiu76+TxmPaXn0IZKJBECZjBVJyk/duce0jgqbQ7C
 PZz5j4C2xrPyu3Y98joj37HPEAHCy0DPRx2Es1mz5cHPzI1TDRClHzPrxyycz9gr
 D9WnMiPj9ff9aDaV6XpWKyuHhPxaHpoOD3VGdrhx6bU19Jd3/mLHB3lSt1kJzWdg
 QrSAk3KzMWAZigz/+I5xetOpbygKTPLEYgpdmdOSTrtACcm1wjnhIougu0FUIWXK
 arPNFOIV9liN0qCQyDOcLx4UEcxXrb2W0AYeHHJDBFxJ7sT2WWUCjPZFW5bh3G+Y
 goKv/XJRVWJxFlTXLZLZ5siclzzIlAAmSylh661ji836yRhqTQ3NJTB8QfnrGGsZ
 QlD1hjpyqC8uwIGUvoh56KdLRTxj9Gj70gpVe/Lk3Z16mivqDUE=
 =fSr0
 -----END PGP SIGNATURE-----

Merge tag 'afs-netfs-lib-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull AFS updates from David Howells:
 "Use the new netfs lib.

  Begin the process of overhauling the use of the fscache API by AFS and
  the introduction of support for features such as Transparent Huge
  Pages (THPs).

   - Add some support for THPs, including using core VM helper functions
     to find details of pages.

   - Use the ITER_XARRAY I/O iterator to mediate access to the pagecache
     as this handles THPs and doesn't require allocation of large bvec
     arrays.

   - Delegate address_space read/pre-write I/O methods for AFS to the
     netfs helper library. A method is provided to the library that
     allows it to issue a read against the server.

     This includes a change in use for PG_fscache (it now indicates a
     DIO write in progress from the marked page), so a number of waits
     need to be deployed for it.

   - Split the core AFS writeback function to make it easier to modify
     in future patches to handle writing to the cache. [This might
     feasibly make more sense moved out into my fscache-iter branch].

  I've tested these with "xfstests -g quick" against an AFS volume
  (xfstests needs patching to make it work). With this, AFS without a
  cache passes all expected xfstests; with a cache, there's an extra
  failure, but that's also there before these patches. Fixing that
  probably requires a greater overhaul (as can be found on my
  fscache-iter branch, but that's for a later time).

  Thanks should go to Marc Dionne and Jeff Altman of AuriStor for
  exercising the patches in their test farm also"

Link: https://lore.kernel.org/lkml/3785063.1619482429@warthog.procyon.org.uk/

* tag 'afs-netfs-lib-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Use the netfs_write_begin() helper
  afs: Use new netfs lib read helper API
  afs: Use the fs operation ops to handle FetchData completion
  afs: Prepare for use of THPs
  afs: Extract writeback extension into its own function
  afs: Wait on PG_fscache before modifying/releasing a page
  afs: Use ITER_XARRAY for writing
  afs: Set up the iov_iter before calling afs_extract_data()
  afs: Log remote unmarshalling errors
  afs: Don't truncate iter during data fetch
  afs: Move key to afs_read struct
  afs: Print the operation debug_id when logging an unexpected data version
  afs: Pass page into dirty region helpers to provide THP size
  afs: Disable use of the fscache I/O routines
2021-04-27 13:27:39 -07:00
Linus Torvalds
820c4bae40 Network filesystem helper library
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAmCHPZwACgkQ+7dXa6fL
 C2uJxw/9FVNssHxtA8iFDvZskE4YHiL6vMgOgKOeVmBfUvxqJcxWQXcF8ycbon5y
 jGcDRV1DWTv395ckALHqmD6SlH/5q+OBt4cCOXCebOlzbC63JmjJ6xOjHntZKw3i
 9c3GITNca5AsPXHXHGIcoRY4/4FntpLoVpyfYJ4ZZJCY7a7QUbgnEIIy9/Ps8Clw
 BahhiKChl2JCgV3KZBk/ypkf0IBduxKgT+IUxA9o7H5UsLzvUgnfd5uMIALLPMI1
 NXzUHBJoUtnWcB52nWPufJx9YwkMfSx70mutT0T74CFxbJakwRgAl2tWr5g989qM
 /fQrsOhMlU3NaXYaRPelbxkuzvy3hU1xSe3GLiZcxmh4Cb/YAX0TrHRecO62NWff
 pu/UWQS8Du5Gy8DrHScuo8baI1KFfyiV2lWQPfBO8kPaEB2ERw+PN6fWSh993Cn9
 4UHaR3Oyn4qyVXeirNZg+frado+BEZAbNMZwn0lyi6jnLeyir6qABOdpQk34SB35
 D4jfdPOBxeh3OVFkc+EBJ98i3/nal2+yXrNOqkP4OwmF0HqGt0YKKSaLNigXaDdO
 3CKmQlBqBZsUdRYHJyJsofrifkKjP78zx2WyUJPms8MGX9z+9kYR3f1erifLesCT
 Kb2TrAFx4ZgqS5tFh6UHnX4x0qy2RckgNrKTMpv38K8lNqplvLo=
 =tZgy
 -----END PGP SIGNATURE-----

Merge tag 'netfs-lib-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull network filesystem helper library updates from David Howells:
 "Here's a set of patches for 5.13 to begin the process of overhauling
  the local caching API for network filesystems. This set consists of
  two parts:

  (1) Add a helper library to handle the new VM readahead interface.

      This is intended to be used unconditionally by the filesystem
      (whether or not caching is enabled) and provides a common
      framework for doing caching, transparent huge pages and, in the
      future, possibly fscrypt and read bandwidth maximisation. It also
      allows the netfs and the cache to align, expand and slice up a
      read request from the VM in various ways; the netfs need only
      provide a function to read a stretch of data to the pagecache and
      the helper takes care of the rest.

  (2) Add an alternative fscache/cachfiles I/O API that uses the kiocb
      facility to do async DIO to transfer data to/from the netfs's
      pages, rather than using readpage with wait queue snooping on one
      side and vfs_write() on the other. It also uses less memory, since
      it doesn't do buffered I/O on the backing file.

      Note that this uses SEEK_HOLE/SEEK_DATA to locate the data
      available to be read from the cache. Whilst this is an improvement
      from the bmap interface, it still has a problem with regard to a
      modern extent-based filesystem inserting or removing bridging
      blocks of zeros. Fixing that requires a much greater overhaul.

  This is a step towards overhauling the fscache API. The change is
  opt-in on the part of the network filesystem. A netfs should not try
  to mix the old and the new API because of conflicting ways of handling
  pages and the PG_fscache page flag and because it would be mixing DIO
  with buffered I/O. Further, the helper library can't be used with the
  old API.

  This does not change any of the fscache cookie handling APIs or the
  way invalidation is done at this time.

  In the near term, I intend to deprecate and remove the old I/O API
  (fscache_allocate_page{,s}(), fscache_read_or_alloc_page{,s}(),
  fscache_write_page() and fscache_uncache_page()) and eventually
  replace most of fscache/cachefiles with something simpler and easier
  to follow.

  This patchset contains the following parts:

   - Some helper patches, including provision of an ITER_XARRAY iov
     iterator and a function to do readahead expansion.

   - Patches to add the netfs helper library.

   - A patch to add the fscache/cachefiles kiocb API.

   - A pair of patches to fix some review issues in the ITER_XARRAY and
     read helpers as spotted by Al and Willy.

  Jeff Layton has patches to add support in Ceph for this that he
  intends for this merge window. I have a set of patches to support AFS
  that I will post a separate pull request for.

  With this, AFS without a cache passes all expected xfstests; with a
  cache, there's an extra failure, but that's also there before these
  patches. Fixing that probably requires a greater overhaul. Ceph also
  passes the expected tests.

  I also have patches in a separate branch to tidy up the handling of
  PG_fscache/PG_private_2 and their contribution to page refcounting in
  the core kernel here, but I haven't included them in this set and will
  route them separately"

Link: https://lore.kernel.org/lkml/3779937.1619478404@warthog.procyon.org.uk/

* tag 'netfs-lib-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  netfs: Miscellaneous fixes
  iov_iter: Four fixes for ITER_XARRAY
  fscache, cachefiles: Add alternate API to use kiocb for read/write to cache
  netfs: Add a tracepoint to log failures that would be otherwise unseen
  netfs: Define an interface to talk to a cache
  netfs: Add write_begin helper
  netfs: Gather stats
  netfs: Add tracepoints
  netfs: Provide readahead and readpage netfs helpers
  netfs, mm: Add set/end/wait_on_page_fscache() aliases
  netfs, mm: Move PG_fscache helper funcs to linux/netfs.h
  netfs: Documentation for helper library
  netfs: Make a netfs helper module
  mm: Implement readahead_control pageset expansion
  mm/readahead: Handle ractl nr_pages being modified
  fs: Document file_ra_state
  mm/filemap: Pass the file_ra_state in the ractl
  mm: Add set/end/wait functions for PG_private_2
  iov_iter: Add ITER_XARRAY
2021-04-27 13:08:12 -07:00
Linus Torvalds
34a456eb1f fs.idmapped.helpers.v5.13
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYIfiiwAKCRCRxhvAZXjc
 ogtMAQC+MtgJZdcH5iDHNEyI36JaWUccKRV7PdvfF1YgnXO45gD+IYxR1c/EQQyD
 kh2AmqhET6jVhe9Nsob5yxduksI+ygo=
 =oh/d
 -----END PGP SIGNATURE-----

Merge tag 'fs.idmapped.helpers.v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull fs mapping helper updates from Christian Brauner:
 "This adds kernel-doc to all new idmapping helpers and improves their
  naming which was triggered by a discussion with some fs developers.
  Some of the names are based on suggestions by Vivek and Al.

  Also remove the open-coded permission checking in a few places with
  simple helpers. Overall this should lead to more clarity and make it
  easier to maintain"

* tag 'fs.idmapped.helpers.v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  fs: introduce two inode i_{u,g}id initialization helpers
  fs: introduce fsuidgid_has_mapping() helper
  fs: document and rename fsid helpers
  fs: document mapping helpers
2021-04-27 12:49:42 -07:00
Linus Torvalds
cc15422c1f fs.idmapped.docs.v5.13
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYIfiFwAKCRCRxhvAZXjc
 oswFAP4sL0oA7mBGDzoxktIMWKY+f7KKDjb9gXc8fDQV9bbcNwD6A9QPJCahfab9
 cndByav/xcB/7n/NXLecNYr8NcfTgg8=
 =mdyh
 -----END PGP SIGNATURE-----

Merge tag 'fs.idmapped.docs.v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull fs helper kernel-doc updates from Christian Brauner:
 "In the last cycles we forgot to update the kernel-docs in some places
  that were changed during the idmapped mount work. Lukas and Randy took
  the chance to not just fixup those places but also fixup and expand
  kernel-docs for some additional helpers.

  No functional changes"

* tag 'fs.idmapped.docs.v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  fs: update kernel-doc for vfs_rename()
  fs: turn some comments into kernel-doc
  xattr: fix kernel-doc for mnt_userns and vfs xattr helpers
  namei: fix kernel-doc for struct renamedata and more
  libfs: fix kernel-doc for mnt_userns
2021-04-27 12:42:03 -07:00
Linus Torvalds
b34b95ebbb New code for 5.13:
- When a swap file is rejected, actually log the /name/ of the swapfile.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAmBerWkACgkQ+H93GTRK
 tOu2zhAAm22CPqLH4/MJ8E3+DSCcZ1b/IKPemHk8lcJbPEmhoMyRxG6TFndo4p38
 7IfsC1BChP9RYj+XwP3CWlya2c68cCu87RScDOShtFpxRQbSSv3HYsYrssfKLAXq
 Mn6GaIoLJu16gtQmw5kIbaPyJDlik+dyjNHowzv13hrYhjVqp97GgSiNq0uMMyJ5
 7yfyWCig42V06FCVLVEatNKecI0gZy2n1ZxwMTIzqfcI+ItzYCHf6TGirKd1IVF2
 kqjqEg2kQmft8pr0fAmSXxF23fRrjCFm8zhDnBcHWsPSAXmONXVEhicSotX60NrB
 ++KE6/eiNJNGZSdhhzGyIAYZsdlDHq0GS7OM0LVxp19/yc4kAYoi3UIrg0PBHkwt
 D6gGWqDpMuPwq++Ao9dWzM1tpXlmKGCC5Ju5Oqv1BIPcgMBJ6cd9ZVgCJgwn0+k8
 y6wrlW1JDJqY5JcYfPCCd1Y7EPR+3lTtYhD39/UKDEg89lNXur9PEPDW+eN8jsqE
 EHJri5eQuGfprsmXp7qqOEur0BU5vzi9+O95b/EiJSQiHu52EtvbZYppllj1V9nj
 Nm/uz8Uq2BBehFOdSDZ3QlzlOXlronq74DCaK9RdRmWoipIH09cTkPoctGahH2uk
 OpE/9y1PIU+Kl1N2GU5W7rGud/jMFyYK1BVmHeIkvYqbLMNSXug=
 =9Oxg
 -----END PGP SIGNATURE-----

Merge tag 'iomap-5.13-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull iomap update from Darrick Wong:
 "A single patch to the iomap code, which augments what gets logged when
  someone tries to swapon an unacceptable swap file. (Yes, this is a
  continuation of the swapfile drama from last season...)"

* tag 'iomap-5.13-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  iomap: improve the warnings from iomap_swapfile_activate
2021-04-27 12:27:23 -07:00
Linus Torvalds
a4f7fae101 Merge branch 'miklos.fileattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull fileattr conversion updates from Miklos Szeredi via Al Viro:
 "This splits the handling of FS_IOC_[GS]ETFLAGS from ->ioctl() into a
  separate method.

  The interface is reasonably uniform across the filesystems that
  support it and gives nice boilerplate removal"

* 'miklos.fileattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (23 commits)
  ovl: remove unneeded ioctls
  fuse: convert to fileattr
  fuse: add internal open/release helpers
  fuse: unsigned open flags
  fuse: move ioctl to separate source file
  vfs: remove unused ioctl helpers
  ubifs: convert to fileattr
  reiserfs: convert to fileattr
  ocfs2: convert to fileattr
  nilfs2: convert to fileattr
  jfs: convert to fileattr
  hfsplus: convert to fileattr
  efivars: convert to fileattr
  xfs: convert to fileattr
  orangefs: convert to fileattr
  gfs2: convert to fileattr
  f2fs: convert to fileattr
  ext4: convert to fileattr
  ext2: convert to fileattr
  btrfs: convert to fileattr
  ...
2021-04-27 11:18:24 -07:00
Linus Torvalds
5e67208885 Merge branch 'work.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull coredump updates from Al Viro:
 "Just a couple of patches this cycle: use of seek + write instead of
  expanding truncate and minor header cleanup"

* 'work.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  coredump.h: move CONFIG_COREDUMP-only stuff inside the ifdef
  coredump: don't bother with do_truncate()
2021-04-27 11:04:27 -07:00
Linus Torvalds
d1466bc583 Merge branch 'work.inode-type-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs inode type handling updates from Al Viro:
 "We should never change the type bits of ->i_mode or the method tables
  (->i_op and ->i_fop) of a live inode.

  Unfortunately, not all filesystems took care to prevent that"

* 'work.inode-type-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  spufs: fix bogosity in S_ISGID handling
  9p: missing chunk of "fs/9p: Don't update file type when updating file attributes"
  openpromfs: don't do unlock_new_inode() until the new inode is set up
  hostfs_mknod(): don't bother with init_special_inode()
  cifs: have cifs_fattr_to_inode() refuse to change type on live inode
  cifs: have ->mkdir() handle race with another client sanely
  do_cifs_create(): don't set ->i_mode of something we had not created
  gfs2: be careful with inode refresh
  ocfs2_inode_lock_update(): make sure we don't change the type bits of i_mode
  orangefs_inode_is_stale(): i_mode type bits do *not* form a bitmap...
  vboxsf: don't allow to change the inode type
  afs: Fix updating of i_mode due to 3rd party change
  ceph: don't allow type or device number to change on non-I_NEW inodes
  ceph: fix up error handling with snapdirs
  new helper: inode_wrong_type()
2021-04-27 10:57:42 -07:00
Linus Torvalds
57fa2369ab CFI on arm64 series for v5.13-rc1
- Clean up list_sort prototypes (Sami Tolvanen)
 
 - Introduce CONFIG_CFI_CLANG for arm64 (Sami Tolvanen)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmCHCR8ACgkQiXL039xt
 wCZyFQ//fnUZaXR2K354zDyW6CJljMf+d94RF6rH+J6eMTH2/HXa5v0iJokwABLf
 ussP6qF4k5wtmI22Gm9A5Zc3e4iiry5pC0jOdk0mk4gzWwFN9MdgNxJZIGA3xqhS
 bsBK4AGrVKjtZl48G1/ZxJuNDeJhVp6GNK2n6/Gl4rZF6R7D/Upz0XelyJRdDpcM
 HIGma7jZl6xfGU0mdWCzpOGK1zdMca1WVs7A4YuurSbLn5PZJrcNVWLouDqt/Si2
 AduSri1gyPClicgvqWjMOzhUpuw/nJtBLRl1x1EsWk/KSZ1/uNVjlewfzdN4fZrr
 zbtFr2gLubYLK6JOX7/LqoHlOTgE3tYLL+WIVN75DsOGZBKgHhmebTmWLyqzV0SL
 oqcyM5d3ucC6msdtAK5Fv4MSp8rpjqlK1Ha4SGRT6kC2wut7AhZ3KD7eyRIz8mV9
 Sa9mhignGFJnTEUp+LSbYdrAudgSKxB40WyXPmswAXX4VJFRD4ONrrcAON/SzkUT
 Hw/JdFRCKkJjgwNQjIQoZcUNMTbFz2PlNIEnjJWm38YImQKQlCb2mXaZKCwBkf45
 aheCZk17eKoxTCXFMd+KxlyNEtS2yBfq/PpZgvw7GW/pfFbWUg1+2O41LnihIe5v
 zu0hN1wNCQqgfxiMZqX1OTb9C/2vybzGsXILt+9nppjZ8EBU7iU=
 =wU6U
 -----END PGP SIGNATURE-----

Merge tag 'cfi-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull CFI on arm64 support from Kees Cook:
 "This builds on last cycle's LTO work, and allows the arm64 kernels to
  be built with Clang's Control Flow Integrity feature. This feature has
  happily lived in Android kernels for almost 3 years[1], so I'm excited
  to have it ready for upstream.

  The wide diffstat is mainly due to the treewide fixing of mismatched
  list_sort prototypes. Other things in core kernel are to address
  various CFI corner cases. The largest code portion is the CFI runtime
  implementation itself (which will be shared by all architectures
  implementing support for CFI). The arm64 pieces are Acked by arm64
  maintainers rather than coming through the arm64 tree since carrying
  this tree over there was going to be awkward.

  CFI support for x86 is still under development, but is pretty close.
  There are a handful of corner cases on x86 that need some improvements
  to Clang and objtool, but otherwise works well.

  Summary:

   - Clean up list_sort prototypes (Sami Tolvanen)

   - Introduce CONFIG_CFI_CLANG for arm64 (Sami Tolvanen)"

* tag 'cfi-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  arm64: allow CONFIG_CFI_CLANG to be selected
  KVM: arm64: Disable CFI for nVHE
  arm64: ftrace: use function_nocfi for ftrace_call
  arm64: add __nocfi to __apply_alternatives
  arm64: add __nocfi to functions that jump to a physical address
  arm64: use function_nocfi with __pa_symbol
  arm64: implement function_nocfi
  psci: use function_nocfi for cpu_resume
  lkdtm: use function_nocfi
  treewide: Change list_sort to use const pointers
  bpf: disable CFI in dispatcher functions
  kallsyms: strip ThinLTO hashes from static functions
  kthread: use WARN_ON_FUNCTION_MISMATCH
  workqueue: use WARN_ON_FUNCTION_MISMATCH
  module: ensure __cfi_check alignment
  mm: add generic function_nocfi macro
  cfi: add __cficanonical
  add support for Clang CFI
2021-04-27 10:16:46 -07:00
Linus Torvalds
288321a9c6 pstore update for v5.13-rc1
- Add mem_type property to expand support for >2 memory types (Mukesh Ojha)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmCHBv0ACgkQiXL039xt
 wCYkdQ//f8yGzy0fDYXtoA6Nw+wBC9NE2b1iaZ7WIvTAamcSuay4QGQHXOZZO8tc
 vCJG1parx7aRwPg5zpeYODUTBC6aVxLdRd2Zk46Pahy8XOel+Rr60oqr3LKdPaEW
 xzh+rkRd4uXSI/MjvNuNviiLFSkp7qqb+M3R2Gkto8C3cSVi3OURSPdyvdrAiWan
 29xwwNvxva2cgWvP8GZL9V63piWzqpJWW8Y8Mxc3TMQDqwNczH8xqxfPEoP+jhIV
 WyD+gDlRFxjjohs2pQFxrdrXzIydT9wPH9Qe0h9QmFIXpOLBHMRo56rFy6kbZ1fj
 kAGtbd6//MfVBSBWneNrhocEFEZDK8V/WHvCY1OV9iGgB38wvI290hJqwt92NEyL
 vMg8Bcn4TmlW3LekjQZWzqn0BqHLPVSmOW7mImaO0HA33EpcnwqWqJuWxuyVelAG
 TZ0s0DU30eaquYX26Qi39+EkM7N/YsFc1VD5ejaghNkZrxPpdibTk3BtgKtQN0gT
 9m3E8r+LiVX5RI4H18aW8AjzCE7p/BKfHRqCKyq77era8BhUGEXXmjGIwYDRlRQA
 eS90PJUKBnloLVS88urMkVsaF5MjM8hKOgbc8crxiISoC+fd3pQT1dIwRFGbMsiG
 iJP7bLfkKU/HEaQhWQwXqiaXb/qDWqRL6nrjsrUfW1rnlJ6INfc=
 =j6qi
 -----END PGP SIGNATURE-----

Merge tag 'pstore-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull pstore update from Kees Cook:

 - Add mem_type property to expand support for >2 memory types (Mukesh Ojha)

* tag 'pstore-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  pstore: Add mem_type property DT parsing support
2021-04-27 10:08:10 -07:00
Linus Torvalds
7e4910b9ac seccomp updates for v5.13-rc1
- Fix "cacheable" typo in comments (Cui GaoSheng)
 
 - Fix CONFIG for /proc/$pid/status Seccomp_filters (Kenta.Tada@sony.com)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmCHBe0ACgkQiXL039xt
 wCadDA//cy6LXlzJ78tRy1Zj4/iRlvfLGQ6rNuhoWkm9nuLOTJlzmb9lPxLFo1lo
 N4FDuXE0daPvmgy/XVu9wBKDZsgTlegzikGARfQmeHJ7Wj1H8ibz1OJPd1o60p4Y
 pfeImxefoNKxx7IxnNFDMLHgVi+CtnOZklwlj+bobIWjzclNB2EacumnyJlPuboW
 4ZHBSkG1roLkBB4Q10fI7OHV8lSuQp/IyrAypLybydJ0xiZgvGD3NPOA4N8KH8nR
 A0kbA953Rld/PFzw5inRqepyPZKtT07LJyfl1ff60OtKOHkVBXPv6pYrdgWs0A9y
 XZxdHjVV/MHLvcK9dBoZZGi0/907fvcEgtMacaRekevD5sqiqtNOH5B5rQsMwtXs
 s/Kvg1KgmVJBQwFcMRuAfXqnnPy2672XvDU5/uptVbhpOIcIVeHtGvygPkiobuuO
 V1sE+huGCw+xnfRIOOmytRTpkHMlIS9ev1ApfXtuUtbXbM0W1G6H7adc0KE4bApm
 D/fpv97myH42r/UghOL5EHVaLcnw8embVr/ij4WpMiC1TrhWy0XU27oJisG6xRw6
 A2Q4ybO3VM85LgteeQg10BZFmnuwfHMRJPBL4TOhNSs5GBx5EmkEFByozvMst5xR
 W/GIDn7g7jy1H0wuQOQ7NCgU5+RDDslCOjCIJdSipwpsTc65QCQ=
 =m4xO
 -----END PGP SIGNATURE-----

Merge tag 'seccomp-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull seccomp updates from Kees Cook:

 - Fix "cacheable" typo in comments (Cui GaoSheng)

 - Fix CONFIG for /proc/$pid/status Seccomp_filters (Kenta.Tada@sony.com)

* tag 'seccomp-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  seccomp: Fix "cacheable" typo in comments
  seccomp: Fix CONFIG tests for Seccomp_filters
2021-04-27 10:03:12 -07:00
Hao Xu
7b289c3833 io_uring: maintain drain logic for multishot poll requests
Now that we have multishot poll requests, one SQE can emit multiple
CQEs. given below example:
    sqe0(multishot poll)-->sqe1-->sqe2(drain req)
sqe2 is designed to issue after sqe0 and sqe1 completed, but since sqe0
is a multishot poll request, sqe2 may be issued after sqe0's event
triggered twice before sqe1 completed. This isn't what users leverage
drain requests for.
Here the solution is to wait for multishot poll requests fully
completed.
To achieve this, we should reconsider the req_need_defer equation, the
original one is:

    all_sqes(excluding dropped ones) == all_cqes(including dropped ones)

This means we issue a drain request when all the previous submitted
SQEs have generated their CQEs.
Now we should consider multishot requests, we deduct all the multishot
CQEs except the cancellation one, In this way a multishot poll request
behave like a normal request, so:
    all_sqes == all_cqes - multishot_cqes(except cancellations)

Here we introduce cq_extra for it.

Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
Link: https://lore.kernel.org/r/1618298439-136286-1-git-send-email-haoxu@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-27 07:38:58 -06:00
Palash Oswal
6d042ffb59 io_uring: Check current->io_uring in io_uring_cancel_sqpoll
syzkaller identified KASAN: null-ptr-deref Write in
io_uring_cancel_sqpoll.

io_uring_cancel_sqpoll is called by io_sq_thread before calling
io_uring_alloc_task_context. This leads to current->io_uring being NULL.
io_uring_cancel_sqpoll should not have to deal with threads where
current->io_uring is NULL.

In order to cast a wider safety net, perform input sanitisation directly
in io_uring_cancel_sqpoll and return for NULL value of current->io_uring.
This is safe since if current->io_uring isn't set, then there's no way
for the task to have submitted any requests.

Reported-by: syzbot+be51ca5a4d97f017cd50@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Palash Oswal <hello@oswalpalash.com>
Link: https://lore.kernel.org/r/20210427125148.21816-1-hello@oswalpalash.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-27 07:37:39 -06:00
Hyeongseok Kim
c6e2f52e30 exfat: speed up iterate/lookup by fixing start point of traversing cluster chain
When directory iterate and lookup is called, there's a buggy rewinding
of start point for traversing cluster chain to the parent directory
entry's first cluster. This caused repeated cluster chain traversing
from the first entry of the parent directory that would show worse
performance if huge amounts of files exist under the parent directory.
Fix not to rewind, make continue from currently referenced cluster and
dir entry.

Tested with 50,000 files under single directory / 256GB sdcard,
with command "time ls -l > /dev/null",
Before :     0m08.69s real     0m00.27s user     0m05.91s system
After  :     0m07.01s real     0m00.25s user     0m04.34s system

Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2021-04-27 20:45:07 +09:00
Hyeongseok Kim
23befe490b exfat: improve write performance when dirsync enabled
Degradation of write speed caused by frequent disk access for cluster
bitmap update on every cluster allocation could be improved by
selective syncing bitmap buffer. Change to flush bitmap buffer only
for the directory related operations.

Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Acked-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2021-04-27 20:45:06 +09:00
Hyeongseok Kim
654762df2e exfat: add support ioctl and FITRIM function
Add FITRIM ioctl to enable discarding unused blocks while mounted.
As current exFAT doesn't have generic ioctl handler, add empty ioctl
function first, and add FITRIM handler.

Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2021-04-27 20:45:06 +09:00
Hyeongseok Kim
5c2d728507 exfat: introduce bitmap_lock for cluster bitmap access
s_lock which is for protecting concurrent access of file operations is
too huge for cluster bitmap protection, so introduce a new bitmap_lock
to narrow the lock range if only need to access cluster bitmap.

Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Acked-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2021-04-27 20:45:06 +09:00
Hyeongseok Kim
77edfc6e51 exfat: fix erroneous discard when clear cluster bit
If mounted with discard option, exFAT issues discard command when clear
cluster bit to remove file. But the input parameter of cluster-to-sector
calculation is abnormally added by reserved cluster size which is 2,
leading to discard unrelated sectors included in target+2 cluster.
With fixing this, remove the wrong comments in set/clear/find bitmap
functions.

Fixes: 1e49a94cf7 ("exfat: add bitmap operations")
Cc: stable@vger.kernel.org # v5.7+
Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Acked-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2021-04-27 20:45:06 +09:00
David Howells
53b776c77a netfs: Miscellaneous fixes
Fix some miscellaneous things in the new netfs lib[1]:

 (1) The kerneldoc for netfs_readpage() shouldn't say netfs_page().

 (2) netfs_readpage() can get an integer overflow on 32-bit when it
     multiplies page_index(page) by PAGE_SIZE.  It should use
     page_file_offset() instead.

 (3) netfs_write_begin() should use page_offset() to avoid the same
     overflow.

Note that netfs_readpage() needs to use page_file_offset() rather than
page_offset() as it may see swap-over-NFS.

Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/161789062190.6155.12711584466338493050.stgit@warthog.procyon.org.uk/ [1]
2021-04-26 23:23:41 +01:00
Linus Torvalds
55ba0fe059 for-5.13-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmCHGMYACgkQxWXV+ddt
 WDsFeA/+MZ+5UiYYucH5RVw/VExOQzSvlRVxxnOeR2s8V/gFj/Ip7d9E9UezA+lX
 di2byKXPzfjL9+xoyqfEZLpPgDQrhHTIQCzuNSvzMBykJIL6Sf1OZTtUZWU3HDH7
 S51UZtghgTPzeOhxsiBHqSFo9danT0w3KQhliE10Ur855ziKSvL2Tb7dvM6q5TS7
 mFTAj/Y2aanDaFKQjjBzzA+GZ0LFIGuErg1PADmF5XjbyY06ho1xqQ1A0t2/XL9x
 UpHdRP3E5XRAMl4uyYOUtbvUB1cROzoS6ySHJOJ9Bbz+IC0cLf5xTJkLE25bGkSi
 GjFNvnQOha1s8oMIlqkw64hKQqwp+gu2iZ7m1o76Z31k7CpLAC+rg11gbpODRuoh
 7B3EzowKyVihMHF8URAdC3A+9gbpPvyuGKDSy07yULh/2vas6dEzR3cPVEU0yXyJ
 3DO1ds0lVY3B/T9LKPQ785hQ7VdpgZ8BdIOVRtjgV2QQEa9eFh9VzybQjU8yBXGd
 vflBe8kQfASIZ5E0rcUGPUVIJoesM8U1pSlx9jvvTQVkOC/DQjtBx/5ePCL2iVfd
 izY8uWlCdguF/P1CYFf1M0auASSzl3bip1NnSMVvZ90dgDEK4XaIyd16kMGDCbU2
 UMOePMsLDApWcCVTqM/J+lFLa7rajRccdKby7F/zSpZIRgadPF8=
 =J5jh
 -----END PGP SIGNATURE-----

Merge tag 'for-5.13-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "The updates this time are mostly stabilization, preparation and minor
  improvements.

  User visible improvements:

   - readahead for send, improving run time of full send by 10% and for
     incremental by 25%

   - make reflinks respect O_SYNC, O_DSYNC and S_SYNC flags

   - export supported sectorsize values in sysfs (currently only page
     size, more once full subpage support lands)

   - more graceful errors and warnings on 32bit systems when logical
     addresses for metadata reach the limit posed by unsigned long in
     page::index
      - error: fail mount if there's a metadata block beyond the limit
      - error: new metadata block would be at unreachable address
      - warn when 5/8th of the limit is reached, for 4K page systems
        it's 10T, for 64K page it's 160T

   - zoned mode
      - relocated zones get reset at the end instead of discard
      - automatic background reclaim of zones that have 75%+ of unusable
        space, the threshold is tunable in sysfs

  Fixes:

   - fsync and tree mod log fixes

   - fix inefficient preemptive reclaim calculations

   - fix exhaustion of the system chunk array due to concurrent
     allocations

   - fix fallback to no compression when racing with remount

   - preemptive fix for dm-crypt on zoned device that does not properly
     advertise zoned support

  Core changes:

   - add inode lock to synchronize mmap and other block updates (eg.
     deduplication, fallocate, fsync)

   - kmap conversions to new kmap_local API

   - subpage support (continued)
      - new helpers for page state/extent buffer tracking
      - metadata changes now support read and write

   - error handling through out relocation call paths

   - many other cleanups and code simplifications"

* tag 'for-5.13-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (112 commits)
  btrfs: zoned: automatically reclaim zones
  btrfs: rename delete_unused_bgs_mutex to reclaim_bgs_lock
  btrfs: zoned: reset zones of relocated block groups
  btrfs: more graceful errors/warnings on 32bit systems when reaching limits
  btrfs: zoned: fix unpaired block group unfreeze during device replace
  btrfs: fix race when picking most recent mod log operation for an old root
  btrfs: fix metadata extent leak after failure to create subvolume
  btrfs: handle remount to no compress during compression
  btrfs: zoned: fail mount if the device does not support zone append
  btrfs: fix race between transaction aborts and fsyncs leading to use-after-free
  btrfs: introduce submit_eb_subpage() to submit a subpage metadata page
  btrfs: make lock_extent_buffer_for_io() to be subpage compatible
  btrfs: introduce write_one_subpage_eb() function
  btrfs: introduce end_bio_subpage_eb_writepage() function
  btrfs: check return value of btrfs_commit_transaction in relocation
  btrfs: do proper error handling in merge_reloc_roots
  btrfs: handle extent corruption with select_one_root properly
  btrfs: cleanup error handling in prepare_to_merge
  btrfs: do not panic in __add_reloc_root
  btrfs: handle __add_reloc_root failures in btrfs_recover_relocation
  ...
2021-04-26 13:48:02 -07:00