Commit Graph

70705 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
e1436df2f2 Revert "ecryptfs: replace BUG_ON with error handling code"
This reverts commit 2c2a7552dd.

Because of recent interactions with developers from @umn.edu, all
commits from them have been recently re-reviewed to ensure if they were
correct or not.

Upon review, this commit was found to be incorrect for the reasons
below, so it must be reverted.  It will be fixed up "correctly" in a
later kernel change.

The original commit log for this change was incorrect, no "error
handling code" was added, things will blow up just as badly as before if
any of these cases ever were true.  As this BUG_ON() never fired, and
most of these checks are "obviously" never going to be true, let's just
revert to the original code for now until this gets unwound to be done
correctly in the future.

Cc: Aditya Pakki <pakki001@umn.edu>
Fixes: 2c2a7552dd ("ecryptfs: replace BUG_ON with error handling code")
Cc: stable <stable@vger.kernel.org>
Acked-by: Tyler Hicks <code@tyhicks.com>
Link: https://lore.kernel.org/r/20210503115736.2104747-49-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-13 18:32:24 +02:00
Gao Xiang
0852b6ca94 erofs: fix 1 lcluster-sized pcluster for big pcluster
If the 1st NONHEAD lcluster of a pcluster isn't CBLKCNT lcluster type
rather than a HEAD or PLAIN type instead, which means its pclustersize
_must_ be 1 lcluster (since its uncompressed size < 2 lclusters),
as illustrated below:

       HEAD     HEAD / PLAIN    lcluster type
   ____________ ____________
  |_:__________|_________:__|   file data (uncompressed)
   .                .
  .____________.
  |____________|                pcluster data (compressed)

Such on-disk case was explained before [1] but missed to be handled
properly in the runtime implementation.

It can be observed if manually generating 1 lcluster-sized pcluster
with 2 lclusters (thus CBLKCNT doesn't exist.) Let's fix it now.

[1] https://lore.kernel.org/r/20210407043927.10623-1-xiang@kernel.org

Link: https://lore.kernel.org/r/20210510064715.29123-1-xiang@kernel.org
Fixes: cec6e93bea ("erofs: support parsing big pcluster compress indexes")
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Gao Xiang <xiang@kernel.org>
2021-05-13 15:58:46 +08:00
Jaegeuk Kim
f395183f95 f2fs: return EINVAL for hole cases in swap file
This tries to fix xfstests/generic/495.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-12 07:38:00 -07:00
Christian Brauner
2ca4dcc490 fs/mount_setattr: tighten permission checks
We currently don't have any filesystems that support idmapped mounts
which are mountable inside a user namespace. That was a deliberate
decision for now as a userns root can just mount the filesystem
themselves. So enforce this restriction explicitly until there's a real
use-case for this. This way we can notice it and will have a chance to
adapt and audit our translation helpers and fstests appropriately if we
need to support such filesystems.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org
CC: linux-fsdevel@vger.kernel.org
Suggested-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-05-12 14:13:16 +02:00
Jaegeuk Kim
ca298241bc f2fs: avoid swapon failure by giving a warning first
The final solution can be migrating blocks to form a section-aligned file
internally. Meanwhile, let's ask users to do that when preparing the swap
file initially like:
1) create()
2) ioctl(F2FS_IOC_SET_PIN_FILE)
3) fallocate()

Reported-by: kernel test robot <oliver.sang@intel.com>
Fixes: 36e4d95891 ("f2fs: check if swapfile is section-alligned")
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11 20:51:53 -07:00
Chao Yu
8bfbfb0ddd f2fs: compress: fix to assign cc.cluster_idx correctly
In f2fs_destroy_compress_ctx(), after f2fs_destroy_compress_ctx(),
cc.cluster_idx will be cleared w/ NULL_CLUSTER, f2fs_cluster_blocks()
may check wrong cluster metadata, fix it.

Fixes: 4c8ff7095b ("f2fs: support data compression")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11 14:48:12 -07:00
Chao Yu
a949dc5f2c f2fs: compress: fix race condition of overwrite vs truncate
pos_fsstress testcase complains a panic as belew:

------------[ cut here ]------------
kernel BUG at fs/f2fs/compress.c:1082!
invalid opcode: 0000 [#1] SMP PTI
CPU: 4 PID: 2753477 Comm: kworker/u16:2 Tainted: G           OE     5.12.0-rc1-custom #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
Workqueue: writeback wb_workfn (flush-252:16)
RIP: 0010:prepare_compress_overwrite+0x4c0/0x760 [f2fs]
Call Trace:
 f2fs_prepare_compress_overwrite+0x5f/0x80 [f2fs]
 f2fs_write_cache_pages+0x468/0x8a0 [f2fs]
 f2fs_write_data_pages+0x2a4/0x2f0 [f2fs]
 do_writepages+0x38/0xc0
 __writeback_single_inode+0x44/0x2a0
 writeback_sb_inodes+0x223/0x4d0
 __writeback_inodes_wb+0x56/0xf0
 wb_writeback+0x1dd/0x290
 wb_workfn+0x309/0x500
 process_one_work+0x220/0x3c0
 worker_thread+0x53/0x420
 kthread+0x12f/0x150
 ret_from_fork+0x22/0x30

The root cause is truncate() may race with overwrite as below,
so that one reference count left in page can not guarantee the
page attaching in mapping tree all the time, after truncation,
later find_lock_page() may return NULL pointer.

- prepare_compress_overwrite
 - f2fs_pagecache_get_page
 - unlock_page
					- f2fs_setattr
					 - truncate_setsize
					  - truncate_inode_page
					   - delete_from_page_cache
 - find_lock_page

Fix this by avoiding referencing updated page.

Fixes: 4c8ff7095b ("f2fs: support data compression")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11 14:48:12 -07:00
Chao Yu
a12cc5b423 f2fs: compress: fix to free compress page correctly
In error path of f2fs_write_compressed_pages(), it needs to call
f2fs_compress_free_page() to release temporary page.

Fixes: 5e6bbde959 ("f2fs: introduce mempool for {,de}compress intermediate page allocation")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11 14:48:12 -07:00
Jaegeuk Kim
a753103909 f2fs: support iflag change given the mask
In f2fs_fileattr_set(),

	if (!fa->flags_valid)
		mask &= FS_COMMON_FL;

In this case, we can set supported flags by mask only instead of BUG_ON.

/* Flags shared betwen flags/xflags */
	(FS_SYNC_FL | FS_IMMUTABLE_FL | FS_APPEND_FL | \
	 FS_NODUMP_FL |	FS_NOATIME_FL | FS_DAX_FL | \
	 FS_PROJINHERIT_FL)

Fixes: 9b1bb01c8a ("f2fs: convert to fileattr")
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11 14:48:11 -07:00
Jaegeuk Kim
349c4d6c75 f2fs: avoid null pointer access when handling IPU error
Unable to handle kernel NULL pointer dereference at virtual address 000000000000001a
 pc : f2fs_inplace_write_data+0x144/0x208
 lr : f2fs_inplace_write_data+0x134/0x208
 Call trace:
  f2fs_inplace_write_data+0x144/0x208
  f2fs_do_write_data_page+0x270/0x770
  f2fs_write_single_data_page+0x47c/0x830
  __f2fs_write_data_pages+0x444/0x98c
  f2fs_write_data_pages.llvm.16514453770497736882+0x2c/0x38
  do_writepages+0x58/0x118
  __writeback_single_inode+0x44/0x300
  writeback_sb_inodes+0x4b8/0x9c8
  wb_writeback+0x148/0x42c
  wb_do_writeback+0xc8/0x390
  wb_workfn+0xb0/0x2f4
  process_one_work+0x1fc/0x444
  worker_thread+0x268/0x4b4
  kthread+0x13c/0x158
  ret_from_fork+0x10/0x18

Fixes: 9557727876 ("f2fs: drop inplace IO if fs status is abnormal")
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-05-11 14:48:07 -07:00
Linus Torvalds
88b06399c9 Merge tag 'for-5.13-rc1-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
 "Handle transaction start error in btrfs_fileattr_set()

  This is fix for code introduced by the new fileattr merge"

* tag 'for-5.13-rc1-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: handle transaction start error in btrfs_fileattr_set
2021-05-11 09:43:16 -07:00
Ritesh Harjani
9b8a233bc2 btrfs: handle transaction start error in btrfs_fileattr_set
Add error handling in btrfs_fileattr_set in case of an error while
starting a transaction. This fixes btrfs/232 which otherwise used to
fail with below signature on Power.

  btrfs/232 [ 1119.474650] run fstests btrfs/232 at 2021-04-21 02:21:22
  <...>
  [ 1366.638585] BUG: Unable to handle kernel data access on read at 0xffffffffffffff86
  [ 1366.638768] Faulting instruction address: 0xc0000000009a5c88
  cpu 0x0: Vector: 380 (Data SLB Access) at [c000000014f177b0]
      pc: c0000000009a5c88: btrfs_update_root_times+0x58/0xc0
      lr: c0000000009a5c84: btrfs_update_root_times+0x54/0xc0
      <...>
      pid   = 24881, comm = fsstress
	   btrfs_update_inode+0xa0/0x140
	   btrfs_fileattr_set+0x5d0/0x6f0
	   vfs_fileattr_set+0x2a8/0x390
	   do_vfs_ioctl+0x1290/0x1ac0
	   sys_ioctl+0x6c/0x120
	   system_call_exception+0x3d4/0x410
	   system_call_common+0xec/0x278

Fixes: 97fc297754 ("btrfs: convert to fileattr")
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-05-11 15:35:57 +02:00
Linus Torvalds
142b507f91 Merge tag 'for-5.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
 "First batch of various fixes, here's a list of notable ones:

   - fix unmountable seed device after fstrim

   - fix silent data loss in zoned mode due to ordered extent splitting

   - fix race leading to unpersisted data and metadata on fsync

   - fix deadlock when cloning inline extents and using qgroups"

* tag 'for-5.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: initialize return variable in cleanup_free_space_cache_v1
  btrfs: zoned: sanity check zone type
  btrfs: fix unmountable seed device after fstrim
  btrfs: fix deadlock when cloning inline extents and using qgroups
  btrfs: fix race leading to unpersisted data and metadata on fsync
  btrfs: do not consider send context as valid when trying to flush qgroups
  btrfs: zoned: fix silent data loss after failure splitting ordered extent
2021-05-10 14:10:42 -07:00
Christophe JAILLET
8c721cb0f7 quota: Use 'hlist_for_each_entry' to simplify code
Use 'hlist_for_each_entry' instead of hand writing it.
This saves a few lines of code.

Link: https://lore.kernel.org/r/f82d3e33964dcbd2aac19866735e0a8381c8a735.1619599407.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Jan Kara <jack@suse.cz>
2021-05-10 16:27:49 +02:00
Linus Torvalds
0a55a1fbed Merge tag '5.13-rc-smb3-part3' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
 "Three small SMB3 chmultichannel related changesets (also for stable)
  from the SMB3 test event this week.

  The other fixes are still in review/testing"

* tag '5.13-rc-smb3-part3' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: if max_channels set to more than one channel request multichannel
  smb3: do not attempt multichannel to server which does not support it
  smb3: when mounting with multichannel include it in requested capabilities
2021-05-09 13:19:29 -07:00
Pavel Begunkov
a298232ee6 io_uring: fix link timeout refs
WARNING: CPU: 0 PID: 10242 at lib/refcount.c:28 refcount_warn_saturate+0x15b/0x1a0 lib/refcount.c:28
RIP: 0010:refcount_warn_saturate+0x15b/0x1a0 lib/refcount.c:28
Call Trace:
 __refcount_sub_and_test include/linux/refcount.h:283 [inline]
 __refcount_dec_and_test include/linux/refcount.h:315 [inline]
 refcount_dec_and_test include/linux/refcount.h:333 [inline]
 io_put_req fs/io_uring.c:2140 [inline]
 io_queue_linked_timeout fs/io_uring.c:6300 [inline]
 __io_queue_sqe+0xbef/0xec0 fs/io_uring.c:6354
 io_submit_sqe fs/io_uring.c:6534 [inline]
 io_submit_sqes+0x2bbd/0x7c50 fs/io_uring.c:6660
 __do_sys_io_uring_enter fs/io_uring.c:9240 [inline]
 __se_sys_io_uring_enter+0x256/0x1d60 fs/io_uring.c:9182

io_link_timeout_fn() should put only one reference of the linked timeout
request, however in case of racing with the master request's completion
first io_req_complete() puts one and then io_put_req_deferred() is
called.

Cc: stable@vger.kernel.org # 5.12+
Fixes: 9ae1f8dd37 ("io_uring: fix inconsistent lock state")
Reported-by: syzbot+a2910119328ce8e7996f@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/ff51018ff29de5ffa76f09273ef48cb24c720368.1620417627.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-08 22:11:49 -06:00
Linus Torvalds
0f979d815c Merge tag 'kbuild-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull more Kbuild updates from Masahiro Yamada:

 - Convert sh and sparc to use generic shell scripts to generate the
   syscall headers

 - refactor .gitignore files

 - Update kernel/config_data.gz only when the content of the .config
   is really changed, which avoids the unneeded re-link of vmlinux

 - move "remove stale files" workarounds to scripts/remove-stale-files

 - suppress unused-but-set-variable warnings by default for Clang
   as well

 - fix locale setting LANG=C to LC_ALL=C

 - improve 'make distclean'

 - always keep intermediate objects from scripts/link-vmlinux.sh

 - move IF_ENABLED out of <linux/kconfig.h> to make it self-contained

 - misc cleanups

* tag 'kbuild-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (25 commits)
  linux/kconfig.h: replace IF_ENABLED() with PTR_IF() in <linux/kernel.h>
  kbuild: Don't remove link-vmlinux temporary files on exit/signal
  kbuild: remove the unneeded comments for external module builds
  kbuild: make distclean remove tag files in sub-directories
  kbuild: make distclean work against $(objtree) instead of $(srctree)
  kbuild: refactor modname-multi by using suffix-search
  kbuild: refactor fdtoverlay rule
  kbuild: parameterize the .o part of suffix-search
  arch: use cross_compiling to check whether it is a cross build or not
  kbuild: remove ARCH=sh64 support from top Makefile
  .gitignore: prefix local generated files with a slash
  kbuild: replace LANG=C with LC_ALL=C
  Makefile: Move -Wno-unused-but-set-variable out of GCC only block
  kbuild: add a script to remove stale generated files
  kbuild: update config_data.gz only when the content of .config is changed
  .gitignore: ignore only top-level modules.builtin
  .gitignore: move tags and TAGS close to other tag files
  kernel/.gitgnore: remove stale timeconst.h and hz.bc
  usr/include: refactor .gitignore
  genksyms: fix stale comment
  ...
2021-05-08 10:00:11 -07:00
Steve French
c1f8a398b6 smb3: if max_channels set to more than one channel request multichannel
Mounting with "multichannel" is obviously implied if user requested
more than one channel on mount (ie mount parm max_channels>1).
Currently both have to be specified. Fix that so that if max_channels
is greater than 1 on mount, enable multichannel rather than silently
falling back to non-multichannel.

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-By: Tom Talpey <tom@talpey.com>
Cc: <stable@vger.kernel.org> # v5.11+
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
2021-05-08 10:51:06 -05:00
Steve French
9c2dc11df5 smb3: do not attempt multichannel to server which does not support it
We were ignoring CAP_MULTI_CHANNEL in the server response - if the
server doesn't support multichannel we should not be attempting it.

See MS-SMB2 section 3.2.5.2

Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-By: Tom Talpey <tom@talpey.com>
Cc: <stable@vger.kernel.org> # v5.8+
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-05-08 10:50:53 -05:00
Steve French
679971e721 smb3: when mounting with multichannel include it in requested capabilities
In the SMB3/SMB3.1.1 negotiate protocol request, we are supposed to
advertise CAP_MULTICHANNEL capability when establishing multiple
channels has been requested by the user doing the mount. See MS-SMB2
sections 2.2.3 and 3.2.5.2

Without setting it there is some risk that multichannel could fail
if the server interpreted the field strictly.

Reviewed-By: Tom Talpey <tom@talpey.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Cc: <stable@vger.kernel.org> # v5.8+
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-05-08 10:44:11 -05:00
Vivek Goyal
237388320d dax: Wake up all waiters after invalidating dax entry
I am seeing missed wakeups which ultimately lead to a deadlock when I am
using virtiofs with DAX enabled and running "make -j". I had to mount
virtiofs as rootfs and also reduce to dax window size to 256M to reproduce
the problem consistently.

So here is the problem. put_unlocked_entry() wakes up waiters only
if entry is not null as well as !dax_is_conflict(entry). But if I
call multiple instances of invalidate_inode_pages2() in parallel,
then I can run into a situation where there are waiters on
this index but nobody will wake these waiters.

invalidate_inode_pages2()
  invalidate_inode_pages2_range()
    invalidate_exceptional_entry2()
      dax_invalidate_mapping_entry_sync()
        __dax_invalidate_entry() {
                xas_lock_irq(&xas);
                entry = get_unlocked_entry(&xas, 0);
                ...
                ...
                dax_disassociate_entry(entry, mapping, trunc);
                xas_store(&xas, NULL);
                ...
                ...
                put_unlocked_entry(&xas, entry);
                xas_unlock_irq(&xas);
        }

Say a fault in in progress and it has locked entry at offset say "0x1c".
Now say three instances of invalidate_inode_pages2() are in progress
(A, B, C) and they all try to invalidate entry at offset "0x1c". Given
dax entry is locked, all tree instances A, B, C will wait in wait queue.

When dax fault finishes, say A is woken up. It will store NULL entry
at index "0x1c" and wake up B. When B comes along it will find "entry=0"
at page offset 0x1c and it will call put_unlocked_entry(&xas, 0). And
this means put_unlocked_entry() will not wake up next waiter, given
the current code. And that means C continues to wait and is not woken
up.

This patch fixes the issue by waking up all waiters when a dax entry
has been invalidated. This seems to fix the deadlock I am facing
and I can make forward progress.

Reported-by: Sergio Lopez <slp@redhat.com>
Fixes: ac401cc782 ("dax: New fault locking")
Reviewed-by: Jan Kara <jack@suse.cz>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Link: https://lore.kernel.org/r/20210428190314.1865312-4-vgoyal@redhat.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-05-07 15:55:44 -07:00
Vivek Goyal
4c3d043d27 dax: Add a wakeup mode parameter to put_unlocked_entry()
As of now put_unlocked_entry() always wakes up next waiter. In next
patches we want to wake up all waiters at one callsite. Hence, add a
parameter to the function.

This patch does not introduce any change of behavior.

Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Link: https://lore.kernel.org/r/20210428190314.1865312-3-vgoyal@redhat.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-05-07 15:55:44 -07:00
Vivek Goyal
698ab77aeb dax: Add an enum for specifying dax wakup mode
Dan mentioned that he is not very fond of passing around a boolean true/false
to specify if only next waiter should be woken up or all waiters should be
woken up. He instead prefers that we introduce an enum and make it very
explicity at the callsite itself. Easier to read code.

This patch should not introduce any change of behavior.

Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Link: https://lore.kernel.org/r/20210428190314.1865312-2-vgoyal@redhat.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-05-07 15:55:44 -07:00
Linus Torvalds
bd313968fd Merge tag 'block-5.13-2021-05-07' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - dasd spelling fixes (Bhaskar)

 - Limit bio max size on multi-page bvecs to the hardware limit, to
   avoid overly large bio's (and hence latencies). Originally queued for
   the merge window, but needed a fix and was dropped from the initial
   pull (Changheun)

 - NVMe pull request (Christoph):
      - reset the bdev to ns head when failover (Daniel Wagner)
      - remove unsupported command noise (Keith Busch)
      - misc passthrough improvements (Kanchan Joshi)
      - fix controller ioctl through ns_head (Minwoo Im)
      - fix controller timeouts during reset (Tao Chiu)

 - rnbd fixes/cleanups (Gioh, Md, Dima)

 - Fix iov_iter re-expansion (yangerkun)

* tag 'block-5.13-2021-05-07' of git://git.kernel.dk/linux-block:
  block: reexpand iov_iter after read/write
  nvmet: remove unsupported command noise
  nvme-multipath: reset bdev to ns head when failover
  nvme-pci: fix controller reset hang when racing with nvme_timeout
  nvme: move the fabrics queue ready check routines to core
  nvme: avoid memset for passthrough requests
  nvme: add nvme_get_ns helper
  nvme: fix controller ioctl through ns_head
  bio: limit bio max size
  RDMA/rtrs: fix uninitialized symbol 'cnt'
  s390: dasd: Mundane spelling fixes
  block/rnbd: Remove all likely and unlikely
  block/rnbd-clt: Check the return value of the function rtrs_clt_query
  block/rnbd: Fix style issues
  block/rnbd-clt: Change queue_depth type in rnbd_clt_session to size_t
2021-05-07 11:35:12 -07:00
Linus Torvalds
28b4afeb59 Merge tag 'io_uring-5.13-2021-05-07' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
 "Mostly fixes for merge window merged code. In detail:

   - Error case memory leak fixes (Colin, Zqiang)

   - Add the tools/io_uring/ to the list of maintained files (Lukas)

   - Set of fixes for the modified buffer registration API (Pavel)

   - Sanitize io thread setup on x86 (Stefan)

   - Ensure we truncate transfer count for registered buffers (Thadeu)"

* tag 'io_uring-5.13-2021-05-07' of git://git.kernel.dk/linux-block:
  x86/process: setup io_threads more like normal user space threads
  MAINTAINERS: add io_uring tool to IO_URING
  io_uring: truncate lengths larger than MAX_RW_COUNT on provide buffers
  io_uring: Fix memory leak in io_sqe_buffers_register()
  io_uring: Fix premature return from loop and memory leak
  io_uring: fix unchecked error in switch_start()
  io_uring: allow empty slots for reg buffers
  io_uring: add more build check for uapi
  io_uring: dont overlap internal and user req flags
  io_uring: fix drain with rsrc CQEs
2021-05-07 11:29:23 -07:00
Linus Torvalds
a647034fe2 Merge tag 'nfs-for-5.13-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
 "Highlights include:

  Stable fixes:

   - Add validation of the UDP retrans parameter to prevent shift
     out-of-bounds

   - Don't discard pNFS layout segments that are marked for return

  Bugfixes:

   - Fix a NULL dereference crash in xprt_complete_bc_request() when the
     NFSv4.1 server misbehaves.

   - Fix the handling of NFS READDIR cookie verifiers

   - Sundry fixes to ensure attribute revalidation works correctly when
     the server does not return post-op attributes.

   - nfs4_bitmask_adjust() must not change the server global bitmasks

   - Fix major timeout handling in the RPC code.

   - NFSv4.2 fallocate() fixes.

   - Fix the NFSv4.2 SEEK_HOLE/SEEK_DATA end-of-file handling

   - Copy offload attribute revalidation fixes

   - Fix an incorrect filehandle size check in the pNFS flexfiles driver

   - Fix several RDMA transport setup/teardown races

   - Fix several RDMA queue wrapping issues

   - Fix a misplaced memory read barrier in sunrpc's call_decode()

  Features:

   - Micro optimisation of the TCP transmission queue using TCP_CORK

   - statx() performance improvements by further splitting up the
     tracking of invalid cached file metadata.

   - Support the NFSv4.2 'change_attr_type' attribute and use it to
     optimise handling of change attribute updates"

* tag 'nfs-for-5.13-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (85 commits)
  xprtrdma: Fix a NULL dereference in frwr_unmap_sync()
  sunrpc: Fix misplaced barrier in call_decode
  NFSv4.2: Remove ifdef CONFIG_NFSD from NFSv4.2 client SSC code.
  xprtrdma: Move fr_mr field to struct rpcrdma_mr
  xprtrdma: Move the Work Request union to struct rpcrdma_mr
  xprtrdma: Move fr_linv_done field to struct rpcrdma_mr
  xprtrdma: Move cqe to struct rpcrdma_mr
  xprtrdma: Move fr_cid to struct rpcrdma_mr
  xprtrdma: Remove the RPC/RDMA QP event handler
  xprtrdma: Don't display r_xprt memory addresses in tracepoints
  xprtrdma: Add an rpcrdma_mr_completion_class
  xprtrdma: Add tracepoints showing FastReg WRs and remote invalidation
  xprtrdma: Avoid Send Queue wrapping
  xprtrdma: Do not wake RPC consumer on a failed LocalInv
  xprtrdma: Do not recycle MR after FastReg/LocalInv flushes
  xprtrdma: Clarify use of barrier in frwr_wc_localinv_done()
  xprtrdma: Rename frwr_release_mr()
  xprtrdma: rpcrdma_mr_pop() already does list_del_init()
  xprtrdma: Delete rpcrdma_recv_buffer_put()
  xprtrdma: Fix cwnd update ordering
  ...
2021-05-07 11:23:41 -07:00
Linus Torvalds
e22e983279 Merge tag '9p-for-5.13-rc1' of git://github.com/martinetd/linux
Pull 9p updates from Dominique Martinet:
 "An error handling fix and constification"

* tag '9p-for-5.13-rc1' of git://github.com/martinetd/linux:
  fs: 9p: fix v9fs_file_open writeback fid error check
  9p: Constify static struct v9fs_attr_group
2021-05-07 11:18:52 -07:00
Linus Torvalds
a48b0872e6 Merge branch 'akpm' (patches from Andrew)
Merge yet more updates from Andrew Morton:
 "This is everything else from -mm for this merge window.

  90 patches.

  Subsystems affected by this patch series: mm (cleanups and slub),
  alpha, procfs, sysctl, misc, core-kernel, bitmap, lib, compat,
  checkpatch, epoll, isofs, nilfs2, hpfs, exit, fork, kexec, gcov,
  panic, delayacct, gdb, resource, selftests, async, initramfs, ipc,
  drivers/char, and spelling"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (90 commits)
  mm: fix typos in comments
  mm: fix typos in comments
  treewide: remove editor modelines and cruft
  ipc/sem.c: spelling fix
  fs: fat: fix spelling typo of values
  kernel/sys.c: fix typo
  kernel/up.c: fix typo
  kernel/user_namespace.c: fix typos
  kernel/umh.c: fix some spelling mistakes
  include/linux/pgtable.h: few spelling fixes
  mm/slab.c: fix spelling mistake "disired" -> "desired"
  scripts/spelling.txt: add "overflw"
  scripts/spelling.txt: Add "diabled" typo
  scripts/spelling.txt: add "overlfow"
  arm: print alloc free paths for address in registers
  mm/vmalloc: remove vwrite()
  mm: remove xlate_dev_kmem_ptr()
  drivers/char: remove /dev/kmem for good
  mm: fix some typos and code style problems
  ipc/sem.c: mundane typo fixes
  ...
2021-05-07 00:34:51 -07:00
Masahiro Yamada
fa60ce2cb4 treewide: remove editor modelines and cruft
The section "19) Editor modelines and other cruft" in
Documentation/process/coding-style.rst clearly says, "Do not include any
of these in source files."

I recently receive a patch to explicitly add a new one.

Let's do treewide cleanups, otherwise some people follow the existing code
and attempt to upstream their favoriate editor setups.

It is even nicer if scripts/checkpatch.pl can check it.

If we like to impose coding style in an editor-independent manner, I think
editorconfig (patch [1]) is a saner solution.

[1] https://lore.kernel.org/lkml/20200703073143.423557-1-danny@kdrag0n.dev/

Link: https://lkml.kernel.org/r/20210324054457.1477489-1-masahiroy@kernel.org
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>	[auxdisplay]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-07 00:26:34 -07:00
dingsenjie
a109ae2a02 fs: fat: fix spelling typo of values
vaules -> values

Link: https://lkml.kernel.org/r/20210302034817.30384-1-dingsenjie@163.com
Signed-off-by: dingsenjie <dingsenjie@yulong.com>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-07 00:26:34 -07:00
Linus Torvalds
05da1f643f Merge tag 'iomap-5.13-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull more iomap updates from Darrick Wong:
 "Remove the now unused 'io_private' field from struct iomap_ioend, for
  a modest savings in memory allocation"

* tag 'iomap-5.13-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  iomap: remove unused private field from ioend
2021-05-06 23:54:12 -07:00
Linus Torvalds
af120709b1 Merge tag 'xfs-5.13-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull more xfs updates from Darrick Wong:
 "Except for the timestamp struct renaming patches, everything else in
  here are bug fixes:

   - Rename the log timestamp struct.

   - Remove broken transaction counter debugging that wasn't working
     correctly on very old filesystems.

   - Various fixes to make pre-lazysbcount filesystems work properly
     again.

   - Fix a free space accounting problem where we neglected to consider
     free space btree blocks that track metadata reservation space when
     deciding whether or not to allow caller to reserve space for a
     metadata update.

   - Fix incorrect pagecache clearing behavior during FUNSHARE ops.

   - Don't allow log writes if the data device is readonly"

* tag 'xfs-5.13-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: don't allow log writes if the data device is readonly
  xfs: fix xfs_reflink_unshare usage of filemap_write_and_wait_range
  xfs: set aside allocation btree blocks from block reservation
  xfs: introduce in-core global counter of allocbt blocks
  xfs: unconditionally read all AGFs on mounts with perag reservation
  xfs: count free space btree blocks when scrubbing pre-lazysbcount fses
  xfs: update superblock counters correctly for !lazysbcount
  xfs: don't check agf_btreeblks on pre-lazysbcount filesystems
  xfs: remove obsolete AGF counter debugging
  xfs: rename struct xfs_legacy_ictimestamp
  xfs: rename xfs_ictimestamp_t
2021-05-06 23:46:46 -07:00
Gustavo A. R. Silva
c1e4726f46 hpfs: replace one-element array with flexible-array member
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Also, this helps with the ongoing efforts to enable -Warray-bounds by
fixing the following warning:

  CC [M]  fs/hpfs/dir.o
fs/hpfs/dir.c: In function `hpfs_readdir':
fs/hpfs/dir.c:163:41: warning: array subscript 1 is above array bounds of `u8[1]' {aka `unsigned char[1]'} [-Warray-bounds]
  163 |         || de ->name[0] != 1 || de->name[1] != 1))
      |                                 ~~~~~~~~^~~

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.10/process/deprecated.html#zero-length-and-one-element-arrays

Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/109
Link: https://lkml.kernel.org/r/20210326173510.GA81212@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06 19:24:13 -07:00
Lu Jialin
312f79c486 nilfs2: fix typos in comments
numer -> number in fs/nilfs2/cpfile.c
Decription -> Description in fs/nilfs2/ioctl.c
isntance -> instance in fs/nilfs2/the_nilfs.c

Link: https://lkml.kernel.org/r/1617942951-14631-1-git-send-email-konishi.ryusuke@gmail.com
Link: https://lore.kernel.org/r/20210409022519.176988-1-lujialin4@huawei.com
Signed-off-by: Lu Jialin <lujialin4@huawei.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06 19:24:13 -07:00
Liu xuzhi
300563e6e0 fs/nilfs2: fix misspellings using codespell tool
Two typos are found out by codespell tool \
in 2217th and 2254th lines of segment.c:

$ codespell ./fs/nilfs2/
./segment.c:2217 :retured  ==> returned
./segment.c:2254: retured  ==> returned

Fix two typos found by codespell.

Link: https://lkml.kernel.org/r/1617864087-8198-1-git-send-email-konishi.ryusuke@gmail.com
Signed-off-by: Liu xuzhi <liu.xuzhi@zte.com.cn>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06 19:24:13 -07:00
Gustavo A. R. Silva
b4ca4c0178 isofs: fix fall-through warnings for Clang
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning
by explicitly adding a break statement instead of just letting the code
fall through to the next case.

Link: https://github.com/KSPP/linux/issues/115
Link: https://lkml.kernel.org/r/5b7caa73958588065fabc59032c340179b409ef5.1605896059.git.gustavoars@kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06 19:24:13 -07:00
Davidlohr Bueso
7fab29e356 fs/epoll: restore waking from ep_done_scan()
Commit 339ddb53d3 ("fs/epoll: remove unnecessary wakeups of nested
epoll") changed the userspace visible behavior of exclusive waiters
blocked on a common epoll descriptor upon a single event becoming ready.

Previously, all tasks doing epoll_wait would awake, and now only one is
awoken, potentially causing missed wakeups on applications that rely on
this behavior, such as Apache Qpid.

While the aforementioned commit aims at having only a wakeup single path
in ep_poll_callback (with the exceptions of epoll_ctl cases), we need to
restore the wakeup in what was the old ep_scan_ready_list() such that
the next thread can be awoken, in a cascading style, after the waker's
corresponding ep_send_events().

Link: https://lkml.kernel.org/r/20210405231025.33829-3-dave@stgolabs.net
Fixes: 339ddb53d3 ("fs/epoll: remove unnecessary wakeups of nested epoll")
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Roman Penyaev <rpenyaev@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06 19:24:13 -07:00
zhouchuangao
5b31a7dfa3 proc/sysctl: fix function name error in comments
The function name should be modified to register_sysctl_paths instead of
register_sysctl_table_path.

Link: https://lkml.kernel.org/r/1615807194-79646-1-git-send-email-zhouchuangao@vivo.com
Signed-off-by: zhouchuangao <zhouchuangao@vivo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06 19:24:11 -07:00
Alexey Dobriyan
1dcdd7ef96 proc: delete redundant subset=pid check
Two checks in lookup and readdir code should be enough to not have third
check in open code.

Can't open what can't be looked up?

Link: https://lkml.kernel.org/r/YFYYwIBIkytqnkxP@localhost.localdomain
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Alexey Gladkov <gladkov.alexey@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06 19:24:11 -07:00
Alexey Dobriyan
d4455faccd proc: mandate ->proc_lseek in "struct proc_ops"
Now that proc_ops are separate from file_operations and other operations
it easy to check all instances to have ->proc_lseek hook and remove check
in main code.

Note:
nonseekable_open() files naturally don't require ->proc_lseek.

Garbage collect pde_lseek() function.

[adobriyan@gmail.com: smoke test lseek()]
  Link: https://lkml.kernel.org/r/YG4OIhChOrVTPgdN@localhost.localdomain

Link: https://lkml.kernel.org/r/YFYX0Bzwxlc7aBa/@localhost.localdomain
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06 19:24:11 -07:00
Alexey Dobriyan
b793cd9ab3 proc: save LOC in __xlate_proc_name()
Can't look at this verbosity anymore.

Link: https://lkml.kernel.org/r/YFYXAp/fgq405qcy@localhost.localdomain
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06 19:24:11 -07:00
Colin Ian King
f4bf74d829 fs/proc/generic.c: fix incorrect pde_is_permanent check
Currently the pde_is_permanent() check is being run on root multiple times
rather than on the next proc directory entry.  This looks like a
copy-paste error.  Fix this by replacing root with next.

Addresses-Coverity: ("Copy-paste error")
Link: https://lkml.kernel.org/r/20210318122633.14222-1-colin.king@canonical.com
Fixes: d919b33daf ("proc: faster open/read/close with "permanent" files")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06 19:24:11 -07:00
Linus Torvalds
7ac86b3dca Merge tag 'ceph-for-5.13-rc1' of git://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
 "Notable items here are

   - a series to take advantage of David Howells' netfs helper library
     from Jeff

   - three new filesystem client metrics from Xiubo

   - ceph.dir.rsnaps vxattr from Yanhu

   - two auth-related fixes from myself, marked for stable.

  Interspersed is a smattering of assorted fixes and cleanups across the
  filesystem"

* tag 'ceph-for-5.13-rc1' of git://github.com/ceph/ceph-client: (24 commits)
  libceph: allow addrvecs with a single NONE/blank address
  libceph: don't set global_id until we get an auth ticket
  libceph: bump CephXAuthenticate encoding version
  ceph: don't allow access to MDS-private inodes
  ceph: fix up some bare fetches of i_size
  ceph: convert some PAGE_SIZE invocations to thp_size()
  ceph: support getting ceph.dir.rsnaps vxattr
  ceph: drop pinned_page parameter from ceph_get_caps
  ceph: fix inode leak on getattr error in __fh_to_dentry
  ceph: only check pool permissions for regular files
  ceph: send opened files/pinned caps/opened inodes metrics to MDS daemon
  ceph: avoid counting the same request twice or more
  ceph: rename the metric helpers
  ceph: fix kerneldoc copypasta over ceph_start_io_direct
  ceph: use attach/detach_page_private for tracking snap context
  ceph: don't use d_add in ceph_handle_snapdir
  ceph: don't clobber i_snap_caps on non-I_NEW inode
  ceph: fix fall-through warnings for Clang
  ceph: convert ceph_readpages to ceph_readahead
  ceph: convert ceph_write_begin to netfs_write_begin
  ...
2021-05-06 10:27:02 -07:00
Linus Torvalds
682a8e2b41 Merge tag 'ecryptfs-5.13-rc1-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs
Pull ecryptfs updates from Tyler Hicks:
 "Code cleanups and a bug fix

   - W=1 compiler warning cleanups

   - Mutex initialization simplification

   - Protect against NULL pointer exception during mount"

* tag 'ecryptfs-5.13-rc1-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
  ecryptfs: fix kernel panic with null dev_name
  ecryptfs: remove unused helpers
  ecryptfs: Fix typo in message
  eCryptfs: Use DEFINE_MUTEX() for mutex lock
  ecryptfs: keystore: Fix some kernel-doc issues and demote non-conformant headers
  ecryptfs: inode: Help out nearly-there header and demote non-conformant ones
  ecryptfs: mmap: Help out one function header and demote other abuses
  ecryptfs: crypto: Supply some missing param descriptions and demote abuses
  ecryptfs: miscdev: File headers are not good kernel-doc candidates
  ecryptfs: main: Demote a bunch of non-conformant kernel-doc headers
  ecryptfs: messaging: Add missing param descriptions and demote abuses
  ecryptfs: super: Fix formatting, naming and kernel-doc abuses
  ecryptfs: file: Demote kernel-doc abuses
  ecryptfs: kthread: Demote file header and provide description for 'cred'
  ecryptfs: dentry: File headers are not good candidates for kernel-doc
  ecryptfs: debug: Demote a couple of kernel-doc abuses
  ecryptfs: read_write: File headers do not make good candidates for kernel-doc
  ecryptfs: use DEFINE_MUTEX() for mutex lock
  eCryptfs: add a semicolon
2021-05-06 10:06:39 -07:00
yangerkun
cf7b39a0cb block: reexpand iov_iter after read/write
We get a bug:

BUG: KASAN: slab-out-of-bounds in iov_iter_revert+0x11c/0x404
lib/iov_iter.c:1139
Read of size 8 at addr ffff0000d3fb11f8 by task

CPU: 0 PID: 12582 Comm: syz-executor.2 Not tainted
5.10.0-00843-g352c8610ccd2 #2
Hardware name: linux,dummy-virt (DT)
Call trace:
 dump_backtrace+0x0/0x2d0 arch/arm64/kernel/stacktrace.c:132
 show_stack+0x28/0x34 arch/arm64/kernel/stacktrace.c:196
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x110/0x164 lib/dump_stack.c:118
 print_address_description+0x78/0x5c8 mm/kasan/report.c:385
 __kasan_report mm/kasan/report.c:545 [inline]
 kasan_report+0x148/0x1e4 mm/kasan/report.c:562
 check_memory_region_inline mm/kasan/generic.c:183 [inline]
 __asan_load8+0xb4/0xbc mm/kasan/generic.c:252
 iov_iter_revert+0x11c/0x404 lib/iov_iter.c:1139
 io_read fs/io_uring.c:3421 [inline]
 io_issue_sqe+0x2344/0x2d64 fs/io_uring.c:5943
 __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260
 io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326
 io_submit_sqe fs/io_uring.c:6395 [inline]
 io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624
 __do_sys_io_uring_enter fs/io_uring.c:9013 [inline]
 __se_sys_io_uring_enter fs/io_uring.c:8960 [inline]
 __arm64_sys_io_uring_enter+0x190/0x708 fs/io_uring.c:8960
 __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
 invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
 el0_svc_common arch/arm64/kernel/syscall.c:158 [inline]
 do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:227
 el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367
 el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383
 el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670

Allocated by task 12570:
 stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121
 kasan_save_stack mm/kasan/common.c:48 [inline]
 kasan_set_track mm/kasan/common.c:56 [inline]
 __kasan_kmalloc+0xdc/0x120 mm/kasan/common.c:461
 kasan_kmalloc+0xc/0x14 mm/kasan/common.c:475
 __kmalloc+0x23c/0x334 mm/slub.c:3970
 kmalloc include/linux/slab.h:557 [inline]
 __io_alloc_async_data+0x68/0x9c fs/io_uring.c:3210
 io_setup_async_rw fs/io_uring.c:3229 [inline]
 io_read fs/io_uring.c:3436 [inline]
 io_issue_sqe+0x2954/0x2d64 fs/io_uring.c:5943
 __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260
 io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326
 io_submit_sqe fs/io_uring.c:6395 [inline]
 io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624
 __do_sys_io_uring_enter fs/io_uring.c:9013 [inline]
 __se_sys_io_uring_enter fs/io_uring.c:8960 [inline]
 __arm64_sys_io_uring_enter+0x190/0x708 fs/io_uring.c:8960
 __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
 invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
 el0_svc_common arch/arm64/kernel/syscall.c:158 [inline]
 do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:227
 el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367
 el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383
 el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670

Freed by task 12570:
 stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121
 kasan_save_stack mm/kasan/common.c:48 [inline]
 kasan_set_track+0x38/0x6c mm/kasan/common.c:56
 kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:355
 __kasan_slab_free+0x124/0x150 mm/kasan/common.c:422
 kasan_slab_free+0x10/0x1c mm/kasan/common.c:431
 slab_free_hook mm/slub.c:1544 [inline]
 slab_free_freelist_hook mm/slub.c:1577 [inline]
 slab_free mm/slub.c:3142 [inline]
 kfree+0x104/0x38c mm/slub.c:4124
 io_dismantle_req fs/io_uring.c:1855 [inline]
 __io_free_req+0x70/0x254 fs/io_uring.c:1867
 io_put_req_find_next fs/io_uring.c:2173 [inline]
 __io_queue_sqe+0x1fc/0x520 fs/io_uring.c:6279
 __io_req_task_submit+0x154/0x21c fs/io_uring.c:2051
 io_req_task_submit+0x2c/0x44 fs/io_uring.c:2063
 task_work_run+0xdc/0x128 kernel/task_work.c:151
 get_signal+0x6f8/0x980 kernel/signal.c:2562
 do_signal+0x108/0x3a4 arch/arm64/kernel/signal.c:658
 do_notify_resume+0xbc/0x25c arch/arm64/kernel/signal.c:722
 work_pending+0xc/0x180

blkdev_read_iter can truncate iov_iter's count since the count + pos may
exceed the size of the blkdev. This will confuse io_read that we have
consume the iovec. And once we do the iov_iter_revert in io_read, we
will trigger the slab-out-of-bounds. Fix it by reexpand the count with
size has been truncated.

blkdev_write_iter can trigger the problem too.

Signed-off-by: yangerkun <yangerkun@huawei.com>
Acked-by: Pavel Begunkov <asml.silencec@gmail.com>
Link: https://lore.kernel.org/r/20210401071807.3328235-1-yangerkun@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-06 09:24:03 -06:00
Thadeu Lima de Souza Cascardo
d1f8280887 io_uring: truncate lengths larger than MAX_RW_COUNT on provide buffers
Read and write operations are capped to MAX_RW_COUNT. Some read ops rely on
that limit, and that is not guaranteed by the IORING_OP_PROVIDE_BUFFERS.

Truncate those lengths when doing io_add_buffers, so buffer addresses still
use the uncapped length.

Also, take the chance and change struct io_buffer len member to __u32, so
it matches struct io_provide_buffer len member.

This fixes CVE-2021-3491, also reported as ZDI-CAN-13546.

Fixes: ddf0322db7 ("io_uring: add IORING_OP_PROVIDE_BUFFERS")
Reported-by: Billy Jheng Bing-Jhong (@st424204)
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-05 15:17:35 -06:00
Linus Torvalds
8404c9fbc8 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "The remainder of the main mm/ queue.

  143 patches.

  Subsystems affected by this patch series (all mm): pagecache, hugetlb,
  userfaultfd, vmscan, compaction, migration, cma, ksm, vmstat, mmap,
  kconfig, util, memory-hotplug, zswap, zsmalloc, highmem, cleanups, and
  kfence"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (143 commits)
  kfence: use power-efficient work queue to run delayed work
  kfence: maximize allocation wait timeout duration
  kfence: await for allocation using wait_event
  kfence: zero guard page after out-of-bounds access
  mm/process_vm_access.c: remove duplicate include
  mm/mempool: minor coding style tweaks
  mm/highmem.c: fix coding style issue
  btrfs: use memzero_page() instead of open coded kmap pattern
  iov_iter: lift memzero_page() to highmem.h
  mm/zsmalloc: use BUG_ON instead of if condition followed by BUG.
  mm/zswap.c: switch from strlcpy to strscpy
  arm64/Kconfig: introduce ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE
  x86/Kconfig: introduce ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE
  mm,memory_hotplug: add kernel boot option to enable memmap_on_memory
  acpi,memhotplug: enable MHP_MEMMAP_ON_MEMORY when supported
  mm,memory_hotplug: allocate memmap from the added memory range
  mm,memory_hotplug: factor out adjusting present pages into adjust_present_page_count()
  mm,memory_hotplug: relax fully spanned sections check
  drivers/base/memory: introduce memory_block_{online,offline}
  mm/memory_hotplug: remove broken locking of zone PCP structures during hot remove
  ...
2021-05-05 13:50:15 -07:00
Linus Torvalds
a79cdfba68 Merge tag 'nfsd-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull more nfsd updates from Chuck Lever:
 "Additional fixes and clean-ups for NFSD since tags/nfsd-5.13,
  including a fix to grant read delegations for files open for writing"

* tag 'nfsd-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  SUNRPC: Fix null pointer dereference in svc_rqst_free()
  SUNRPC: fix ternary sign expansion bug in tracing
  nfsd: Fix fall-through warnings for Clang
  nfsd: grant read delegations to clients holding writes
  nfsd: reshuffle some code
  nfsd: track filehandle aliasing in nfs4_files
  nfsd: hash nfs4_files by inode number
  nfsd: ensure new clients break delegations
  nfsd: removed unused argument in nfsd_startup_generic()
  nfsd: remove unused function
  svcrdma: Pass a useful error code to the send_err tracepoint
  svcrdma: Rename goto labels in svc_rdma_sendto()
  svcrdma: Don't leak send_ctxt on Send errors
2021-05-05 13:44:19 -07:00
Linus Torvalds
7c9e41e0ef Merge tag '5.13-rc-smb3-part2' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs updates from Steve French:
 "Ten CIFS/SMB3 changes - including two marked for stable - including
  some important multichannel fixes, as well as support for handle
  leases (deferred close) and shutdown support:

   - some important multichannel fixes

   - support for handle leases (deferred close)

   - shutdown support (which is also helpful since it enables multiple
     xfstests)

   - enable negotiating stronger encryption by default (GCM256)

   - improve wireshark debugging by allowing more options for root to
     dump decryption keys

  SambaXP and the SMB3 Plugfest test event are going on now so I am
  expecting more patches over the next few days due to extra testing
  (including more multichannel fixes)"

* tag '5.13-rc-smb3-part2' of git://git.samba.org/sfrench/cifs-2.6:
  fs/cifs: Fix resource leak
  Cifs: Fix kernel oops caused by deferred close for files.
  cifs: fix regression when mounting shares with prefix paths
  cifs: use echo_interval even when connection not ready.
  cifs: detect dead connections only when echoes are enabled.
  smb3.1.1: allow dumping keys for multiuser mounts
  smb3.1.1: allow dumping GCM256 keys to improve debugging of encrypted shares
  cifs: add shutdown support
  cifs: Deferred close for files
  smb3.1.1: enable negotiating stronger encryption by default
2021-05-05 13:37:07 -07:00
Ira Weiny
d048b9c2a7 btrfs: use memzero_page() instead of open coded kmap pattern
There are many places where kmap/memset/kunmap patterns occur.

Use the newly lifted memzero_page() to eliminate direct uses of kmap and
leverage the new core functions use of kmap_local_page().

The development of this patch was aided by the following coccinelle
script:

// <smpl>
// SPDX-License-Identifier: GPL-2.0-only
// Find kmap/memset/kunmap pattern and replace with memset*page calls
//
// NOTE: Offsets and other expressions may be more complex than what the script
// will automatically generate.  Therefore a catchall rule is provided to find
// the pattern which then must be evaluated by hand.
//
// Confidence: Low
// Copyright: (C) 2021 Intel Corporation
// URL: http://coccinelle.lip6.fr/
// Comments:
// Options:

//
// Then the memset pattern
//
@ memset_rule1 @
expression page, V, L, Off;
identifier ptr;
type VP;
@@

(
-VP ptr = kmap(page);
|
-ptr = kmap(page);
|
-VP ptr = kmap_atomic(page);
|
-ptr = kmap_atomic(page);
)
<+...
(
-memset(ptr, 0, L);
+memzero_page(page, 0, L);
|
-memset(ptr + Off, 0, L);
+memzero_page(page, Off, L);
|
-memset(ptr, V, L);
+memset_page(page, V, 0, L);
|
-memset(ptr + Off, V, L);
+memset_page(page, V, Off, L);
)
...+>
(
-kunmap(page);
|
-kunmap_atomic(ptr);
)

// Remove any pointers left unused
@
depends on memset_rule1
@
identifier memset_rule1.ptr;
type VP, VP1;
@@

-VP ptr;
	... when != ptr;
? VP1 ptr;

//
// Catch all
//
@ memset_rule2 @
expression page;
identifier ptr;
expression GenTo, GenSize, GenValue;
type VP;
@@

(
-VP ptr = kmap(page);
|
-ptr = kmap(page);
|
-VP ptr = kmap_atomic(page);
|
-ptr = kmap_atomic(page);
)
<+...
(
//
// Some call sites have complex expressions within the memset/memcpy
// The follow are catch alls which need to be evaluated by hand.
//
-memset(GenTo, 0, GenSize);
+memzero_pageExtra(page, GenTo, GenSize);
|
-memset(GenTo, GenValue, GenSize);
+memset_pageExtra(page, GenValue, GenTo, GenSize);
)
...+>
(
-kunmap(page);
|
-kunmap_atomic(ptr);
)

// Remove any pointers left unused
@
depends on memset_rule2
@
identifier memset_rule2.ptr;
type VP, VP1;
@@

-VP ptr;
	... when != ptr;
? VP1 ptr;

// </smpl>

Link: https://lkml.kernel.org/r/20210309212137.2610186-4-ira.weiny@intel.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-05 11:27:27 -07:00