Commit Graph

82758 Commits

Author SHA1 Message Date
Chao Yu
091a4dfbb1 f2fs: compress: fix to assign compress_level for lz4 correctly
After remount, F2FS_OPTION().compress_level was incorrectly assigned
LZ4HC_DEFAULT_CLEVEL, which resulted in lz4hc:9 being enabled; fix it.

1. mount /dev/vdb
/dev/vdb on /mnt/f2fs type f2fs (...,compress_algorithm=lz4,compress_log_size=2,...)
2. mount -t f2fs -o remount,compress_log_size=3 /mnt/f2fs/
3. mount|grep f2fs
/dev/vdb on /mnt/f2fs type f2fs (...,compress_algorithm=lz4:9,compress_log_size=3,...)

Fixes: 00e120b5e4 ("f2fs: assign default compression level")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-23 10:24:40 -07:00
Chao Yu
5118697f72 f2fs: fix error path of f2fs_submit_page_read()
In the error path of f2fs_submit_page_read(), iostat_update_and_unbind_ctx()
was not called and bio_post_read_ctx was not freed; fix it.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-23 10:24:40 -07:00
Chao Yu
c988794984 f2fs: clean up error handling in sanity_check_{compress_,}inode()
In sanity_check_{compress_,}inode(), it doesn't need to set SBI_NEED_FSCK
in each error case, instead, we can set the flag in do_read_inode() only
once when sanity_check_inode() fails.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-23 10:24:40 -07:00
Jaegeuk Kim
5c13e2388b f2fs: avoid false alarm of circular locking
======================================================
WARNING: possible circular locking dependency detected
6.5.0-rc5-syzkaller-00353-gae545c3283dc #0 Not tainted
------------------------------------------------------
syz-executor273/5027 is trying to acquire lock:
ffff888077fe1fb0 (&fi->i_sem){+.+.}-{3:3}, at: f2fs_down_write fs/f2fs/f2fs.h:2133 [inline]
ffff888077fe1fb0 (&fi->i_sem){+.+.}-{3:3}, at: f2fs_add_inline_entry+0x300/0x6f0 fs/f2fs/inline.c:644

but task is already holding lock:
ffff888077fe07c8 (&fi->i_xattr_sem){.+.+}-{3:3}, at: f2fs_down_read fs/f2fs/f2fs.h:2108 [inline]
ffff888077fe07c8 (&fi->i_xattr_sem){.+.+}-{3:3}, at: f2fs_add_dentry+0x92/0x230 fs/f2fs/dir.c:783

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&fi->i_xattr_sem){.+.+}-{3:3}:
       down_read+0x9c/0x470 kernel/locking/rwsem.c:1520
       f2fs_down_read fs/f2fs/f2fs.h:2108 [inline]
       f2fs_getxattr+0xb1e/0x12c0 fs/f2fs/xattr.c:532
       __f2fs_get_acl+0x5a/0x900 fs/f2fs/acl.c:179
       f2fs_acl_create fs/f2fs/acl.c:377 [inline]
       f2fs_init_acl+0x15c/0xb30 fs/f2fs/acl.c:420
       f2fs_init_inode_metadata+0x159/0x1290 fs/f2fs/dir.c:558
       f2fs_add_regular_entry+0x79e/0xb90 fs/f2fs/dir.c:740
       f2fs_add_dentry+0x1de/0x230 fs/f2fs/dir.c:788
       f2fs_do_add_link+0x190/0x280 fs/f2fs/dir.c:827
       f2fs_add_link fs/f2fs/f2fs.h:3554 [inline]
       f2fs_mkdir+0x377/0x620 fs/f2fs/namei.c:781
       vfs_mkdir+0x532/0x7e0 fs/namei.c:4117
       do_mkdirat+0x2a9/0x330 fs/namei.c:4140
       __do_sys_mkdir fs/namei.c:4160 [inline]
       __se_sys_mkdir fs/namei.c:4158 [inline]
       __x64_sys_mkdir+0xf2/0x140 fs/namei.c:4158
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #0 (&fi->i_sem){+.+.}-{3:3}:
       check_prev_add kernel/locking/lockdep.c:3142 [inline]
       check_prevs_add kernel/locking/lockdep.c:3261 [inline]
       validate_chain kernel/locking/lockdep.c:3876 [inline]
       __lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5144
       lock_acquire kernel/locking/lockdep.c:5761 [inline]
       lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5726
       down_write+0x93/0x200 kernel/locking/rwsem.c:1573
       f2fs_down_write fs/f2fs/f2fs.h:2133 [inline]
       f2fs_add_inline_entry+0x300/0x6f0 fs/f2fs/inline.c:644
       f2fs_add_dentry+0xa6/0x230 fs/f2fs/dir.c:784
       f2fs_do_add_link+0x190/0x280 fs/f2fs/dir.c:827
       f2fs_add_link fs/f2fs/f2fs.h:3554 [inline]
       f2fs_mkdir+0x377/0x620 fs/f2fs/namei.c:781
       vfs_mkdir+0x532/0x7e0 fs/namei.c:4117
       ovl_do_mkdir fs/overlayfs/overlayfs.h:196 [inline]
       ovl_mkdir_real+0xb5/0x370 fs/overlayfs/dir.c:146
       ovl_workdir_create+0x3de/0x820 fs/overlayfs/super.c:309
       ovl_make_workdir fs/overlayfs/super.c:711 [inline]
       ovl_get_workdir fs/overlayfs/super.c:864 [inline]
       ovl_fill_super+0xdab/0x6180 fs/overlayfs/super.c:1400
       vfs_get_super+0xf9/0x290 fs/super.c:1152
       vfs_get_tree+0x88/0x350 fs/super.c:1519
       do_new_mount fs/namespace.c:3335 [inline]
       path_mount+0x1492/0x1ed0 fs/namespace.c:3662
       do_mount fs/namespace.c:3675 [inline]
       __do_sys_mount fs/namespace.c:3884 [inline]
       __se_sys_mount fs/namespace.c:3861 [inline]
       __x64_sys_mount+0x293/0x310 fs/namespace.c:3861
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  rlock(&fi->i_xattr_sem);
                               lock(&fi->i_sem);
                               lock(&fi->i_xattr_sem);
  lock(&fi->i_sem);

Cc: <stable@vger.kernel.org>
Reported-and-tested-by: syzbot+e5600587fa9cbf8e3826@syzkaller.appspotmail.com
Fixes: 5eda1ad1aa ("f2fs: fix deadlock in i_xattr_sem and inode page lock")
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-21 12:43:26 -07:00
Chao Yu
005abf9e5e Revert "f2fs: do not issue small discard commands during checkpoint"
Previously, we had two mechanisms to cache & submit small discards:

a) set max small discard number in /sys/fs/f2fs/vdb/max_small_discards,
and checkpoint will cache small discard candidates w/ configured maximum
number.

b) call the FITRIM ioctl; checkpoint in f2fs_trim_fs() will also cache small
discard candidates w/ configured discard granularity, but w/o limitation
of number. The FITRIM interface is asynchronous, so it won't submit discards
directly.

Finally, discard thread will submit them in background periodically.

However, after commit 9ac00e7cef ("f2fs: do not issue small discard
commands during checkpoint"), the mechanism a) is broken, since no matter
how we configure the sysfs entry /sys/fs/f2fs/vdb/max_small_discards,
checkpoint will not cache small discard candidates any more.

echo 0 > /sys/fs/f2fs/vdb/max_small_discards
xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync"
xfs_io /mnt/f2fs/file -c "fpunch 0 4k"
sync
cat /proc/fs/f2fs/vdb/discard_plist_info |head -2

echo 100 > /sys/fs/f2fs/vdb/max_small_discards
rm /mnt/f2fs/file
xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync"
xfs_io /mnt/f2fs/file -c "fpunch 0 4k"
sync
cat /proc/fs/f2fs/vdb/discard_plist_info |head -2

Before the patch:
Discard pend list(Show diacrd_cmd count on each entry, .:not exist):
  0         .       .       .       .       .       .       .       .
Discard pend list(Show diacrd_cmd count on each entry, .:not exist):
  0         3       1       .       .       .       .       .       .

After the patch:
Discard pend list(Show diacrd_cmd count on each entry, .:not exist):
  0         .       .       .       .       .       .       .       .
Discard pend list(Show diacrd_cmd count on each entry, .:not exist):
  0         .       .       .       .       .       .       .       .

This patch reverts commit 9ac00e7cef ("f2fs: do not issue small discard
commands during checkpoint") in order to fix this issue.

Fixes: 9ac00e7cef ("f2fs: do not issue small discard commands during checkpoint")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-18 14:28:34 -07:00
Zhiguo Niu
0cc81b1ad5 f2fs: should update REQ_TIME for direct write
The sending interval of discard and GC should also
consider direct write requests; filesystem is not
idle if there is direct write.

Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:42:28 -07:00
Chao Yu
eb61c2cca2 f2fs: fix to account cp stats correctly
cp_foreground_calls sysfs entry shows total CP call count rather than
foreground CP call count, fix it.

Fixes: fc7100ea2a ("f2fs: Add f2fs stats to sysfs")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:42:05 -07:00
Chao Yu
9bf1dcbdfd f2fs: fix to account gc stats correctly
As reported, status debugfs entry shows inconsistent GC stats as below:

GC calls: 6008 (BG: 6161)
  - data segments : 3053 (BG: 3053)
  - node segments : 2955 (BG: 2955)

Total GC calls is smaller than BGGC calls; the reason is:
- f2fs_stat_info.call_count accounts total migrated section count
by f2fs_gc()
- f2fs_stat_info.bg_gc accounts total call times of f2fs_gc() from
background gc_thread

Another issue is gc_foreground_calls sysfs entry shows total GC call
count rather than FGGC call count.

This patch makes the following changes to fix this:
- account GC calls and migrated segment count separately
- support accounting the migrated section count when large section mode
is enabled
- fix to show correct value in gc_foreground_calls sysfs entry

Fixes: fc7100ea2a ("f2fs: Add f2fs stats to sysfs")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:41:57 -07:00
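
The accounting split described in the commit above can be illustrated with a small model. This is a minimal Python sketch, not the kernel implementation: the `GcStats` class, its field names, and its methods are hypothetical. It only shows why counting calls and migrated segments separately keeps the total call count at least as large as the background call count.

```python
from dataclasses import dataclass


@dataclass
class GcStats:
    """Hypothetical model: GC calls and migrated segments are counted separately."""
    fg_calls: int = 0      # foreground GC invocations
    bg_calls: int = 0      # background GC invocations
    fg_segments: int = 0   # segments migrated by foreground GC
    bg_segments: int = 0   # segments migrated by background GC

    def record_call(self, background: bool) -> None:
        """Count one GC invocation, regardless of how much it migrates."""
        if background:
            self.bg_calls += 1
        else:
            self.fg_calls += 1

    def record_migrated(self, background: bool, segments: int) -> None:
        """Count migrated segments without touching the call counters."""
        if background:
            self.bg_segments += segments
        else:
            self.fg_segments += segments

    @property
    def total_calls(self) -> int:
        """Total calls is the sum of both kinds, so it can never be below bg_calls."""
        return self.fg_calls + self.bg_calls


stats = GcStats()
stats.record_call(background=True)
stats.record_migrated(background=True, segments=3)
stats.record_call(background=True)          # a call that migrates nothing
stats.record_call(background=False)
stats.record_migrated(background=False, segments=2)
```

In this model a foreground-only counter (`fg_calls`) is what a `gc_foreground_calls`-style entry would report, rather than `total_calls`.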
Chao Yu
bc3994ffa4 f2fs: remove unneeded check condition in __f2fs_setxattr()
The return value of write_all_xattrs() has already been checked; remove the
unneeded check condition that follows.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:41:10 -07:00
Chao Yu
8874ad7dae f2fs: fix to update i_ctime in __f2fs_setxattr()
generic/728       - output mismatch (see /media/fstests/results//generic/728.out.bad)
    --- tests/generic/728.out	2023-07-19 07:10:48.362711407 +0000
    +++ /media/fstests/results//generic/728.out.bad	2023-07-19 08:39:57.000000000 +0000
     QA output created by 728
    +Expected ctime to change after setxattr.
    +Expected ctime to change after removexattr.
     Silence is golden
    ...
    (Run 'diff -u /media/fstests/tests/generic/728.out /media/fstests/results//generic/728.out.bad'  to see the entire diff)
generic/729        1s

It needs to update i_ctime after {set,remove}xattr, fix it.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:41:09 -07:00
Chao Yu
958ccbbf1c Revert "f2fs: fix to do sanity check on extent cache correctly"
syzbot reports an f2fs bug as below:

UBSAN: array-index-out-of-bounds in fs/f2fs/f2fs.h:3275:19
index 1409 is out of range for type '__le32[923]' (aka 'unsigned int[923]')
Call Trace:
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106
 ubsan_epilogue lib/ubsan.c:217 [inline]
 __ubsan_handle_out_of_bounds+0x11c/0x150 lib/ubsan.c:348
 inline_data_addr fs/f2fs/f2fs.h:3275 [inline]
 __recover_inline_status fs/f2fs/inode.c:113 [inline]
 do_read_inode fs/f2fs/inode.c:480 [inline]
 f2fs_iget+0x4730/0x48b0 fs/f2fs/inode.c:604
 f2fs_fill_super+0x640e/0x80c0 fs/f2fs/super.c:4601
 mount_bdev+0x276/0x3b0 fs/super.c:1391
 legacy_get_tree+0xef/0x190 fs/fs_context.c:611
 vfs_get_tree+0x8c/0x270 fs/super.c:1519
 do_new_mount+0x28f/0xae0 fs/namespace.c:3335
 do_mount fs/namespace.c:3675 [inline]
 __do_sys_mount fs/namespace.c:3884 [inline]
 __se_sys_mount+0x2d9/0x3c0 fs/namespace.c:3861
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The issue was bisected to:

commit d48a7b3a72
Author: Chao Yu <chao@kernel.org>
Date:   Mon Jan 9 03:49:20 2023 +0000

    f2fs: fix to do sanity check on extent cache correctly

The root cause is that both v1 and v2 of the patch were applied. v2 is the
right fix, so v1 needs to be reverted in order to fix the reported issue.

v1:
commit d48a7b3a72 ("f2fs: fix to do sanity check on extent cache correctly")
https://lore.kernel.org/lkml/20230109034920.492914-1-chao@kernel.org/

v2:
commit 269d119481 ("f2fs: fix to do sanity check on extent cache correctly")
https://lore.kernel.org/lkml/20230207134808.1827869-1-chao@kernel.org/

Reported-by: syzbot+601018296973a481f302@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000fcf0690600e4d04d@google.com/
Fixes: d48a7b3a72 ("f2fs: fix to do sanity check on extent cache correctly")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:41:09 -07:00
Minjie Du
a842a90926 f2fs: increase usage of folio_next_index() helper
Simplify code pattern of 'folio->index + folio_nr_pages(folio)' by using
the existing helper folio_next_index().

Signed-off-by: Minjie Du <duminjie@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:41:09 -07:00
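
The pattern replaced by the commit above can be modeled outside the kernel. The following is a minimal Python sketch, not kernel code: the `Folio` class and the `walk` function are hypothetical stand-ins, and only the arithmetic `index + nr_pages` is taken from the commit message.

```python
from dataclasses import dataclass


@dataclass
class Folio:
    """Hypothetical stand-in for the kernel's struct folio."""
    index: int      # page-cache index of the folio's first page
    nr_pages: int   # number of pages the folio spans


def folio_next_index(folio: Folio) -> int:
    """Return the index of the first page after this folio."""
    return folio.index + folio.nr_pages


def walk(folios):
    """Yield (start, next) index pairs using the helper instead of open-coding the sum."""
    for folio in folios:
        yield folio.index, folio_next_index(folio)
```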
Chunhai Guo
2bd4df8fcb f2fs: Only lfs mode is allowed with zoned block device feature
f2fs now supports four block allocation modes: lfs, adaptive,
fragment:segment, fragment:block. Only lfs mode is allowed with the zoned
block device feature.

Fixes: 6691d940b0 ("f2fs: introduce fragment allocation mode mount option")
Signed-off-by: Chunhai Guo <guochunhai@vivo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:41:09 -07:00
Shin'ichiro Kawasaki
3cb88bc159 f2fs: check zone type before sending async reset zone command
The commit 25f9080576 ("f2fs: add async reset zone command support")
introduced "async reset zone commands" by calling
__submit_zone_reset_cmd() in async discard operations. However,
__submit_zone_reset_cmd() is called regardless of zone type of discard
target zone. When devices have conventional zones, zone reset commands
are sent to the conventional zones and cause I/O errors.

Avoid the I/O errors by checking that the discard target zone type is
sequential write required. If not, handle the discard operation in same
manner as non-zoned, regular block devices. For that purpose, add a new
helper function f2fs_bdev_index() which gets index of the zone reset
target device.

Fixes: 25f9080576 ("f2fs: add async reset zone command support")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:41:09 -07:00
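
The decision described in the commit above can be sketched as a small model. This is a hedged Python illustration, not the kernel code: `ZoneType` and `discard_action` are hypothetical names, and the returned strings merely label which path would be taken.

```python
from enum import Enum, auto
from typing import Optional


class ZoneType(Enum):
    """Hypothetical zone types for the sketch."""
    CONVENTIONAL = auto()
    SEQ_WRITE_REQUIRED = auto()


def discard_action(is_zoned_device: bool, zone_type: Optional[ZoneType] = None) -> str:
    """Pick the operation for a discard target.

    A zone reset is only chosen when the target zone is sequential write
    required; everything else is handled like a regular block device discard.
    """
    if is_zoned_device and zone_type is ZoneType.SEQ_WRITE_REQUIRED:
        return "zone_reset"
    return "discard"
```

The point of the check is that conventional zones on a zoned device fall through to the regular discard path instead of receiving a reset command.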
Chao Yu
025b3602b5 f2fs: compress: don't {,de}compress non-full cluster
f2fs won't compress a non-full cluster at the tail of a file, so skip
dirtying and rewriting such a cluster during f2fs_ioc_{,de}compress_file.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:41:08 -07:00
Chao Yu
3a2c0e55f9 f2fs: allow f2fs_ioc_{,de}compress_file to be interrupted
This patch allows f2fs_ioc_{,de}compress_file() to be interrupted, so that,
userspace won't be blocked when manual {,de}compression on large file is
interrupted by signal.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:41:08 -07:00
Christoph Hellwig
51bf8d3c81 f2fs: don't reopen the main block device in f2fs_scan_devices
f2fs_scan_devices reopens the main device since the very beginning, which
has always been useless, and also means that we don't pass the right
holder for the reopen, which now leads to a warning as the core super.c
holder ops aren't passed in for the reopen.

Fixes: 3c62be17d4 ("f2fs: support multiple devices")
Fixes: 0718afd47f ("block: introduce holder ops")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:41:08 -07:00
Chao Yu
b5ab3276eb f2fs: fix to avoid mmap vs set_compress_option case
Compression options in an inode should not be changed after they have
been used; however, this may happen in the race case below:

Thread A				Thread B
- f2fs_ioc_set_compress_option
 - check f2fs_is_mmap_file()
 - check get_dirty_pages()
 - check F2FS_HAS_BLOCKS()
					- f2fs_file_mmap
					 - set_inode_flag(FI_MMAP_FILE)
					- fault
					 - do_page_mkwrite
					  - f2fs_vm_page_mkwrite
					  - f2fs_get_block_locked
					 - fault_dirty_shared_page
					  - set_page_dirty
 - update i_compress_algorithm
 - update i_log_cluster_size
 - update i_cluster_size

Avoid such race condition by covering f2fs_file_mmap() w/ i_sem lock,
meanwhile add mmap file check condition in f2fs_may_compress() as well.

Fixes: e1e8debec6 ("f2fs: add F2FS_IOC_SET_COMPRESS_OPTION ioctl")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:41:07 -07:00
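
The locking idea in the commit above can be illustrated with a toy model. This is a minimal Python sketch under stated assumptions, not the f2fs code: the `Inode` class and both functions are hypothetical, `threading.Lock` stands in for `i_sem`, and the sketch assumes that the option update performs its mmap check and its update under that same lock.

```python
import threading


class Inode:
    """Hypothetical inode model; i_sem is represented by a plain lock."""

    def __init__(self) -> None:
        self.i_sem = threading.Lock()
        self.mmap_file = False            # models the FI_MMAP_FILE flag
        self.compress_algorithm = "lz4"


def file_mmap(inode: Inode) -> None:
    """Mark the inode as mmapped while holding i_sem."""
    with inode.i_sem:
        inode.mmap_file = True


def set_compress_option(inode: Inode, algorithm: str) -> bool:
    """Check and update under the same lock, so mmap cannot slip in between.

    Returns False (modeling a busy error) if the file is already mmapped.
    """
    with inode.i_sem:
        if inode.mmap_file:
            return False
        inode.compress_algorithm = algorithm
        return True
```

Because both paths serialize on the same lock, the flag is either set before the check (and the update is refused) or after the update has completed.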
Jaegeuk Kim
d2d9bb3b6d f2fs: get out of a repeat loop when getting a locked data page
https://bugzilla.kernel.org/show_bug.cgi?id=216050

Somehow we're getting a page which has a different mapping.
Let's avoid the infinite loop.

Cc: <stable@vger.kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:41:07 -07:00
Jaegeuk Kim
a3ab557466 f2fs: flush inode if atomic file is aborted
Let's flush the inode whose atomic operation is being aborted, to avoid a
stale dirty inode during eviction in this call stack:

  f2fs_mark_inode_dirty_sync+0x22/0x40 [f2fs]
  f2fs_abort_atomic_write+0xc4/0xf0 [f2fs]
  f2fs_evict_inode+0x3f/0x690 [f2fs]
  ? sugov_start+0x140/0x140
  evict+0xc3/0x1c0
  evict_inodes+0x17b/0x210
  generic_shutdown_super+0x32/0x120
  kill_block_super+0x21/0x50
  deactivate_locked_super+0x31/0x90
  cleanup_mnt+0x100/0x160
  task_work_run+0x59/0x90
  do_exit+0x33b/0xa50
  do_group_exit+0x2d/0x80
  __x64_sys_exit_group+0x14/0x20
  do_syscall_64+0x3b/0x90
  entry_SYSCALL_64_after_hwframe+0x63/0xcd

This triggers f2fs_bug_on() in f2fs_evict_inode:
 f2fs_bug_on(sbi, is_inode_flag_set(inode, FI_DIRTY_INODE));

This fixes the syzbot report:

loop0: detected capacity change from 0 to 131072
F2FS-fs (loop0): invalid crc value
F2FS-fs (loop0): Found nat_bits in checkpoint
F2FS-fs (loop0): Mounted with checkpoint version = 48b305e4
------------[ cut here ]------------
kernel BUG at fs/f2fs/inode.c:869!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 5014 Comm: syz-executor220 Not tainted 6.4.0-syzkaller-11479-g6cd06ab12d1a #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023
RIP: 0010:f2fs_evict_inode+0x172d/0x1e00 fs/f2fs/inode.c:869
Code: ff df 48 c1 ea 03 80 3c 02 00 0f 85 6a 06 00 00 8b 75 40 ba 01 00 00 00 4c 89 e7 e8 6d ce 06 00 e9 aa fc ff ff e8 63 22 e2 fd <0f> 0b e8 5c 22 e2 fd 48 c7 c0 a8 3a 18 8d 48 ba 00 00 00 00 00 fc
RSP: 0018:ffffc90003a6fa00 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
RDX: ffff8880273b8000 RSI: ffffffff83a2bd0d RDI: 0000000000000007
RBP: ffff888077db91b0 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff888029a3c000
R13: ffff888077db9660 R14: ffff888029a3c0b8 R15: ffff888077db9c50
FS:  0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1909bb9000 CR3: 00000000276a9000 CR4: 0000000000350ef0
Call Trace:
 <TASK>
 evict+0x2ed/0x6b0 fs/inode.c:665
 dispose_list+0x117/0x1e0 fs/inode.c:698
 evict_inodes+0x345/0x440 fs/inode.c:748
 generic_shutdown_super+0xaf/0x480 fs/super.c:478
 kill_block_super+0x64/0xb0 fs/super.c:1417
 kill_f2fs_super+0x2af/0x3c0 fs/f2fs/super.c:4704
 deactivate_locked_super+0x98/0x160 fs/super.c:330
 deactivate_super+0xb1/0xd0 fs/super.c:361
 cleanup_mnt+0x2ae/0x3d0 fs/namespace.c:1254
 task_work_run+0x16f/0x270 kernel/task_work.c:179
 exit_task_work include/linux/task_work.h:38 [inline]
 do_exit+0xa9a/0x29a0 kernel/exit.c:874
 do_group_exit+0xd4/0x2a0 kernel/exit.c:1024
 __do_sys_exit_group kernel/exit.c:1035 [inline]
 __se_sys_exit_group kernel/exit.c:1033 [inline]
 __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1033
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f309be71a09
Code: Unable to access opcode bytes at 0x7f309be719df.
RSP: 002b:00007fff171df518 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007f309bef7330 RCX: 00007f309be71a09
RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000001
RBP: 0000000000000001 R08: ffffffffffffffc0 R09: 00007f309bef1e40
R10: 0000000000010600 R11: 0000000000000246 R12: 00007f309bef7330
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:f2fs_evict_inode+0x172d/0x1e00 fs/f2fs/inode.c:869
Code: ff df 48 c1 ea 03 80 3c 02 00 0f 85 6a 06 00 00 8b 75 40 ba 01 00 00 00 4c 89 e7 e8 6d ce 06 00 e9 aa fc ff ff e8 63 22 e2 fd <0f> 0b e8 5c 22 e2 fd 48 c7 c0 a8 3a 18 8d 48 ba 00 00 00 00 00 fc
RSP: 0018:ffffc90003a6fa00 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
RDX: ffff8880273b8000 RSI: ffffffff83a2bd0d RDI: 0000000000000007
RBP: ffff888077db91b0 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff888029a3c000
R13: ffff888077db9660 R14: ffff888029a3c0b8 R15: ffff888077db9c50
FS:  0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1909bb9000 CR3: 00000000276a9000 CR4: 0000000000350ef0

Cc: <stable@vger.kernel.org>
Reported-and-tested-by: syzbot+e1246909d526a9d470fa@syzkaller.appspotmail.com
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:41:07 -07:00
Chao Yu
863907a4f5 f2fs: don't handle error case of f2fs_compress_alloc_page()
f2fs_compress_alloc_page() uses a mempool to allocate memory, so it never
fails; don't handle the error case in its callers.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:41:06 -07:00
Jaegeuk Kim
579c7e4150 Revert "f2fs: clean up w/ sbi->log_sectors_per_block"
This reverts commit bfd4766239.

Shinichiro Kawasaki reported:

When I ran workloads on f2fs using v6.5-rcX with fixes [1][2] and a zoned block
device with 4kb logical block size, I observed mount failure as follows. When
I revert this commit, the failure goes away.

[  167.781975][ T1555] F2FS-fs (dm-0): IO Block Size:        4 KB
[  167.890728][ T1555] F2FS-fs (dm-0): Found nat_bits in checkpoint
[  171.482588][ T1555] F2FS-fs (dm-0): Zone without valid block has non-zero write pointer. Reset the write pointer: wp[0x1300,0x8]
[  171.496000][ T1555] F2FS-fs (dm-0): (0) : Unaligned zone reset attempted (block 280000 + 80000)
[  171.505037][ T1555] F2FS-fs (dm-0): Discard zone failed:  (errno=-5)

The patch replaced "sbi->log_blocksize - SECTOR_SHIFT" with
"sbi->log_sectors_per_block". However, I think these two are not equal when the
device has 4k logical block size. The former uses the Linux kernel sector size
of 512 bytes. The latter uses a 512b or 4kb sector size depending on the
device. mkfs.f2fs obtains the logical block size via BLKSSZGET ioctl from the
device and reflects it in the value sbi->log_sectors_per_block. This causes
unexpected write pointer calculations in check_zone_write_pointer(). This
resulted in unexpected zone reset and the mount failure.

[1] https://lkml.kernel.org/linux-f2fs-devel/20230711050101.GA19128@lst.de/
[2] https://lore.kernel.org/linux-f2fs-devel/20230804091556.2372567-1-shinichiro.kawasaki@wdc.com/

Cc: stable@vger.kernel.org
Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Fixes: bfd4766239 ("f2fs: clean up w/ sbi->log_sectors_per_block")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:40:27 -07:00
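
The mismatch described in the report above can be checked with a short calculation. This is a minimal Python sketch, not kernel code: the function names are hypothetical, and the second function only models the report's description that the on-disk value depends on the device's logical sector size.

```python
SECTOR_SHIFT = 9  # the Linux kernel sector size is 512 bytes (1 << 9)


def log2_exact(value: int) -> int:
    """Return log2 of a power of two, rejecting anything else."""
    if value <= 0 or value & (value - 1):
        raise ValueError(f"{value} is not a power of two")
    return value.bit_length() - 1


def shift_from_kernel_sectors(block_size: int) -> int:
    """Models 'log_blocksize - SECTOR_SHIFT': always relative to 512-byte sectors."""
    return log2_exact(block_size) - SECTOR_SHIFT


def shift_from_device_sectors(block_size: int, logical_sector_size: int) -> int:
    """Models a value derived from the device's logical sector size instead."""
    return log2_exact(block_size) - log2_exact(logical_sector_size)


# 4 KB filesystem block on a device with 512-byte logical sectors: both agree.
same = (shift_from_kernel_sectors(4096), shift_from_device_sectors(4096, 512))

# 4 KB filesystem block on a device with 4 KB logical sectors: they differ.
differ = (shift_from_kernel_sectors(4096), shift_from_device_sectors(4096, 4096))
```

With a 4 KB logical sector size the two shifts are 3 and 0, so converting a block address to 512-byte sectors with the second value is off by a factor of 8.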
Linus Torvalds
76487845fd Minor cleanups for 6.5:
* Fix an uninitialized variable warning.
 
 Signed-off-by: Darrick J. Wong <djwong@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQ2qTKExjcn+O1o2YRKO3ySh0YRpgUCZKjUjwAKCRBKO3ySh0YR
 pn92AQC4gY9GOyKcc/aiAd/t1u8gGxnFtcN06xh4TdVArMM4/AD/UtEKx9LYuaSF
 pyhw5SfzxI555HfXkA8ci/D+BxguVQs=
 =/vX1
 -----END PGP SIGNATURE-----

Merge tag 'xfs-6.5-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fix from Darrick Wong:
 "Nothing exciting here, just getting rid of a gcc warning that I got
  tired of seeing when I turn on gcov"

* tag 'xfs-6.5-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix uninit warning in xfs_growfs_data
2023-07-09 09:50:42 -07:00
Linus Torvalds
4770353b66 3 smb3 client fixes
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmSqNkIACgkQiiy9cAdy
 T1GXsAwAhYUyjlXZLDsmO+9PjKhM9WRM1IO5myy3P396R0Tzq741f8LM7Lx08qc+
 D1701gsnhIrvprem1HjtW6DZzCVnLdpBIYUEnwUr8eDqMpk1VFKug3xSVhIRMih3
 Y30dHTgQ0aCLrrh5XHOWhBHJbpq7Wdlh3q0oi8I36Of8e6tGFNo2wI4ud7no4aIj
 N222dWOs56FXtVAmgEAuc7U2A40ztMOp7FXrbzhK4FwD5kO+pFkqJcLjG6Bk10ph
 Tyg3Wh2TnX+MviOY0xUaN0X50dSoSJPkSUYGkccrIcfVPwEoH7l6j0LNgAVyhG7K
 f5EUbM7Td51a1Znj9wX6U9N0UfO/IOZRDFZ7ACckLBBBEzfKYCgYY5dWJ6aVxZHb
 bB336f1ObvDiocEabS1SMa//sXUjpOy3Tg8etLCYJpqjWYE8nO7lERoBWGWXkUqy
 xO86pGQjYLzkw16R11tzbplv+1HxoGwIuQnOubivv2prn++NZ4Zr2ohBeDlyJc1/
 WwF42UfM
 =F8D0
 -----END PGP SIGNATURE-----

Merge tag '6.5-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6

Pull more smb client updates from Steve French:

 - fix potential use after free in unmount

 - minor cleanup

 - add worker to cleanup stale directory leases

* tag '6.5-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Add a laundromat thread for cached directories
  smb: client: remove redundant pointer 'server'
  cifs: fix session state transition to avoid use-after-free issue
2023-07-09 09:45:32 -07:00
Linus Torvalds
946c6b59c5 16 hotfixes. Six are cc:stable and the remainder address post-6.4 issues.
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZKmgXAAKCRDdBJ7gKXxA
 joqDAP0V520Jy0cyJrRMvaQRFMqtVeDOdTpAue7ZOQHSi/LZnAD9EEAxDpYF/V4x
 PO27ixXQ4Glm2iYgH7bDX7J73WiA3wg=
 =JsYW
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull hotfixes from Andrew Morton:
 "16 hotfixes. Six are cc:stable and the remainder address post-6.4
  issues"

The merge undoes the disabling of the CONFIG_PER_VMA_LOCK feature, since
it was all hopefully fixed in mainline.

* tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  lib: dhry: fix sleeping allocations inside non-preemptable section
  kasan, slub: fix HW_TAGS zeroing with slub_debug
  kasan: fix type cast in memory_is_poisoned_n
  mailmap: add entries for Heiko Stuebner
  mailmap: update manpage link
  bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page
  MAINTAINERS: add linux-next info
  mailmap: add Markus Schneider-Pargmann
  writeback: account the number of pages written back
  mm: call arch_swap_restore() from do_swap_page()
  squashfs: fix cache race with migration
  mm/hugetlb.c: fix a bug within a BUG(): inconsistent pte comparison
  docs: update ocfs2-devel mailing list address
  MAINTAINERS: update ocfs2-devel mailing list address
  mm: disable CONFIG_PER_VMA_LOCK until its fixed
  fork: lock VMAs of the parent process when forking
2023-07-08 14:30:25 -07:00
Vincent Whitchurch
08bab74ae6 squashfs: fix cache race with migration
Migration replaces the page in the mapping before copying the contents and
the flags over from the old page, so check that the page in the page cache
is really up to date before using it.  Without this, stressing squashfs
reads with parallel compaction sometimes results in squashfs reporting
data corruption.

Link: https://lkml.kernel.org/r/20230629-squashfs-cache-migration-v1-1-d50ebe55099d@axis.com
Fixes: e994f5b677 ("squashfs: cache partial compressed blocks")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08 09:29:30 -07:00
Anthony Iliopoulos
5a569db68c docs: update ocfs2-devel mailing list address
The ocfs2-devel mailing list has been migrated to the kernel.org
infrastructure, update all related documentation pointers to reflect the
change.

Link: https://lkml.kernel.org/r/20230628013437.47030-3-ailiop@suse.com
Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
Acked-by: Joseph Qi <jiangqi903@gmail.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-07-08 09:29:29 -07:00
Darrick J. Wong
ed04a91f71 xfs: fix uninit warning in xfs_growfs_data
Quiet down this gcc warning:

fs/xfs/xfs_fsops.c: In function ‘xfs_growfs_data’:
fs/xfs/xfs_fsops.c:219:21: error: ‘lastag_extended’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  219 |                 if (lastag_extended) {
      |                     ^~~~~~~~~~~~~~~
fs/xfs/xfs_fsops.c:100:33: note: ‘lastag_extended’ was declared here
  100 |         bool                    lastag_extended;
      |                                 ^~~~~~~~~~~~~~~

By setting its value explicitly.  From code analysis I don't think this
is a real problem, but I have better things to do than analyse this
closely.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2023-07-07 20:13:41 -07:00
Linus Torvalds
3290badd1b A bunch of CephFS fixups from Xiubo, mostly around dropping caps, along
with a fix for a regression in the readahead handling code which sneaked
 in with the switch to netfs helpers.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAmSoL3ITHGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzi5GdCACVzRsWU75gmO74yrOKOy2BR70Kgz2q
 +uTAeXLYL57Q5Z2kREiLQQQsBhqkvkUcsE2kPZC40DIVP2554A8nBTnWLcdg//PM
 6e94UVYMW66GqDeTYvCA2gD0V+uPnnDc5frrcxsNb2F1hxGFuO+tYMYDASJgmuuV
 0gUKSqM5HbvFi30nM+RrNzOLPxr+/gMHahAVoM8uwuWN2LBFANADDY/7ya7JA4ZP
 61BVF7jEDpb2btNUH1z4RfFVIIJE0IpJRH+bSb5d7CsrbrrkZhAh90QZaAGtIo7C
 NhoZlT5fyQ57u4g4PM2UvoFJHeaxNMRb1JR73sN0FT8ngvw5Wb2HzaQb
 =rFqz
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-6.5-rc1' of https://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
 "A bunch of CephFS fixups from Xiubo, mostly around dropping caps,
  along with a fix for a regression in the readahead handling code which
  sneaked in with the switch to netfs helpers"

* tag 'ceph-for-6.5-rc1' of https://github.com/ceph/ceph-client:
  ceph: don't let check_caps skip sending responses for revoke msgs
  ceph: issue a cap release immediately if no cap exists
  ceph: trigger to flush the buffer when making snapshot
  ceph: fix blindly expanding the readahead windows
  ceph: add a dedicated private data for netfs rreq
  ceph: voluntarily drop Xx caps for requests those touch parent mtime
  ceph: try to dump the msgs when decoding fails
  ceph: only send metrics when the MDS rank is ready
2023-07-07 15:07:20 -07:00
Linus Torvalds
36b93aed9e driver ntfs3 for linux 6.5

Merge tag 'ntfs3_for_6.5' of https://github.com/Paragon-Software-Group/linux-ntfs3

Pull ntfs3 updates from Konstantin Komarov:
 "Updates:
   - support /proc/fs/ntfs3/<dev>/volinfo and label
   - alternative boot if primary boot is corrupted
   - small optimizations

  Fixes:
   - fix endian problems
   - fix logic errors
   - code refactoring and reformatting"

* tag 'ntfs3_for_6.5' of https://github.com/Paragon-Software-Group/linux-ntfs3:
  fs/ntfs3: Correct mode for label entry inside /proc/fs/ntfs3/
  fs/ntfs3: Add support /proc/fs/ntfs3/<dev>/volinfo and /proc/fs/ntfs3/<dev>/label
  fs/ntfs3: Fix endian problem
  fs/ntfs3: Add ability to format new mft records with bigger/smaller header
  fs/ntfs3: Code refactoring
  fs/ntfs3: Code formatting
  fs/ntfs3: Do not update primary boot in ntfs_init_from_boot()
  fs/ntfs3: Alternative boot if primary boot is corrupted
  fs/ntfs3: Mark ntfs dirty when on-disk struct is corrupted
  fs/ntfs3: Fix ntfs_atomic_open
  fs/ntfs3: Correct checking while generating attr_list
  fs/ntfs3: Use __GFP_NOWARN allocation at ntfs_load_attr_list()
  fs: ntfs3: Fix possible null-pointer dereferences in mi_read()
  fs/ntfs3: Return error for inconsistent extended attributes
  fs/ntfs3: Enhance sanity check while generating attr_list
  fs/ntfs3: Use wrapper i_blocksize() in ntfs_zero_range()
  ntfs: Fix panic about slab-out-of-bounds caused by ntfs_listxattr()
2023-07-07 14:59:38 -07:00
Linus Torvalds
986ffe6070 \n

Merge tag 'fsnotify_for_v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fsnotify fix from Jan Kara:
 "A fix for fanotify to disallow creating of mount or superblock marks
  for kernel internal pseudo filesystems"

* tag 'fsnotify_for_v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fanotify: disallow mount/sb marks on kernel internal pseudo fs
2023-07-07 14:51:37 -07:00
Linus Torvalds
7fdeb23f32 v6.5/vfs.fixes.2

Merge tag 'v6.5/vfs.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:
 "This contains two minor fixes for Jan's rename locking work:

   - Unlocking the source inode was guarded by a check whether source
     was non-NULL. This doesn't make sense because source must be
     non-NULL and the commit message explains in detail why

   - The lock_two_nondirectories() helper called WARN_ON_ONCE() and
     dereferenced the inodes unconditionally but the underlying
     lock_two_inodes() helper and the kernel documentation for that
     function are clear that it is valid to pass NULL arguments, so a
     non-NULL check is needed. No callers currently pass NULL arguments
     but let's not knowingly leave landmines around"

* tag 'v6.5/vfs.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: don't assume arguments are non-NULL
  fs: no need to check source
2023-07-06 19:01:38 -07:00
Linus Torvalds
7b82e90411 asm-generic updates for 6.5
These are cleanups for architecture specific header files:
 
  - the comments in include/linux/syscalls.h have gone out of sync
    and are really pointless, so these get removed
 
  - The asm/bitsperlong.h header no longer needs to be architecture
    specific on modern compilers, so use a generic version for newer
    architectures that use new enough userspace compilers
 
  - A cleanup for virt_to_pfn/virt_to_bus to have proper type
    checking, forcing the use of pointers

Merge tag 'asm-generic-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "These are cleanups for architecture specific header files:

   - the comments in include/linux/syscalls.h have gone out of sync and
     are really pointless, so these get removed

   - The asm/bitsperlong.h header no longer needs to be architecture
     specific on modern compilers, so use a generic version for newer
     architectures that use new enough userspace compilers

   - A cleanup for virt_to_pfn/virt_to_bus to have proper type checking,
     forcing the use of pointers"

* tag 'asm-generic-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  syscalls: Remove file path comments from headers
  tools arch: Remove uapi bitsperlong.h of hexagon and microblaze
  asm-generic: Unify uapi bitsperlong.h for arm64, riscv and loongarch
  m68k/mm: Make pfn accessors static inlines
  arm64: memory: Make virt_to_pfn() a static inline
  ARM: mm: Make virt_to_pfn() a static inline
  asm-generic/page.h: Make pfn accessors static inlines
  xen/netback: Pass (void *) to virt_to_page()
  netfs: Pass a pointer to virt_to_page()
  cifs: Pass a pointer to virt_to_page() in cifsglob
  cifs: Pass a pointer to virt_to_page()
  riscv: mm: init: Pass a pointer to virt_to_page()
  ARC: init: Pass a pointer to virt_to_pfn() in init
  m68k: Pass a pointer to virt_to_pfn() virt_to_page()
  fs/proc/kcore.c: Pass a pointer to virt_addr_valid()
2023-07-06 10:06:04 -07:00
Ronnie Sahlberg
d14de8067e cifs: Add a laundromat thread for cached directories
and drop cached directories after 30 seconds

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-07-05 22:36:07 -05:00
Linus Torvalds
73a3fcdaa7 f2fs update for 6.5-rc1
In this cycle, we've mainly investigated the zoned block device support along
 with patches such as correcting write pointers between f2fs and storage, adding
 asynchronous zone reset flow, and managing the number of open zones. Beyond
 that, f2fs adds another mount option, "errors=x", to specify how to handle
 unexpected behavior detected at runtime.
 
 Enhancement:
  - support errors=remount-ro|continue|panic mount option
  - enforce some inode flag policies
  - allow .tmp compression given extensions
  - add some ioctls to manage the f2fs compression
  - improve looped node chain flow
  - avoid issuing small-sized discard commands during checkpoint
  - implement an asynchronous zone reset
 
 Bug fix:
  - fix deadlock in xattr and inode page lock
  - fix and add sanity check in some error paths
  - fix to avoid NULL pointer dereference in f2fs_write_end_io() along with put_super
  - set proper flags to quota files
  - fix potential deadlock due to unpaired node_write lock use
  - fix over-estimating free section during FG GC
  - fix the wrong condition to determine atomic context
 
 As usual, there are also a number of patches with code refactoring and minor
 clean-ups.

Merge tag 'f2fs-for-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this cycle, we've mainly investigated the zoned block device
  support along with patches such as correcting write pointers between
  f2fs and storage, adding asynchronous zone reset flow, and managing
  the number of open zones.

  Beyond that, f2fs adds another mount option, "errors=x", to specify
  how to handle unexpected behavior detected at runtime.

  Enhancements:
   - support 'errors=remount-ro|continue|panic' mount option
   - enforce some inode flag policies
   - allow .tmp compression given extensions
   - add some ioctls to manage the f2fs compression
   - improve looped node chain flow
   - avoid issuing small-sized discard commands during checkpoint
   - implement an asynchronous zone reset

  Bug fixes:
   - fix deadlock in xattr and inode page lock
   - fix and add sanity check in some error paths
   - fix to avoid NULL pointer dereference in f2fs_write_end_io() along
     with put_super
   - set proper flags to quota files
   - fix potential deadlock due to unpaired node_write lock use
   - fix over-estimating free section during FG GC
   - fix the wrong condition to determine atomic context

  As usual, there are also a number of patches with code refactoring and
  minor clean-ups"

* tag 'f2fs-for-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (46 commits)
  f2fs: fix to do sanity check on direct node in truncate_dnode()
  f2fs: only set release for file that has compressed data
  f2fs: fix compile warning in f2fs_destroy_node_manager()
  f2fs: fix error path handling in truncate_dnode()
  f2fs: fix deadlock in i_xattr_sem and inode page lock
  f2fs: remove unneeded page uptodate check/set
  f2fs: update mtime and ctime in move file range method
  f2fs: compress tmp files given extension
  f2fs: refactor struct f2fs_attr macro
  f2fs: convert to use sbi directly
  f2fs: remove redundant assignment to variable err
  f2fs: do not issue small discard commands during checkpoint
  f2fs: check zone write pointer points to the end of zone
  f2fs: add f2fs_ioc_get_compress_blocks
  f2fs: cleanup MIN_INLINE_XATTR_SIZE
  f2fs: add helper to check compression level
  f2fs: set FMODE_CAN_ODIRECT instead of a dummy direct_IO method
  f2fs: do more sanity check on inode
  f2fs: compress: fix to check validity of i_compress_flag field
  f2fs: add sanity compress level check for compressed file
  ...
2023-07-05 14:14:37 -07:00
Linus Torvalds
bb8e7e9f0b More new code for 6.5:
* Fix some ordering problems with log items during log recovery.
  * Don't deadlock the system by trying to flush busy freed extents while
    holding on to busy freed extents.
  * Improve validation of log geometry parameters when reading the
    primary superblock.
  * Validate the length field in the AGF header.
  * Fix recordset filtering bugs when re-calling GETFSMAP to return more
    results when the resultset didn't previously fit in the caller's buffer.
  * Fix integer overflows in GETFSMAP when working with rt volumes larger
    than 2^32 fsblocks.
  * Fix GETFSMAP reporting the undefined space beyond the last rtextent.
  * Fix filtering bugs in GETFSMAP's log device backend if the log ever
    becomes longer than 2^32 fsblocks.
  * Improve validation of file offsets in the GETFSMAP range parameters.
  * Fix an off by one bug in the pmem media failure notification
    computation.
  * Validate the length field in the AGI header too.
 
 Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Merge tag 'xfs-6.5-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull more xfs updates from Darrick Wong:

 - Fix some ordering problems with log items during log recovery

 - Don't deadlock the system by trying to flush busy freed extents while
   holding on to busy freed extents

 - Improve validation of log geometry parameters when reading the
   primary superblock

 - Validate the length field in the AGF header

 - Fix recordset filtering bugs when re-calling GETFSMAP to return more
   results when the resultset didn't previously fit in the caller's
   buffer

 - Fix integer overflows in GETFSMAP when working with rt volumes larger
   than 2^32 fsblocks

 - Fix GETFSMAP reporting the undefined space beyond the last rtextent

 - Fix filtering bugs in GETFSMAP's log device backend if the log ever
   becomes longer than 2^32 fsblocks

 - Improve validation of file offsets in the GETFSMAP range parameters

 - Fix an off by one bug in the pmem media failure notification
   computation

 - Validate the length field in the AGI header too
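
The overflow fixes above concern ranges beyond 2^32 fsblocks. The following
is a generic C sketch of that class of bug, not the XFS code: the helpers
end_block_narrow(), end_block_wide() and demo_overflow() are hypothetical
names, used only to show how a block number that passes through a 32-bit
variable wraps while 64-bit arithmetic preserves it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the end of a range squeezed into a 32-bit variable. */
static uint64_t end_block_narrow(uint64_t start, uint64_t len)
{
	uint32_t end = (uint32_t)(start + len);	/* wraps modulo 2^32 */

	return end;
}

/* Hypothetical helper: the same computation kept in 64 bits throughout. */
static uint64_t end_block_wide(uint64_t start, uint64_t len)
{
	return start + len;
}

/* Self-check: a range that crosses the 2^32 boundary. */
static void demo_overflow(void)
{
	uint64_t start = (UINT64_C(1) << 32) - 10;
	uint64_t len = 20;

	assert(end_block_wide(start, len) == (UINT64_C(1) << 32) + 10);
	assert(end_block_narrow(start, len) == 10);	/* silently truncated */
}
```

Calling demo_overflow() from a test harness exercises both variants; the
narrow one reports block 10 for a range that really ends past 2^32.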

* tag 'xfs-6.5-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: Remove unneeded semicolon
  xfs: AGI length should be bounds checked
  xfs: fix the calculation for "end" and "length"
  xfs: fix xfs_btree_query_range callers to initialize btree rec fully
  xfs: validate fsmap offsets specified in the query keys
  xfs: fix logdev fsmap query result filtering
  xfs: clean up the rtbitmap fsmap backend
  xfs: fix getfsmap reporting past the last rt extent
  xfs: fix integer overflows in the fsmap rtbitmap and logdev backends
  xfs: fix interval filtering in multi-step fsmap queries
  xfs: fix bounds check in xfs_defer_agfl_block()
  xfs: AGF length has never been bounds checked
  xfs: journal geometry is not properly bounds checked
  xfs: don't block in busy flushing when freeing extents
  xfs: allow extent free intents to be retried
  xfs: pass alloc flags through to xfs_extent_busy_flush()
  xfs: use deferred frees for btree block freeing
  xfs: don't reverse order of items in bulk AIL insertion
  xfs: remove redundant initializations of pointers drop_leaf and save_leaf
2023-07-05 14:08:03 -07:00
David Howells
03275585ca afs: Fix accidental truncation when storing data
When an AFS FS.StoreData RPC call is made, it is given, amongst other things,
the size the file should end up with.  On the server, this is processed
by truncating the file to the new size and then writing the data.

Now, kafs has a lock (vnode->io_lock) that serves to serialise
operations against a specific vnode (i.e. inode), but the parameters for
the op are set before the lock is taken.  This allows two writebacks
(say sync and kswapd) to race - and if writes are ongoing the writeback
for a later write could occur before the writeback for an earlier one if
the latter gets interrupted.

Note that afs_writepages() cannot take i_mutex and only takes a shared
lock on vnode->validate_lock.

Also note that the server does the truncation and the write inside a
lock, so there's no problem at that end.

Fix this by moving the calculation for the proposed new i_size inside
the vnode->io_lock.  Also reset the iterator (which we might have read
from) and update the mtime setting there.
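
The ordering problem can be replayed in a small userspace model. This is a
sketch under stated assumptions, not the kafs implementation: struct
vnode_model, io_lock()/io_unlock(), store_presampled() and
store_under_lock() are hypothetical stand-ins, and the two racing
writebacks are replayed sequentially rather than on real threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a vnode; not the kafs structure. */
struct vnode_model {
	bool io_locked;		/* models vnode->io_lock being held */
	uint64_t i_size;	/* local idea of the file size */
	uint64_t server_size;	/* size the modelled server truncated to */
};

static void io_lock(struct vnode_model *v)
{
	assert(!v->io_locked);
	v->io_locked = true;
}

static void io_unlock(struct vnode_model *v)
{
	assert(v->io_locked);
	v->io_locked = false;
}

/* Racy pattern: the new size was sampled before the lock was taken. */
static void store_presampled(struct vnode_model *v, uint64_t sampled_size)
{
	io_lock(v);
	v->server_size = sampled_size;	/* may be stale by now */
	io_unlock(v);
}

/* Fixed pattern: the new size is computed while the lock is held. */
static void store_under_lock(struct vnode_model *v)
{
	io_lock(v);
	v->server_size = v->i_size;
	io_unlock(v);
}

/* Replays "later writeback first, earlier writeback second". */
static void demo_store(void)
{
	struct vnode_model v = { .io_locked = false, .i_size = 100, .server_size = 0 };
	uint64_t early = v.i_size;	/* earlier writeback samples 100 */

	v.i_size = 200;			/* a later write extends the file */
	store_presampled(&v, v.i_size);	/* later writeback runs first */
	store_presampled(&v, early);	/* earlier one truncates to 100 */
	assert(v.server_size == 100);

	store_under_lock(&v);		/* size read inside the lock */
	assert(v.server_size == 200);
}
```

With the pre-sampled size, whichever writeback runs last wins, even if its
size is stale; reading the size under the lock removes that dependence on
ordering.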

Fixes: bd80d8a80e ("afs: Use ITER_XARRAY for writing")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/3526895.1687960024@warthog.procyon.org.uk/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-07-04 12:24:32 -07:00
Linus Torvalds
538140ca60 overlayfs update for 6.5 - part 2

Merge tag 'ovl-update-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs

Pull more overlayfs updates from Amir Goldstein:
 "This is a small 'move code around' followup by Christian to his work
  on porting overlayfs to the new mount api for 6.5. It makes things a
  bit cleaner and simpler for the next development cycle when I hand
  overlayfs back over to Miklos"

* tag 'ovl-update-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
  ovl: move all parameter handling into params.{c,h}
2023-07-04 11:52:54 -07:00
Linus Torvalds
94c76955e8 gfs2 fixes
- Move the freeze/thaw logic from glock callback context to process /
   worker thread context to prevent deadlocks.
 
 - Fix a quota reference counting bug in do_qc().
 
 - Carry on deallocating inodes even when gfs2_rindex_update() fails.
 
 - Retry filesystem-internal reads when they are interrupted by a signal.
 
 - Eliminate kmap_atomic() in favor of kmap_local_page() /
   memcpy_{from,to}_page().
 
 - Get rid of noop_direct_IO.
 
 - And a few more minor fixes and cleanups.

Merge tag 'gfs2-v6.4-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 updates from Andreas Gruenbacher:

 - Move the freeze/thaw logic from glock callback context to process /
   worker thread context to prevent deadlocks

 - Fix a quota reference counting bug in do_qc()

 - Carry on deallocating inodes even when gfs2_rindex_update() fails

 - Retry filesystem-internal reads when they are interrupted by a signal

 - Eliminate kmap_atomic() in favor of kmap_local_page() /
   memcpy_{from,to}_page()

 - Get rid of noop_direct_IO

 - And a few more minor fixes and cleanups

* tag 'gfs2-v6.4-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (23 commits)
  gfs2: Add quota_change type
  gfs2: Use memcpy_{from,to}_page where appropriate
  gfs2: Convert remaining kmap_atomic calls to kmap_local_page
  gfs2: Replace deprecated kmap_atomic with kmap_local_page
  gfs: Get rid of unnucessary locking in inode_go_dump
  gfs2: gfs2_freeze_lock_shared cleanup
  gfs2: Replace sd_freeze_state with SDF_FROZEN flag
  gfs2: Rework freeze / thaw logic
  gfs2: Rename SDF_{FS_FROZEN => FREEZE_INITIATOR}
  gfs2: Reconfiguring frozen filesystem already rejected
  gfs2: Rename gfs2_freeze_lock{ => _shared }
  gfs2: Rename the {freeze,thaw}_super callbacks
  gfs2: Rename remaining "transaction" glock references
  gfs2: retry interrupted internal reads
  gfs2: Fix possible data races in gfs2_show_options()
  gfs2: Fix duplicate should_fault_in_pages() call
  gfs2: set FMODE_CAN_ODIRECT instead of a dummy direct_IO method
  gfs2: Don't remember delete unless it's successful
  gfs2: Update rl_unlinked before releasing rgrp lock
  gfs2: Fix gfs2_qa_get imbalance in gfs2_quota_hold
  ...
2023-07-04 11:45:16 -07:00
Amir Goldstein
69562eb0bd fanotify: disallow mount/sb marks on kernel internal pseudo fs
Hopefully, nobody is trying to abuse mount/sb marks for watching all
anonymous pipes/inodes.

I cannot think of a good reason to allow this - it looks like an
oversight that dates back to the original fanotify API.

Link: https://lore.kernel.org/linux-fsdevel/20230628101132.kvchg544mczxv2pm@quack3/
Fixes: 0ff21db9fc ("fanotify: hooks the fanotify_mark syscall to the vfsmount code")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Message-Id: <20230629042044.25723-1-amir73il@gmail.com>
2023-07-04 13:29:29 +02:00
Christian Brauner
33ab231f83 fs: don't assume arguments are non-NULL
The helper is explicitly documented as locking zero, one, or two
arguments. While all current callers do pass non-NULL arguments there's
no need or requirement for them to do so according to the code and the
unlock_two_nondirectories() helper is pretty clear about it as well. So
only call WARN_ON_ONCE() if the checked inode is valid.
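
As a rough illustration of the guarded check, here is a userspace model. It
is a sketch, not the VFS code: struct inode_model, warn_count and
lock_two_model() are hypothetical names, and the counter merely stands in
for a warning being raised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct inode; only the fields the model needs. */
struct inode_model {
	bool is_dir;
	bool locked;
};

static int warn_count;	/* models a warning being raised */

/* Locks zero, one, or two inodes; NULL arguments are simply skipped. */
static void lock_two_model(struct inode_model *a, struct inode_model *b)
{
	if (a) {
		if (a->is_dir)
			warn_count++;	/* only checked when a is valid */
		a->locked = true;
	}
	if (b && b != a) {
		if (b->is_dir)
			warn_count++;	/* only checked when b is valid */
		b->locked = true;
	}
}

/* Self-check: NULL arguments must not be dereferenced. */
static void demo_lock_two(void)
{
	struct inode_model file = { .is_dir = false, .locked = false };

	lock_two_model(NULL, NULL);	/* zero inodes: nothing to do */
	lock_two_model(&file, NULL);	/* one inode */
	assert(file.locked);
	assert(warn_count == 0);
}
```

The point of the model is only the ordering: the validity test comes first,
and the directory check runs inside it.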

Fixes: 2454ad83b9 ("fs: Restrict lock_two_nondirectories() to non-directory inodes")
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Jan Kara <jack@suse.cz>
Message-Id: <20230703-vfs-rename-source-v1-2-37eebb29b65b@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-07-04 10:21:11 +02:00
Jan Kara
66d8fc0539 fs: no need to check source
The @source inode must be valid. It is even checked via IS_SWAPFILE()
above making it pretty clear. So no need to check it when we unlock.

What doesn't need to exist is the @target inode. The lock_two_inodes()
helper currently swaps the @inode1 and @inode2 arguments if @inode1 is
NULL to have consistent lock class usage. However, we know that, at least
for vfs_rename(), @inode1 is @source and thus is never NULL as per
above. We also know that @source is a different inode than @target as
that is checked right at the beginning of vfs_rename(). So we know that
@source is valid and locked and that @target is locked. So drop the
check whether @source is non-NULL.

Fixes: 28eceeda13 ("fs: Lock moved directories")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202307030026.9sE2pk2x-lkp@intel.com
Message-Id: <20230703-vfs-rename-source-v1-1-37eebb29b65b@kernel.org>
[brauner: use commit message from patch I sent concurrently]
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-07-04 10:20:29 +02:00
Bob Peterson
432928c937 gfs2: Add quota_change type
Function do_qc has two main uses: (1) to re-sync the local quota changes
(qd) to the master quotas, and (2) normal quota changes. In the case of
normal quota changes, the change can be positive or negative, as the
quota usage goes up and down.

Before this patch, do_qc distinguished between the two by
whether the resulting value was zero: in the case of a re-sync
(called do_sync) the quota value is moved from the temporary value to a
master value, so the amount is added to one and subtracted from the
other. The problem is that since the values can be positive or negative
we can occasionally run into situations where we are not doing a re-sync
but the quota change just happens to cancel out the previous value.

In the case of a re-sync extra references and locks are taken, and so
do_qc needs to release them. In the case of a normal quota change, no
extra references and locks are taken, so it must not try to release
them.

The problem is: if the quota change is not a re-sync but the value just
happens to cancel out the original quota change, the resulting zero
value fools do_qc into thinking this is a re-sync and therefore it must
release the extra references. This results in problems, mainly having to
do with slot reference numbers dropping below zero.

This patch introduces new constants, QC_SYNC and QC_CHANGE so do_qc can
really tell the difference. For QC_SYNC calls it must release the extra
references acquired by gfs2_quota_unlock's call to qd_check_sync. For
QC_CHANGE calls it does not have extra references to put.

Note that this allows quota changes back to a value of zero, and so I
removed an assert warning related to that.
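
The ambiguity can be shown with a small userspace model. This is a sketch
and not the gfs2 code: only the names QC_SYNC and QC_CHANGE come from the
patch description, while their numeric values, struct quota_model,
apply_guess() and apply_typed() are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Constant names from the patch description; the values are assumed. */
enum qc_type { QC_CHANGE = 0, QC_SYNC = 1 };

/* Hypothetical model of one quota entry. */
struct quota_model {
	int64_t change;		/* pending local quota change */
	int extra_refs;		/* extra references taken for a re-sync */
};

/* Old heuristic: a zero result is assumed to mean "this was a re-sync". */
static void apply_guess(struct quota_model *q, int64_t delta)
{
	q->change += delta;
	if (q->change == 0)
		q->extra_refs--;	/* wrong if a normal change cancels out */
}

/* New scheme: the caller states which kind of operation this is. */
static void apply_typed(struct quota_model *q, int64_t delta, enum qc_type type)
{
	assert(type == QC_CHANGE || type == QC_SYNC);
	q->change += delta;
	if (type == QC_SYNC)
		q->extra_refs--;	/* only a re-sync holds extra references */
}

/* Self-check: a normal change of -5 that cancels a pending +5. */
static void demo_quota(void)
{
	struct quota_model a = { .change = 5, .extra_refs = 0 };
	struct quota_model b = { .change = 5, .extra_refs = 0 };

	apply_guess(&a, -5);
	assert(a.extra_refs == -1);	/* reference count went negative */

	apply_typed(&b, -5, QC_CHANGE);
	assert(b.change == 0 && b.extra_refs == 0);
}
```

Passing the type explicitly means a change that lands exactly on zero is no
longer mistaken for a re-sync.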

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-07-03 22:30:48 +02:00
Andreas Gruenbacher
d68d0c6c3f gfs2: Use memcpy_{from,to}_page where appropriate
Replace kmap_local_page() + memcpy() + kunmap_local() sequences with
memcpy_{from,to}_page() where we are not doing anything else with the
mapped page.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-07-03 22:30:48 +02:00
Andreas Gruenbacher
b0c21c6d52 gfs2: Convert remaining kmap_atomic calls to kmap_local_page
Replace the remaining instances of kmap_atomic() ... kunmap_atomic()
with kmap_local_page() ... kunmap_local().

In gfs2_write_buf_to_page(), we can call flush_dcache_page() after
unmapping the page.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-07-03 22:30:47 +02:00
Deepak R Varma
58721bd46c gfs2: Replace deprecated kmap_atomic with kmap_local_page
kmap_atomic() is deprecated in favor of kmap_local_{folio,page}().

Therefore, replace kmap_atomic() with kmap_local_page() in
gfs2_internal_read() and stuffed_readpage().

kmap_atomic() disables page-faults and preemption (the latter only for
!PREEMPT_RT kernels). However, the code within the mapping/un-mapping in
gfs2_internal_read() and stuffed_readpage() does not depend on the
above-mentioned side effects.

Therefore, a mere replacement of the old API with the new one is all that
is required (i.e., there is no need to explicitly add any calls to
pagefault_disable() and/or preempt_disable()).

Signed-off-by: Deepak R Varma <drv@mailo.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-07-03 22:30:47 +02:00
Andreas Gruenbacher
f246dd4b78 gfs: Get rid of unnucessary locking in inode_go_dump
Commit 27a2660f1e ("gfs2: Dump nrpages for inodes and their glocks")
added some locking around reading inode->i_data.nrpages.  That locking
doesn't do anything really, so get rid of it.

With that, the glock argument to ->go_dump() can be made const again as
well.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-07-03 22:30:47 +02:00
Andreas Gruenbacher
6c7410f449 gfs2: gfs2_freeze_lock_shared cleanup
All the remaining users of gfs2_freeze_lock_shared() set freeze_gh to
&sdp->sd_freeze_gh and flags to 0, so remove those two parameters.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-07-03 22:30:26 +02:00
Andreas Gruenbacher
5432af15f8 gfs2: Replace sd_freeze_state with SDF_FROZEN flag
Replace sd_freeze_state with a new SDF_FROZEN flag.

There no longer is a need for indicating that a freeze is in progress
(SDF_STARTING_FREEZE); we are now protecting the critical sections with
the sd_freeze_mutex.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-07-03 22:30:23 +02:00
Andreas Gruenbacher
b77b4a4815 gfs2: Rework freeze / thaw logic
So far, at mount time, gfs2 would take the freeze glock in shared mode
and then immediately drop it again, turning it into a cached glock that
can be reclaimed at any time.  To freeze the filesystem cluster-wide,
the node initiating the freeze would take the freeze glock in exclusive
mode, which would cause the freeze glock's freeze_go_sync() callback to
run on each node.  There, gfs2 would freeze the filesystem and schedule
gfs2_freeze_func() to run.  gfs2_freeze_func() would re-acquire the
freeze glock in shared mode, thaw the filesystem, and drop the freeze
glock again.  The initiating node would keep the freeze glock held in
exclusive mode.  To thaw the filesystem, the initiating node would drop
the freeze glock again, which would allow gfs2_freeze_func() to resume
on all nodes, leaving the filesystem in the thawed state.

It turns out that in freeze_go_sync(), we cannot reliably and safely
freeze the filesystem.  This is primarily because the final unmount of a
filesystem takes a write lock on the s_umount rw semaphore before
calling into gfs2_put_super(), and freeze_go_sync() needs to call
freeze_super() which also takes a write lock on the same semaphore,
causing a deadlock.  We could work around this by trying to take an
active reference on the super block first, which would prevent unmount
from running at the same time.  But that can fail, and freeze_go_sync()
isn't actually allowed to fail.

To get around this, this patch changes the freeze glock locking scheme
as follows:

At mount time, each node takes the freeze glock in shared mode.  To
freeze a filesystem, the initiating node first freezes the filesystem
locally and then drops and re-acquires the freeze glock in exclusive
mode.  All other nodes notice that there is contention on the freeze
glock in their go_callback callbacks, and they schedule
gfs2_freeze_func() to run.  There, they freeze the filesystem locally
and drop and re-acquire the freeze glock before re-thawing the
filesystem.  This is happening outside of the glock state engine, so
there, we are allowed to fail.

From a cluster point of view, taking and immediately dropping a glock is
indistinguishable from taking the glock and only dropping it upon
contention, so this new scheme is compatible with the old one.

Thanks to Li Dong <lidong@vivo.com> for reporting a locking bug in
gfs2_freeze_func() in a previous version of this commit.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-07-03 22:25:02 +02:00