linux/fs
Dave Chinner b5f17bec12 xfs: log shutdown triggers should only shut down the log
We've got a mess on our hands.

1. xfs_trans_commit() cannot cancel transactions because the mount is
shut down - that causes dirty, aborted, unlogged log items to sit
unpinned in memory and potentially get written to disk before the
log is shut down. Hence xfs_trans_commit() can only abort
transactions when xlog_is_shutdown() is true.

2. xfs_force_shutdown() is used in places to cause the current
modification to be aborted via xfs_trans_commit() because it may be
impractical or impossible to cancel the transaction directly, and
hence xfs_trans_commit() must cancel transactions when
xfs_is_shutdown() is true in this situation. But we can't do that
because of #1.

3. Log IO errors cause log shutdowns by calling xfs_force_shutdown()
to shut down the mount and then the log from log IO completion.

4. xfs_force_shutdown() can result in a log force being issued,
which has to wait for log IO completion before it will mark the log
as shut down. If #3 races with some other shutdown trigger that runs
a log force, we rely on xfs_force_shutdown() silently ignoring #3
and avoiding shutting down the log until the failed log force
completes.

5. To ensure #2 always works, we have to ensure that
xfs_force_shutdown() does not return until the log is shut down.
But in the case of #4, this will result in a deadlock because the
log IO completion will block waiting for a log force to complete
which is blocked waiting for log IO to complete....

So the very first thing we have to do here to untangle this mess is
dissociate log shutdown triggers from mount shutdowns. We already
have xlog_force_shutdown(), which will atomically transition the
log to a shutdown state. Due to internal asserts it cannot be called
multiple times, but that was done simply because the only place that
could call it was xfs_do_force_shutdown() (i.e. the mount shutdown!)
and that could call it once and once only. So the first thing
we do is remove the asserts.
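
As an illustration of why the asserts can go, here is a minimal
userspace sketch of a shutdown transition that is safe to call from
several triggers. It is not the kernel code: the model_* names, the
flag value and the wakeups counter are assumptions made up for this
example. The first caller performs the transition, later callers
see the flag already set and back out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not the kernel's structures. */
#define MODEL_LOG_SHUTDOWN 0x1u

struct model_log {
	atomic_uint opstate;	/* bit flags describing the log state */
	int wakeups;		/* stand-in for "waiters were woken" */
};

static bool model_log_is_shutdown(struct model_log *log)
{
	return (atomic_load(&log->opstate) & MODEL_LOG_SHUTDOWN) != 0;
}

/*
 * Atomically move the log into the shutdown state.
 *
 * Returns true only for the caller that made the transition. Any
 * later caller returns false without side effects, so there is no
 * need to assert that the function runs exactly once.
 */
static bool model_log_force_shutdown(struct model_log *log)
{
	unsigned int old;

	assert(log != NULL);

	/* Set the flag and learn, in one atomic step, whether it was set. */
	old = atomic_fetch_or(&log->opstate, MODEL_LOG_SHUTDOWN);
	if (old & MODEL_LOG_SHUTDOWN)
		return false;	/* another trigger already shut the log down */

	log->wakeups++;		/* one-time work belongs to the winner only */
	return true;
}
```

Calling model_log_force_shutdown() twice on the same log returns
true and then false, and the one-time work runs once.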

We then convert all the internal log shutdown triggers to call
xlog_force_shutdown() directly instead of xfs_force_shutdown(). This
allows the log shutdown triggers to shut down the log without
needing to care about mount based shutdown constraints. This means
we shut down the log independently of the mount and the mount may
not notice this until its next attempt to read or modify metadata.
At that point (e.g. xfs_trans_commit()) it will see that the log is
shut down, error out and shut down the mount.

To ensure that all the unmount behaviours and asserts track
correctly as a result of a log shutdown, propagate the shutdown up
to the mount if it is not already set. This keeps the mount and log
state in sync, and saves a huge amount of hassle where code fails
because of a log shutdown but only checks for mount shutdowns and
hence ends up doing the wrong thing. Cleaning up that mess is
an exercise for another day.
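
The relationship between the two shutdown states can be sketched in
userspace as follows. This is a simplified model under stated
assumptions, not the patch itself: the sketch_* types and functions
are hypothetical, locking is omitted, and -EIO is used only as a
plausible error code. A log shutdown trigger marks the log, then
propagates the state to the mount if it is not already set, so code
that only checks the mount still sees the shutdown.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; field and function names are illustrative. */
struct sketch_mount {
	bool shutdown;
};

struct sketch_log {
	bool shutdown;
	struct sketch_mount *mp;	/* mount that owns this log */
};

/* A check that only looks at the mount, as much existing code does. */
static bool sketch_mount_is_shutdown(const struct sketch_mount *mp)
{
	return mp->shutdown;
}

/* Log shutdown trigger: shuts down the log, then syncs the mount state. */
static void sketch_log_force_shutdown(struct sketch_log *log)
{
	assert(log != NULL && log->mp != NULL);

	if (log->shutdown)
		return;			/* repeated triggers are harmless */
	log->shutdown = true;

	/* Propagate up to the mount if it is not already set. */
	if (!log->mp->shutdown)
		log->mp->shutdown = true;
}

/* Commit path: error out once the log has been shut down. */
static int sketch_trans_commit(struct sketch_log *log)
{
	if (log->shutdown)
		return -EIO;
	return 0;
}
```

With this model, a commit succeeds before the trigger runs, and
after it both sketch_trans_commit() and the mount-only check report
the shutdown.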

This enables us to address the other problems noted above in
followup patches.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-03-29 18:22:01 -07:00
9p Revert "fs/9p: search open fids first" 2022-01-30 22:13:37 +09:00
adfs fs/adfs: remove unneeded variable make code cleaner 2022-01-20 08:52:55 +02:00
affs affs: use bdev_nr_sectors instead of open coding it 2021-10-18 14:43:22 -06:00
afs proc: remove PDE_DATA() completely 2022-01-22 08:33:37 +02:00
autofs autofs: fix wait name hash calculation in autofs_wait() 2021-10-20 21:09:02 -04:00
befs isystem: ship and use stdarg.h 2021-08-19 09:02:55 +09:00
bfs mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
btrfs for-5.17-rc5-tag 2022-02-25 14:08:03 -08:00
cachefiles netfs, cachefiles: Add a method to query presence of data in the cache 2022-02-01 10:29:18 -06:00
ceph ceph: set pool_ns in new inode layout for async creates 2022-01-26 20:17:50 +01:00
cifs cifs: fix confusing unneeded warning message on smb2.1 and earlier 2022-02-16 17:16:49 -06:00
coda coda: bump module version to 7.2 2021-11-09 10:02:51 -08:00
configfs configfs: fix a race in configfs_{,un}register_subsystem() 2022-02-22 18:30:28 +01:00
cramfs cramfs: use bdev_nr_bytes instead of open coding it 2021-10-18 14:43:22 -06:00
crypto fscrypt: improve a few comments 2021-10-25 19:11:50 -07:00
debugfs debugfs: lockdown: Allow reading debugfs files that are not world readable 2022-01-06 15:47:41 +01:00
devpts fsnotify: fix fsnotify hooks in pseudo filesystems 2022-01-24 14:17:02 +01:00
dlm driver core changes for 5.17-rc1 2022-01-12 11:11:34 -08:00
ecryptfs fs: add is_idmapped_mnt() helper 2021-12-03 18:44:06 +01:00
efivarfs
efs
erofs erofs: fix small compressed files inlining 2022-02-04 12:37:12 +08:00
exfat exfat: fix missing REQ_SYNC in exfat_update_bhs() 2022-01-10 11:00:04 +09:00
exportfs
ext2 fsdax: shift partition offset handling into the file systems 2021-12-04 08:58:54 -08:00
ext4 Various bug fixes for ext4 fast commit and inline data handling. Also 2022-02-06 10:34:45 -08:00
f2fs Fix from Christoph Hellwig merging the CONFIG_UNICODE_UTF8_DATA into the 2022-02-01 11:13:24 -08:00
fat FAT: use io_schedule_timeout() instead of congestion_wait() 2022-01-20 08:52:54 +02:00
freevxfs
fscache fscache: Fix the volume collision wait condition 2022-01-21 21:36:28 +00:00
fuse virtio,vdpa,qemu_fw_cfg: features, cleanups, fixes 2022-01-18 10:05:48 +02:00
gfs2 gfs2 fixes: 2022-02-11 11:36:32 -08:00
hfs Merge branch 'akpm' (patches from Andrew) 2021-11-09 10:11:53 -08:00
hfsplus hfsplus: use struct_group_attr() for memcpy() region 2022-01-20 08:52:54 +02:00
hostfs hostfs: Fix writeback of dirty pages 2021-12-21 21:44:27 +01:00
hpfs treewide: Replace open-coded flex arrays in unions 2021-10-18 12:28:53 -07:00
hugetlbfs hugetlbfs: fix off-by-one error in hugetlb_vmdelete_list() 2022-01-15 16:30:30 +02:00
iomap xfs, iomap: limit individual ioend chain lengths in writeback 2022-01-26 09:19:20 -08:00
isofs isofs: Fix out of bound access for corrupted isofs image 2021-10-19 12:51:02 +02:00
jbd2 Various bug fixes for ext4 fast commit and inline data handling. Also 2022-02-06 10:34:45 -08:00
jffs2 Merge branch 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2022-01-17 05:49:30 +02:00
jfs Just one JFS patch 2021-11-03 09:23:25 -07:00
kernfs kernfs: prevent early freeing of root node 2021-12-03 14:36:21 +01:00
ksmbd ksmbd: add support for key exchange 2022-02-04 00:12:22 -06:00
lockd Notable bug fixes: 2022-02-02 10:14:31 -08:00
minix mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
netfs netfs: Make ops->init_rreq() optional 2022-01-21 21:36:28 +00:00
nfs NFS: Do not report writeback errors in nfs_getattr() 2022-02-16 15:15:22 -05:00
nfs_common nfs: Fix kerneldoc warning shown up by W=1 2021-10-04 22:02:17 +01:00
nfsd Notable bug fixes: 2022-02-09 09:56:57 -08:00
nilfs2 Merge branch 'akpm' (patches from Andrew) 2022-01-20 10:41:01 +02:00
nls
notify fanotify: Fix stale file descriptor in copy_event_to_user() 2022-02-01 12:52:07 +01:00
ntfs fs/ntfs/attrib.c: fix one kernel-doc comment 2022-01-15 16:30:24 +02:00
ntfs3 mm: remove cleancache 2022-01-22 08:33:38 +02:00
ocfs2 ocfs2: fix a deadlock when commit trans 2022-01-30 09:56:58 +02:00
omfs mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
openpromfs
orangefs orangefs: Fix the size of a memory allocation in orangefs_bufmap_alloc() 2021-12-31 14:37:43 -05:00
overlayfs overlayfs fixes for 5.17-rc3 2022-02-01 11:23:02 -08:00
proc fs/proc: task_mmu.c: don't read mapcount for migration entry 2022-02-11 17:55:00 -08:00
pstore pstore update for v5.17-rc1 2022-01-10 11:48:37 -08:00
qnx4 qnx4: work around gcc false positive warning bug 2021-09-21 08:36:48 -07:00
qnx6
quota quota: make dquot_quota_sync return errors from ->sync_fs 2022-01-30 08:59:47 -08:00
ramfs Merge branch 'akpm' (patches from Andrew) 2021-11-09 10:11:53 -08:00
reiserfs reiserfs: don't use congestion_wait() 2021-11-18 11:52:22 +01:00
romfs
smbfs_common smb3: add new defines from protocol specification 2022-01-18 16:50:47 -06:00
squashfs squashfs: provide backing_dev_info in order to disable read-ahead 2022-01-15 16:30:24 +02:00
sysfs fs/sysfs/dir.c: replace S_IRWXU|S_IRUGO|S_IXUGO with 0755 sysfs_create_dir_ns() 2021-10-05 16:35:05 +02:00
sysv sysv: use BUILD_BUG_ON instead of runtime check 2021-11-09 10:02:52 -08:00
tracefs tracefs: Set the group ownership in apply_options() not parse_options() 2022-02-25 21:05:04 -05:00
ubifs ubifs: read-only if LEB may always be taken in ubifs_garbage_collect 2021-12-23 22:30:38 +01:00
udf udf: Restore i_lenAlloc when inode expansion fails 2022-01-24 14:45:02 +01:00
ufs isystem: ship and use stdarg.h 2021-08-19 09:02:55 +09:00
unicode Fix from Christoph Hellwig merging the CONFIG_UNICODE_UTF8_DATA into the 2022-02-01 11:13:24 -08:00
vboxsf vboxfs: fix broken legacy mount signature checking 2021-09-27 11:26:21 -07:00
verity fs-verity: fix signed integer overflow with i_size near S64_MAX 2021-09-22 10:56:34 -07:00
xfs xfs: log shutdown triggers should only shut down the log 2022-03-29 18:22:01 -07:00
zonefs zonefs: add MODULE_ALIAS_FS 2021-12-17 16:56:35 +09:00
aio.c aio: move aio sysctl to aio.c 2022-01-22 08:33:34 +02:00
anon_inodes.c fs: add anon_inode_getfile_secure() similar to anon_inode_getfd_secure() 2021-09-19 22:35:37 -04:00
attr.c fs: handle circular mappings correctly 2021-11-17 09:26:09 +01:00
bad_inode.c vfs: add rcu argument to ->get_acl() callback 2021-08-18 22:08:24 +02:00
binfmt_aout.c binfmt: a.out: Fix bogus semicolon 2021-09-05 10:15:05 -07:00
binfmt_elf_fdpic.c coredump: Limit coredumps to a single thread group 2021-10-08 12:06:02 -05:00
binfmt_elf.c fs/binfmt_elf: fix PT_LOAD p_align values for loaders 2022-02-11 17:55:00 -08:00
binfmt_flat.c binfmt: remove in-tree usage of MAP_EXECUTABLE 2021-06-29 10:53:50 -07:00
binfmt_misc.c Fix regression due to "fs: move binfmt_misc sysctl to its own file" 2022-02-09 09:50:02 -08:00
binfmt_script.c
buffer.c fs/buffer: Convert __block_write_begin_int() to take a folio 2021-12-16 15:49:51 -05:00
char_dev.c
compat_binfmt_elf.c
coredump.c fs/coredump: move coredump sysctls into its own file 2022-01-22 08:33:36 +02:00
d_path.c d_path: fix Kernel doc validator complaining 2021-11-06 13:30:32 -07:00
dax.c dax: remove the copy_from_iter and copy_to_iter methods 2021-12-18 08:04:53 -08:00
dcache.c fs: move dcache sysctls to its own file 2022-01-22 08:33:36 +02:00
direct-io.c fs: get rid of the res2 iocb->ki_complete argument 2021-10-25 10:36:24 -06:00
drop_caches.c fs: drop_caches: fix skipping over shadow cache inodes 2021-09-03 09:58:10 -07:00
eventfd.c eventfd: Export eventfd_wake_count to modules 2021-09-06 07:20:56 -04:00
eventpoll.c eventpoll: simplify sysctl declaration with register_sysctl() 2022-01-22 08:33:35 +02:00
exec.c fs/coredump: move coredump sysctls into its own file 2022-01-22 08:33:36 +02:00
fcntl.c Merge branch 'akpm' (patches from Andrew) 2021-09-03 10:08:28 -07:00
fhandle.c
file_table.c fs/file_table: fix adding missing kmemleak_not_leak() 2022-02-17 10:23:19 -08:00
file.c fget: clarify and improve __fget_files() implementation 2021-12-13 10:55:30 -08:00
filesystems.c fs: simplify get_filesystem_list / get_all_fs_names 2021-08-23 01:25:40 -04:00
fs_context.c vfs: fs_context: fix up param length parsing in legacy_parse_param 2022-01-18 09:23:19 +02:00
fs_parser.c fs_parse: allow parameter value to be empty 2021-12-09 14:09:36 -05:00
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c fscache rewrite 2022-01-12 13:45:12 -08:00
fsopen.c
init.c
inode.c fs: move inode sysctls to its own file 2022-01-22 08:33:35 +02:00
internal.h fs/buffer: Convert __block_write_begin_int() to take a folio 2021-12-16 15:49:51 -05:00
io_uring.c io_uring: disallow modification of rsrc_data during quiesce 2022-02-22 09:57:32 -07:00
io-wq.c io_uring-5.17-2022-01-21 2022-01-21 16:07:21 +02:00
io-wq.h Merge branch 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2022-01-17 05:49:30 +02:00
ioctl.c fs/ioctl: remove unnecessary __user annotation 2022-01-15 16:30:25 +02:00
Kconfig ksmbd: add support for key exchange 2022-02-04 00:12:22 -06:00
Kconfig.binfmt binfmt: remove support for em86 (alpha only) 2021-07-25 22:33:03 -07:00
kernel_read_file.c vfs: check fd has read access in kernel_read_file_from_fd() 2021-10-18 20:22:03 -10:00
libfs.c unicode: clean up the Kconfig symbol confusion 2022-01-20 19:57:24 -05:00
locks.c fs: move locking sysctls where they are used 2022-01-22 08:33:36 +02:00
Makefile Fix from Christoph Hellwig merging the CONFIG_UNICODE_UTF8_DATA into the 2022-02-01 11:13:24 -08:00
mbcache.c
mount.h
mpage.c mm: remove cleancache 2022-01-22 08:33:38 +02:00
namei.c \n 2022-01-28 17:51:31 +02:00
namespace.c fs: add kernel doc for mnt_{hold,unhold}_writers() 2022-02-14 08:35:32 +01:00
no-block.c
nsfs.c
open.c fs: support mapped mounts of mapped filesystems 2021-12-05 10:28:57 +01:00
pipe.c fs: move pipe sysctls to is own file 2022-01-22 08:33:36 +02:00
pnode.c
pnode.h
posix_acl.c fs: support mapped mounts of mapped filesystems 2021-12-05 10:28:57 +01:00
proc_namespace.c fs: add is_idmapped_mnt() helper 2021-12-03 18:44:06 +01:00
read_write.c fs: remove leftover comments from mandatory locking removal 2021-10-26 12:20:50 -04:00
readdir.c
remap_range.c fs: Convert vfs_dedupe_file_range_compare to folios 2022-01-08 00:28:41 -05:00
select.c select: Fix indefinitely sleeping task in poll_schedule_timeout() 2022-01-11 09:03:05 -08:00
seq_file.c seq_file: move seq_escape() to a header 2021-11-09 10:02:52 -08:00
signalfd.c Merge branch 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2022-01-17 05:49:30 +02:00
splice.c
stack.c
stat.c fs: add generic helper for filling statx attribute flags 2021-08-17 11:47:43 +02:00
statfs.c
super.c vfs: make freeze_super abort when sync_filesystem returns error 2022-01-30 08:59:47 -08:00
sync.c vfs: make sync_filesystem return errors from ->sync_fs 2022-01-30 08:59:47 -08:00
sysctls.c fs: move namespace sysctls and declare fs base directory 2022-01-22 08:33:36 +02:00
timerfd.c timerfd: Provide timerfd_resume() 2021-08-10 17:57:22 +02:00
userfaultfd.c mm: move anon_vma declarations to linux/mm_inline.h 2022-01-15 16:30:27 +02:00
utimes.c
xattr.c