linux/fs
Roman Gushchin 593311e85b writeback, cgroup: do not reparent dax inodes
The inode switching code is not suited for dax inodes.  An attempt to
switch a dax inode to a parent writeback structure (as a part of a
writeback cleanup procedure) results in a panic like this:

  run fstests generic/270 at 2021-07-15 05:54:02
  XFS (pmem0p2): EXPERIMENTAL big timestamp feature in use.  Use at your own risk!
  XFS (pmem0p2): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
  XFS (pmem0p2): EXPERIMENTAL inode btree counters feature in use. Use at your own risk!
  XFS (pmem0p2): Mounting V5 Filesystem
  XFS (pmem0p2): Ending clean mount
  XFS (pmem0p2): Quotacheck needed: Please wait.
  XFS (pmem0p2): Quotacheck: Done.
  XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
  XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
  XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
  BUG: unable to handle page fault for address: 0000000005b0f669
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 13 PID: 10479 Comm: kworker/13:16 Not tainted 5.14.0-rc1-master-8096acd7442e+ #8
  Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 09/13/2016
  Workqueue: inode_switch_wbs inode_switch_wbs_work_fn
  RIP: 0010:inode_do_switch_wbs+0xaf/0x470
  Code: 00 30 0f 85 c1 03 00 00 0f 1f 44 00 00 31 d2 48 c7 c6 ff ff ff ff 48 8d 7c 24 08 e8 eb 49 1a 00 48 85 c0 74 4a bb ff ff ff ff <48> 8b 50 08 48 8d 4a ff 83 e2 01 48 0f 45 c1 48 8b 00 a8 08 0f 85
  RSP: 0018:ffff9c66691abdc8 EFLAGS: 00010002
  RAX: 0000000005b0f661 RBX: 00000000ffffffff RCX: ffff89e6a21382b0
  RDX: 0000000000000001 RSI: ffff89e350230248 RDI: ffffffffffffffff
  RBP: ffff89e681d19400 R08: 0000000000000000 R09: 0000000000000228
  R10: ffffffffffffffff R11: ffffffffffffffc0 R12: ffff89e6a2138130
  R13: ffff89e316af7400 R14: ffff89e316af6e78 R15: ffff89e6a21382b0
  FS:  0000000000000000(0000) GS:ffff89ee5fb40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000005b0f669 CR3: 0000000cb2410004 CR4: 00000000001706e0
  Call Trace:
   inode_switch_wbs_work_fn+0xb6/0x2a0
   process_one_work+0x1e6/0x380
   worker_thread+0x53/0x3d0
   kthread+0x10f/0x130
   ret_from_fork+0x22/0x30
  Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables nfnetlink bridge stp llc rfkill sunrpc intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ipmi_ssif kvm mgag200 i2c_algo_bit iTCO_wdt irqbypass drm_kms_helper iTCO_vendor_support acpi_ipmi rapl syscopyarea sysfillrect intel_cstate ipmi_si sysimgblt ioatdma dax_pmem_compat fb_sys_fops ipmi_devintf device_dax i2c_i801 pcspkr intel_uncore hpilo nd_pmem cec dax_pmem_core dca i2c_smbus acpi_tad lpc_ich ipmi_msghandler acpi_power_meter drm fuse xfs libcrc32c sd_mod t10_pi crct10dif_pclmul crc32_pclmul crc32c_intel tg3 ghash_clmulni_intel serio_raw hpsa hpwdt scsi_transport_sas wmi dm_mirror dm_region_hash dm_log dm_mod
  CR2: 0000000005b0f669
  ---[ end trace ed2105faff8384f3 ]---
  RIP: 0010:inode_do_switch_wbs+0xaf/0x470
  Code: 00 30 0f 85 c1 03 00 00 0f 1f 44 00 00 31 d2 48 c7 c6 ff ff ff ff 48 8d 7c 24 08 e8 eb 49 1a 00 48 85 c0 74 4a bb ff ff ff ff <48> 8b 50 08 48 8d 4a ff 83 e2 01 48 0f 45 c1 48 8b 00 a8 08 0f 85
  RSP: 0018:ffff9c66691abdc8 EFLAGS: 00010002
  RAX: 0000000005b0f661 RBX: 00000000ffffffff RCX: ffff89e6a21382b0
  RDX: 0000000000000001 RSI: ffff89e350230248 RDI: ffffffffffffffff
  RBP: ffff89e681d19400 R08: 0000000000000000 R09: 0000000000000228
  R10: ffffffffffffffff R11: ffffffffffffffc0 R12: ffff89e6a2138130
  R13: ffff89e316af7400 R14: ffff89e316af6e78 R15: ffff89e6a21382b0
  FS:  0000000000000000(0000) GS:ffff89ee5fb40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000005b0f669 CR3: 0000000cb2410004 CR4: 00000000001706e0
  Kernel panic - not syncing: Fatal exception
  Kernel Offset: 0x15200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
  ---[ end Kernel panic - not syncing: Fatal exception ]---

The crash happens on an attempt to iterate over attached pagecache pages
and check the dirty flag: a dax inode's xarray contains pfn's instead of
generic struct page pointers.

This happens for DAX and not for other kinds of non-page entries in the
inodes because it's a tagged iteration, and shadow/swap entries are
never tagged; only DAX entries get tagged.

Fix the problem by bailing out (with the false return value) of
inode_prepare_sbs_switch() if a dax inode is passed.

[willy@infradead.org: changelog addition]

Link: https://lkml.kernel.org/r/20210719171350.3876830-1-guro@fb.com
Fixes: c22d70a162 ("writeback, cgroup: release dying cgwbs by switching attached inodes")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Murphy Zhou <jencce.kernel@gmail.com>
Reported-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Murphy Zhou <jencce.kernel@gmail.com>
Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-23 17:43:28 -07:00
..
9p 9p for 5.13-rc1 2021-05-07 11:18:52 -07:00
adfs mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
affs mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
afs afs: Remove redundant assignment to ret 2021-07-21 15:11:22 +01:00
autofs autofs: should_expire() argument is guaranteed to be positive 2021-03-24 14:14:27 -04:00
befs fs/befs: Delete obsolete TODO file 2021-03-30 16:54:49 -07:00
bfs mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
btrfs for-5.14-rc1-tag 2021-07-13 12:02:07 -07:00
cachefiles fscache, cachefiles: Add alternate API to use kiocb for read/write to cache 2021-04-23 10:14:32 +01:00
ceph ceph: don't WARN if we're still opening a session to an MDS 2021-07-20 17:57:33 +02:00
cifs cifs: do not share tcp sessions of dfs connections 2021-07-16 00:21:47 -05:00
coda coda: fix reference counting in coda_file_mmap error path 2021-04-23 14:42:39 -07:00
configfs configfs: fix the read and write iterators 2021-07-13 20:56:24 +02:00
cramfs cramfs: use %pD instead of messing with file_dentry()->d_name 2021-01-05 23:02:47 -05:00
crypto fscrypt: fix derivation of SipHash keys on big endian CPUs 2021-06-05 00:52:52 -07:00
debugfs Linux 5.13-rc6 2021-06-14 09:07:45 +02:00
devpts
dlm fs: dlm: invalid buffer access in lookup error 2021-06-11 12:44:47 -05:00
ecryptfs mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
efivarfs efivars: convert to fileattr 2021-04-12 15:04:29 +02:00
efs
erofs erofs: clean up file headers & footers 2021-06-08 00:41:24 +08:00
exfat Description for this pull request: 2021-07-06 11:06:04 -07:00
exportfs exportfs: Add a function to return the raw output from fh_to_dentry() 2020-12-09 09:39:38 -05:00
ext2 fs: remove noop_set_page_dirty() 2021-06-29 10:53:48 -07:00
ext4 Ext4 regression and bug fixes for v5.14-rc1 2021-07-09 09:57:27 -07:00
f2fs f2fs: drop dirty node pages when cp is in error status 2021-07-06 22:05:06 -07:00
fat mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
freevxfs
fscache fscache, cachefiles: Add alternate API to use kiocb for read/write to cache 2021-04-23 10:14:32 +01:00
fuse fuse update for 5.14 2021-07-06 11:17:41 -07:00
gfs2 Various minor gfs2 cleanups and fixes 2021-06-29 20:23:08 -07:00
hfs hfs: add lock nesting notation to hfs_find_init 2021-07-15 10:13:49 -07:00
hfsplus hfsplus: report create_date to kstat.btime 2021-07-01 11:06:06 -07:00
hostfs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-05-02 09:14:01 -07:00
hpfs mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
hugetlbfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2021-06-28 20:39:26 -07:00
iomap iomap: Don't create iomap_page objects in iomap_page_mkwrite_actor 2021-07-15 09:58:06 -07:00
isofs isofs: remove redundant continue statement 2021-06-17 17:11:42 +02:00
jbd2 ext4: inline jbd2_journal_[un]register_shrinker() 2021-07-08 08:37:31 -04:00
jffs2 This pull request contains changes for JFFS2, UBI and UBIFS 2021-05-04 18:08:40 -07:00
jfs JFS fixes for 5.14 2021-07-02 14:25:17 -07:00
kernfs Driver core changes for 5.14-rc1 2021-07-05 13:51:41 -07:00
lockd lockd: Update the NLMv4 SHARE results encoder to use struct xdr_stream 2021-07-06 20:14:44 -04:00
minix mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
netfs netfs: fix test for whether we can skip read when writing beyond EOF 2021-06-21 21:24:07 +01:00
nfs NFS client updates for Linux 5.14 2021-07-09 09:43:57 -07:00
nfs_common nfs_common: fix doc warning 2021-07-06 20:14:41 -04:00
nfsd block-5.14-2021-07-08 2021-07-09 12:05:33 -07:00
nilfs2 Merge branch 'akpm' (patches from Andrew) 2021-07-02 12:08:10 -07:00
nls
notify fanotify: fix copy_event_to_user() fid error clean up 2021-06-14 12:16:37 +02:00
ntfs Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-07-03 11:30:04 -07:00
ocfs2 In addition to bug fixes and cleanups, there are two new features for 2021-06-30 19:37:39 -07:00
omfs mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
openpromfs openpromfs: don't do unlock_new_inode() until the new inode is set up 2021-03-12 22:15:22 -05:00
orangefs orangefs: fix orangefs df output. 2021-06-28 08:40:08 -04:00
overlayfs overlayfs update for 5.13 2021-04-30 15:17:08 -07:00
proc Merge branch 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-07-03 11:41:14 -07:00
pstore for-5.14/drivers-2021-06-29 2021-06-30 12:21:16 -07:00
qnx4
qnx6
quota quota: remove unnecessary oom message 2021-06-22 10:40:52 +02:00
ramfs fs: move ramfs_aops to libfs 2021-06-29 10:53:48 -07:00
reiserfs \n 2021-07-01 12:06:39 -07:00
romfs
squashfs squashfs: add option to panic on errors 2021-06-29 10:53:46 -07:00
sysfs sysfs: Support zapping of binary attr mmaps 2021-01-12 14:26:31 +01:00
sysv mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
tracefs tracing: Fix various typos in comments 2021-03-23 14:08:18 -04:00
ubifs ubifs: Set/Clear I_LINKABLE under i_lock for whiteout inode 2021-06-22 09:21:39 +02:00
udf \n 2021-07-01 12:06:39 -07:00
ufs mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
unicode .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
vboxsf vboxsf: Add support for the atomic_open directory-inode op 2021-06-23 14:36:52 +02:00
verity fsverity: relax build time dependency on CRYPTO_SHA256 2021-04-22 17:31:32 +10:00
xfs Fixes for 5.14-rc: 2021-07-18 11:27:25 -07:00
zonefs zonefs: remove redundant null bio check 2021-07-16 13:45:18 +09:00
aio.c Revert "mremap: don't allow MREMAP_DONTUNMAP on special_mappings and aio" 2021-04-30 11:20:39 -07:00
anon_inodes.c fs: anon_inodes: rephrase to appropriate kernel-doc 2021-01-15 12:17:25 -05:00
attr.c ima: handle idmapped mounts 2021-01-24 14:27:20 +01:00
bad_inode.c fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
binfmt_aout.c binfmt: remove in-tree usage of MAP_EXECUTABLE 2021-06-29 10:53:50 -07:00
binfmt_elf_fdpic.c Merge branch 'akpm' (patches from Andrew) 2021-06-29 17:29:11 -07:00
binfmt_elf.c Merge branch 'akpm' (patches from Andrew) 2021-06-29 17:29:11 -07:00
binfmt_em86.c
binfmt_flat.c binfmt: remove in-tree usage of MAP_EXECUTABLE 2021-06-29 10:53:50 -07:00
binfmt_misc.c binfmt_misc: fix possible deadlock in bm_register_write 2021-03-13 11:27:30 -08:00
binfmt_script.c
block_dev.c Char / Misc driver updates for 5.14-rc1 2021-07-05 13:42:16 -07:00
buffer.c mm/writeback: move __set_page_dirty() to core mm 2021-06-29 10:53:48 -07:00
char_dev.c
compat_binfmt_elf.c get rid of COMPAT_ELF_EXEC_PAGESIZE 2021-01-06 08:42:51 -05:00
coredump.c Merge branch 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-07-03 11:41:14 -07:00
d_path.c getcwd(2): clean up error handling 2021-05-18 20:15:58 -04:00
dax.c dax: fix ENOMEM handling in grab_mapping_entry() 2021-06-29 10:53:47 -07:00
dcache.c useful constants: struct qstr for ".." 2021-04-15 22:36:45 -04:00
direct-io.c fs: direct-io: fix missing sdio->boundary 2021-04-09 14:54:23 -07:00
drop_caches.c
eventfd.c
eventpoll.c fs/epoll: restore waking from ep_done_scan() 2021-05-06 19:24:13 -07:00
exec.c Merge branch 'akpm' (patches from Andrew) 2021-07-02 12:08:10 -07:00
fcntl.c fcntl: Fix unreachable code in do_fcntl() 2021-07-12 11:09:13 -05:00
fhandle.c switch file_open_root() to struct path 2021-04-07 13:56:43 -04:00
file_table.c
file.c Merge branch 'work.file' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-05-03 11:05:28 -07:00
filesystems.c
fs_context.c fs: add vfs_parse_fs_param_source() helper 2021-07-14 09:19:06 -07:00
fs_parser.c vfs: fs_parser: clean up kernel-doc warnings 2021-04-30 11:20:35 -07:00
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c writeback, cgroup: do not reparent dax inodes 2021-07-23 17:43:28 -07:00
fsopen.c
init.c init: handle idmapped mounts 2021-01-24 14:27:19 +01:00
inode.c mm: remove nrexceptional from inode: remove BUG_ON 2021-05-05 11:27:20 -07:00
internal.h switch file_open_root() to struct path 2021-04-07 13:56:43 -04:00
io_uring.c io_uring: fix io_drain_req() 2021-07-11 16:39:06 -06:00
io-wq.c io_uring: fix false WARN_ONCE 2021-06-18 09:22:02 -06:00
io-wq.h io_uring: move creds from io-wq work to io_kiocb 2021-06-18 09:22:02 -06:00
ioctl.c vfs: add fileattr ops 2021-04-12 15:04:23 +02:00
Kconfig mm: hugetlb: introduce CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON 2021-06-30 20:47:26 -07:00
Kconfig.binfmt binfmt_flat: allow not offsetting data start 2021-04-19 09:56:37 +10:00
kernel_read_file.c switch file_open_root() to struct path 2021-04-07 13:56:43 -04:00
libfs.c fs: remove noop_set_page_dirty() 2021-06-29 10:53:48 -07:00
locks.c Additional fixes and clean-ups for NFSD since tags/nfsd-5.13, 2021-05-05 13:44:19 -07:00
Makefile netfs: Provide readahead and readpage netfs helpers 2021-04-23 10:14:32 +01:00
mbcache.c
mount.h mount: make {lock,unlock}_mount_hash() static 2021-01-24 14:29:34 +01:00
mpage.c block: rename BIO_MAX_PAGES to BIO_MAX_VECS 2021-03-11 07:47:48 -07:00
namei.c Merge branch 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-07-03 11:41:14 -07:00
namespace.c mount: Support "nosymfollow" in new mount api 2021-06-01 12:09:27 +02:00
no-block.c
nsfs.c
open.c Merge branch 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-07-03 11:41:14 -07:00
pipe.c fs: delete repeated words in comments 2021-02-24 13:38:26 -08:00
pnode.c
pnode.h mount: fix mounting of detached mounts onto targets that reside on shared mounts 2021-03-08 15:18:43 +01:00
posix_acl.c fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
proc_namespace.c fs: introduce MOUNT_ATTR_IDMAP 2021-01-24 14:43:45 +01:00
read_write.c teach sendfile(2) to handle send-to-pipe directly 2021-01-25 23:29:36 -05:00
readdir.c readdir: make sure to verify directory entry for legacy interfaces too 2021-04-17 11:39:49 -07:00
remap_range.c ioctl: handle idmapped mounts 2021-01-24 14:27:19 +01:00
select.c kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data() 2021-03-16 22:13:10 +01:00
seq_file.c seq_file: disallow extremely large seq buffer allocations 2021-07-19 17:18:48 -07:00
signalfd.c signalfd: Remove SIL_PERF_EVENT fields from signalfd_siginfo 2021-05-18 16:20:54 -05:00
splice.c for-5.12/block-2021-02-17 2021-02-21 11:02:48 -08:00
stack.c
stat.c fs: fix reporting supported extra file attributes for statx() 2021-04-17 23:03:50 -04:00
statfs.c s390,alpha: switch to 64-bit ino_t 2021-02-13 17:17:53 +01:00
super.c block: move bd_mutex to struct gendisk 2021-06-01 07:44:32 -06:00
sync.c
timerfd.c
userfaultfd.c userfaultfd: do not untag user pointers 2021-07-23 17:43:28 -07:00
utimes.c utimes: handle idmapped mounts 2021-01-24 14:27:18 +01:00
xattr.c xattr: fix kernel-doc for mnt_userns and vfs xattr helpers 2021-03-23 11:20:26 +01:00