linux/fs/btrfs
Qu Wenruo 2749f7ef36 btrfs: subpage: avoid potential deadlock with compression and delalloc
[BUG]
With experimental subpage compression enabled, a simple fsstress can
lead to self deadlock on page 720896:

        mkfs.btrfs -f -s 4k $dev > /dev/null
        mount $dev -o compress $mnt
        $fsstress -p 1 -n 100 -w -d $mnt -v -s 1625511156

[CAUSE]
If we have a file layout looks like below:

	0	32K	64K	96K	128K
	|//|		|///////////////|
	   4K

Then we run delalloc range for the inode, it will:

- Call find_lock_delalloc_range() with @delalloc_start = 0
  Then we got a delalloc range [0, 4K).

  This range will be COWed.

- Call find_lock_delalloc_range() again with @delalloc_start = 4K
  Since find_lock_delalloc_range() never cares whether the range
  is still inside page range [0, 64K), it will return range [64K, 128K).

  This range meets the condition for subpage compression, will go
  through async COW path.

  And async COW path will return @page_started.

  But that @page_started is now for range [64K, 128K), not for range
  [0, 64K).

- writepage_dellloc() returned 1 for page [0, 64K)
  Thus page [0, 64K) will not be unlocked, nor its page dirty status
  will be cleared.

Next time when we try to lock page [0, 64K) we will deadlock, as there
is no one to release page [0, 64K).

This problem will never happen for regular page size as one page only
contains one sector.  After the first find_lock_delalloc_range() call,
the @delalloc_end will go beyond @page_end no matter if we found a
delalloc range or not

Thus this bug only happens for subpage, as now we need multiple runs to
exhaust the delalloc range of a page.

[FIX]
Fix the problem by ensuring the delalloc range we ran at least started
inside @locked_page.

So that we will never get incorrect @page_started.

And to prevent such problem from happening again:

- Make find_lock_delalloc_range() return false if the found range is
  beyond @end value passed in.

  Since @end will be utilized now, add an ASSERT() to ensure we pass
  correct @end into find_lock_delalloc_range().

  This also means, for selftests we needs to populate @end before calling
  find_lock_delalloc_range().

- New ASSERT() in find_lock_delalloc_range()
  Now we will make sure the @start/@end passed in at least covers part
  of the page.

- New ASSERT() in run_delalloc_range()
  To make sure the range at least starts inside @locked page.

- Use @delalloc_start as proper cursor, while @delalloc_end is always
  reset to @page_end.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-26 19:08:05 +02:00
..
tests btrfs: subpage: avoid potential deadlock with compression and delalloc 2021-10-26 19:08:05 +02:00
acl.c overlayfs update for 5.15 2021-09-02 09:21:27 -07:00
async-thread.c Btrfs: fix crash during unmount due to race with delayed inode workers 2020-03-23 17:01:51 +01:00
async-thread.h Btrfs: fix crash during unmount due to race with delayed inode workers 2020-03-23 17:01:51 +01:00
backref.c btrfs: remove ignore_offset argument from btrfs_find_all_roots() 2021-08-23 13:19:01 +02:00
backref.h btrfs: remove ignore_offset argument from btrfs_find_all_roots() 2021-08-23 13:19:01 +02:00
block-group.c btrfs: zoned: add a dedicated data relocation block group 2021-10-26 19:08:01 +02:00
block-group.h btrfs: zoned: implement active zone tracking 2021-10-26 19:07:59 +02:00
block-rsv.c btrfs: introduce mount option rescue=ignorebadroots 2020-12-08 15:53:41 +01:00
block-rsv.h btrfs: Remove __ prefix from btrfs_block_rsv_release 2020-03-23 17:01:55 +01:00
btrfs_inode.h btrfs: keep track of the last logged keys when logging a directory 2021-10-26 19:08:02 +02:00
check-integrity.c btrfs: rename struct btrfs_io_bio to btrfs_bio 2021-10-26 19:08:02 +02:00
check-integrity.h btrfs: remove btrfsic_submit_bh() 2020-03-23 17:01:39 +01:00
compression.c btrfs: subpage: make end_compressed_bio_writeback() compatible 2021-10-26 19:08:04 +02:00
compression.h btrfs: determine stripe boundary at bio allocation time in btrfs_submit_compressed_write 2021-10-26 19:08:04 +02:00
ctree.c btrfs: unexport setup_items_for_insert() 2021-10-26 19:08:03 +02:00
ctree.h btrfs: remove unused function btrfs_bio_fits_in_stripe() 2021-10-26 19:08:04 +02:00
delalloc-space.c btrfs: fix typos in comments 2021-06-22 14:11:57 +02:00
delalloc-space.h btrfs: make btrfs_delalloc_reserve_space take btrfs_inode 2020-07-27 12:55:36 +02:00
delayed-inode.c btrfs: loop only once over data sizes array when inserting an item batch 2021-10-26 19:08:03 +02:00
delayed-inode.h btrfs: make btrfs_delayed_update_inode take btrfs_inode 2020-12-08 15:54:10 +01:00
delayed-ref.c btrfs: fix lock inversion problem when doing qgroup extent tracing 2021-07-22 15:50:07 +02:00
delayed-ref.h btrfs: only let one thread pre-flush delayed refs in commit 2021-02-08 22:58:56 +01:00
dev-replace.c btrfs: fix typos in comments 2021-06-22 14:11:57 +02:00
dev-replace.h btrfs: zoned: mark block groups to copy for device-replace 2021-02-09 02:46:07 +01:00
dir-item.c btrfs: unify lookup return value when dir entry is missing 2021-10-07 22:06:32 +02:00
discard.c btrfs: fix typos in comments 2021-06-22 14:11:57 +02:00
discard.h btrfs: cleanup btrfs_discard_update_discardable usage 2020-12-08 15:54:02 +01:00
disk-io.c btrfs: assert that extent buffers are write locked instead of only locked 2021-10-26 19:08:02 +02:00
disk-io.h btrfs: rename struct btrfs_io_bio to btrfs_bio 2021-10-26 19:08:02 +02:00
export.c btrfs: locking: rip out path->leave_spinning 2020-12-08 15:54:02 +01:00
export.h btrfs: export helpers for subvolume name/id resolution 2020-03-23 17:01:42 +01:00
extent_io.c btrfs: subpage: avoid potential deadlock with compression and delalloc 2021-10-26 19:08:05 +02:00
extent_io.h btrfs: cleanup for extent_write_locked_range() 2021-10-26 19:08:04 +02:00
extent_map.c btrfs: rename btrfs_bio to btrfs_io_context 2021-10-26 19:08:02 +02:00
extent_map.h btrfs: remove extent_map::bdev 2019-11-18 23:43:44 +01:00
extent-io-tree.h btrfs: use fixed width int type for extent_state::state 2020-12-08 15:54:13 +01:00
extent-tree.c btrfs: assert that extent buffers are write locked instead of only locked 2021-10-26 19:08:02 +02:00
file-item.c btrfs: rename struct btrfs_io_bio to btrfs_bio 2021-10-26 19:08:02 +02:00
file.c btrfs: subpage: add bitmap for PageChecked flag 2021-10-26 19:08:03 +02:00
free-space-cache.c btrfs: subpage: add bitmap for PageChecked flag 2021-10-26 19:08:03 +02:00
free-space-cache.h btrfs: zoned: track unusable bytes for zones 2021-02-09 02:46:03 +01:00
free-space-tree.c btrfs: fix possible free space tree corruption with online conversion 2021-01-25 18:44:37 +01:00
free-space-tree.h btrfs: rename btrfs_block_group_cache 2019-11-18 17:51:51 +01:00
inode-item.c btrfs: locking: rip out path->leave_spinning 2020-12-08 15:54:02 +01:00
inode.c btrfs: subpage: avoid potential deadlock with compression and delalloc 2021-10-26 19:08:05 +02:00
ioctl.c btrfs: defrag: enable defrag for subpage case 2021-10-26 19:07:58 +02:00
Kconfig btrfs: disable build on platforms having page size 256K 2021-06-22 14:11:57 +02:00
locking.c btrfs: fix typos in comments 2021-06-22 14:11:57 +02:00
locking.h btrfs: assert that extent buffers are write locked instead of only locked 2021-10-26 19:08:02 +02:00
lzo.c btrfs: subpage: make lzo_compress_pages() compatible 2021-10-26 19:08:05 +02:00
Makefile btrfs: initial fsverity support 2021-08-23 13:19:09 +02:00
misc.h btrfs: use correct header for div_u64 in misc.h 2021-09-07 14:29:50 +02:00
ordered-data.c btrfs: zoned: fix double counting of split ordered extent 2021-09-07 14:30:41 +02:00
ordered-data.h btrfs: remove uptodate parameter from btrfs_dec_test_first_ordered_pending 2021-08-23 13:19:02 +02:00
orphan.c
print-tree.c btrfs: print the actual offset in btrfs_root_name 2021-01-07 17:25:05 +01:00
print-tree.h btrfs: print the actual offset in btrfs_root_name 2021-01-07 17:25:05 +01:00
props.c btrfs: props: change how empty value is interpreted 2021-06-22 14:11:58 +02:00
props.h
qgroup.c btrfs: remove ignore_offset argument from btrfs_find_all_roots() 2021-08-23 13:19:01 +02:00
qgroup.h btrfs: fix lock inversion problem when doing qgroup extent tracing 2021-07-22 15:50:07 +02:00
raid56.c btrfs: remove btrfs_raid_bio::fs_info member 2021-10-26 19:08:03 +02:00
raid56.h btrfs: remove btrfs_raid_bio::fs_info member 2021-10-26 19:08:03 +02:00
rcu-string.h btrfs: rcu-string: Replace zero-length array with flexible-array member 2020-03-23 17:01:53 +01:00
reada.c btrfs: rename btrfs_bio to btrfs_io_context 2021-10-26 19:08:02 +02:00
ref-verify.c btrfs: stop doing GFP_KERNEL memory allocations in the ref verify tool 2021-08-23 13:19:00 +02:00
ref-verify.h
reflink.c btrfs: subpage: add bitmap for PageChecked flag 2021-10-26 19:08:03 +02:00
reflink.h Btrfs: move all reflink implementation code into its own file 2020-03-23 17:01:54 +01:00
relocation.c btrfs: rename setup_extent_mapping in relocation code 2021-10-26 19:08:01 +02:00
root-tree.c btrfs: qgroup: fix qgroup meta rsv leak for subvolume operations 2020-10-07 12:12:13 +02:00
scrub.c btrfs: remove btrfs_raid_bio::fs_info member 2021-10-26 19:08:03 +02:00
send.c btrfs: send: simplify send_create_inode_if_needed 2021-10-25 21:17:16 +02:00
send.h btrfs: send: avoid copying file data 2020-10-07 12:13:17 +02:00
space-info.c btrfs: prevent __btrfs_dump_space_info() to underflow its free space 2021-09-17 19:29:54 +02:00
space-info.h btrfs: rip out btrfs_space_info::total_bytes_pinned 2021-06-22 14:55:25 +02:00
struct-funcs.c btrfs: add special case to setget helpers for 64k pages 2021-08-23 13:18:58 +02:00
subpage.c btrfs: handle page locking in btrfs_page_end_writer_lock with no writers 2021-10-26 19:08:05 +02:00
subpage.h btrfs: rework page locking in __extent_writepage() 2021-10-26 19:08:05 +02:00
super.c btrfs: use latest_dev in btrfs_show_devname 2021-10-26 19:08:00 +02:00
sysfs.c btrfs: sysfs: document structures and their associated files 2021-08-23 13:19:12 +02:00
sysfs.h btrfs: split and refactor btrfs_sysfs_remove_devices_dir 2020-10-07 12:12:21 +02:00
transaction.c btrfs: rework chunk allocation to avoid exhaustion of the system chunk array 2021-07-07 17:42:41 +02:00
transaction.h btrfs: rework chunk allocation to avoid exhaustion of the system chunk array 2021-07-07 17:42:41 +02:00
tree-checker.c btrfs: add ro compat flags to inodes 2021-08-23 13:19:09 +02:00
tree-checker.h
tree-defrag.c btrfs: locking: remove all the blocking helpers 2020-12-08 15:54:01 +01:00
tree-log.c btrfs: use single bulk copy operations when logging directories 2021-10-26 19:08:03 +02:00
tree-log.h btrfs: keep track of the last logged keys when logging a directory 2021-10-26 19:08:02 +02:00
tree-mod-log.c btrfs: fix race when picking most recent mod log operation for an old root 2021-04-20 19:27:17 +02:00
tree-mod-log.h btrfs: add and use helper to get lowest sequence number for the tree mod log 2021-04-19 17:25:17 +02:00
ulist.c
ulist.h
uuid-tree.c btrfs: remove unnecessary casts in printk 2020-12-08 15:53:52 +01:00
verity.c btrfs: fix transaction handle leak after verity rollback failure 2021-09-17 19:29:41 +02:00
volumes.c btrfs: remove btrfs_raid_bio::fs_info member 2021-10-26 19:08:03 +02:00
volumes.h btrfs: rename struct btrfs_io_bio to btrfs_bio 2021-10-26 19:08:02 +02:00
xattr.c btrfs: assert that extent buffers are write locked instead of only locked 2021-10-26 19:08:02 +02:00
xattr.h
zlib.c btrfs: rework btrfs_decompress_buf2page() 2021-08-23 13:19:04 +02:00
zoned.c btrfs: rename btrfs_bio to btrfs_io_context 2021-10-26 19:08:02 +02:00
zoned.h btrfs: zoned: add a dedicated data relocation block group 2021-10-26 19:08:01 +02:00
zstd.c btrfs: rework btrfs_decompress_buf2page() 2021-08-23 13:19:04 +02:00