linux/fs/btrfs
Qu Wenruo 3d078efae6 btrfs: subpage: fix a rare race between metadata endio and eb freeing
[BUG]
There is a very rare ASSERT() triggering during full fstests run for
subpage rw support.

No other reproducer so far.

The ASSERT() gets triggered for metadata read in
btrfs_page_set_uptodate() inside end_page_read().

[CAUSE]
There is still a small race window for metadata only, the race could
happen like this:

                T1                  |              T2
------------------------------------+-----------------------------
end_bio_extent_readpage()           |
|- btrfs_validate_metadata_buffer() |
|  |- free_extent_buffer()          |
|     Still have 2 refs             |
|- end_page_read()                  |
   |- if (unlikely(PagePrivate())   |
   |  The page still has Private    |
   |                                | free_extent_buffer()
   |                                | |  Only one ref 1, will be
   |                                | |  released
   |                                | |- detach_extent_buffer_page()
   |                                |    |- btrfs_detach_subpage()
   |- btrfs_set_page_uptodate()     |
      The page no longer has Private|
      >>> ASSERT() triggered <<<    |

This race window is super small, thus pretty hard to hit, even with so
many runs of fstests.

But the race window is still there, we have to go another way to solve
it other than relying on random PagePrivate() check.

Data path is not affected, as it will lock the page before reading,
while unlocking the page after the last read has finished, thus no race
window.

[FIX]
This patch will fix the bug by repurposing btrfs_subpage::readers.

Now btrfs_subpage::readers will be a member shared by both metadata and
data.

For metadata path, we don't do the page unlock as metadata only relies
on extent locking.

At the same time, teach page_range_has_eb() to take
btrfs_subpage::readers into consideration.

So that even if the last eb of a page gets freed, page::private won't be
detached as long as there still are pending end_page_read() calls.

By this we eliminate the race window, this will slight increase the
metadata memory usage, as the page may not be released as frequently as
usual.  But it should not be a big deal.

The code got introduced in ("btrfs: submit read time repair only for
each corrupted sector"), but the fix is in a separate patch to keep the
problem description and the crash is rare so it should not hurt
bisectability.

Signed-off-by: Qu Wegruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-06-21 15:19:10 +02:00
..
tests idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
acl.c fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
async-thread.c
async-thread.h
backref.c btrfs: move the tree mod log code into its own file 2021-04-19 17:25:16 +02:00
backref.h btrfs: add asserts for deleting backref cache nodes 2021-02-08 22:58:56 +01:00
block-group.c btrfs: make free space cache size consistent across different PAGE_SIZE 2021-06-21 15:19:08 +02:00
block-group.h btrfs: zoned: automatically reclaim zones 2021-04-20 20:46:31 +02:00
block-rsv.c btrfs: introduce mount option rescue=ignorebadroots 2020-12-08 15:53:41 +01:00
block-rsv.h
btrfs_inode.h btrfs: remove stale comment and logic from btrfs_inode_in_log() 2021-04-19 17:25:16 +02:00
check-integrity.c btrfs: integrity-checker: convert block context kmap's to kmap_local_page 2021-04-19 17:25:16 +02:00
check-integrity.h
compression.c btrfs: pass btrfs_inode to btrfs_writepage_endio_finish_ordered() 2021-06-21 15:19:08 +02:00
compression.h btrfs: optimize variables size in btrfs_submit_compressed_write 2021-06-21 15:19:07 +02:00
ctree.c btrfs: always abort the transaction if we abort a trans handle 2021-06-21 15:19:06 +02:00
ctree.h btrfs: make btrfs_set_range_writeback() subpage compatible 2021-06-21 15:19:10 +02:00
delalloc-space.c btrfs: fix parameter description of btrfs_inode_rsv_release/btrfs_delalloc_release_space 2021-02-08 22:58:54 +01:00
delalloc-space.h
delayed-inode.c btrfs: abort transaction if we fail to update the delayed inode 2021-06-21 15:19:05 +02:00
delayed-inode.h btrfs: make btrfs_delayed_update_inode take btrfs_inode 2020-12-08 15:54:10 +01:00
delayed-ref.c btrfs: update debug message when checking seq number of a delayed ref 2021-04-19 17:25:17 +02:00
delayed-ref.h btrfs: only let one thread pre-flush delayed refs in commit 2021-02-08 22:58:56 +01:00
dev-replace.c btrfs: do not initialize dev replace for bad dev root 2021-03-17 19:42:14 +01:00
dev-replace.h btrfs: zoned: mark block groups to copy for device-replace 2021-02-09 02:46:07 +01:00
dir-item.c btrfs: locking: rip out path->leave_spinning 2020-12-08 15:54:02 +01:00
discard.c btrfs: document now parameter of peek_discard_list 2021-02-08 22:58:53 +01:00
discard.h btrfs: cleanup btrfs_discard_update_discardable usage 2020-12-08 15:54:02 +01:00
disk-io.c btrfs: add cancellable chunk relocation support 2021-06-21 15:19:07 +02:00
disk-io.h btrfs: split alloc_log_tree() 2021-02-09 02:46:07 +01:00
export.c btrfs: locking: rip out path->leave_spinning 2020-12-08 15:54:02 +01:00
export.h
extent_io.c btrfs: subpage: fix a rare race between metadata endio and eb freeing 2021-06-21 15:19:10 +02:00
extent_io.h btrfs: rename PagePrivate2 to PageOrdered inside btrfs 2021-06-21 15:19:09 +02:00
extent_map.c btrfs: fix parameter description of btrfs_add_extent_mapping 2021-02-08 22:58:53 +01:00
extent_map.h
extent-io-tree.h btrfs: use fixed width int type for extent_state::state 2020-12-08 15:54:13 +01:00
extent-tree.c btrfs: always abort the transaction if we abort a trans handle 2021-06-21 15:19:06 +02:00
file-item.c btrfs: fix fsync failure and transaction abort after writes to prealloc extents 2021-05-27 23:31:36 +02:00
file.c btrfs: fix the filemap_range_has_page() call in btrfs_punch_hole_lock_range() 2021-06-21 15:19:10 +02:00
free-space-cache.c btrfs: don't set the full sync flag when truncation does not touch extents 2021-06-21 15:19:05 +02:00
free-space-cache.h btrfs: zoned: track unusable bytes for zones 2021-02-09 02:46:03 +01:00
free-space-tree.c btrfs: fix possible free space tree corruption with online conversion 2021-01-25 18:44:37 +01:00
free-space-tree.h
inode-item.c btrfs: locking: rip out path->leave_spinning 2020-12-08 15:54:02 +01:00
inode.c btrfs: don't clear page extent mapped if we're not invalidating the full page 2021-06-21 15:19:10 +02:00
ioctl.c btrfs: add device delete cancel 2021-06-21 15:19:07 +02:00
Kconfig btrfs: switch to iomap for direct IO 2020-10-07 12:06:57 +02:00
locking.c btrfs: remove the recurse parameter from __btrfs_tree_read_lock 2020-12-08 15:54:09 +01:00
locking.h btrfs: remove the recurse parameter from __btrfs_tree_read_lock 2020-12-08 15:54:09 +01:00
lzo.c btrfs: convert kmap to kmap_local_page, simple cases 2021-04-19 17:25:16 +02:00
Makefile btrfs: move the tree mod log code into its own file 2021-04-19 17:25:16 +02:00
misc.h
ordered-data.c btrfs: make page Ordered bit to be subpage compatible 2021-06-21 15:19:10 +02:00
ordered-data.h btrfs: introduce btrfs_lookup_first_ordered_range() 2021-06-21 15:19:08 +02:00
orphan.c
print-tree.c btrfs: print the actual offset in btrfs_root_name 2021-01-07 17:25:05 +01:00
print-tree.h btrfs: print the actual offset in btrfs_root_name 2021-01-07 17:25:05 +01:00
props.c
props.h
qgroup.c btrfs: fix deadlock when cloning inline extents and using qgroups 2021-04-28 20:09:47 +02:00
qgroup.h btrfs: export and rename qgroup_reserve_meta 2021-03-02 16:58:30 +01:00
raid56.c CFI on arm64 series for v5.13-rc1 2021-04-27 10:16:46 -07:00
raid56.h
rcu-string.h
reada.c btrfs: subpage: make readahead work properly 2021-03-16 11:06:21 +01:00
ref-verify.c btrfs: ref-verify: use 'inline void' keyword ordering 2021-03-02 16:55:40 +01:00
ref-verify.h
reflink.c btrfs: reflink: make copy_inline_to_page() to be subpage compatible 2021-06-21 15:19:10 +02:00
reflink.h
relocation.c btrfs: add cancellable chunk relocation support 2021-06-21 15:19:07 +02:00
root-tree.c btrfs: qgroup: fix qgroup meta rsv leak for subvolume operations 2020-10-07 12:12:13 +02:00
scrub.c btrfs: scrub: fix subpage repair error caused by hard coded PAGE_SIZE 2021-06-21 15:19:07 +02:00
send.c btrfs: fix deadlock when cloning inline extents and using qgroups 2021-04-28 20:09:47 +02:00
send.h btrfs: send: avoid copying file data 2020-10-07 12:13:17 +02:00
space-info.c btrfs: handle preemptive delalloc flushing slightly differently 2021-06-21 15:19:04 +02:00
space-info.h btrfs: zoned: track unusable bytes for zones 2021-02-09 02:46:03 +01:00
struct-funcs.c btrfs: handle sectorsize < PAGE_SIZE case for extent buffer accessors 2020-12-09 19:16:10 +01:00
subpage.c btrfs: subpage: fix a rare race between metadata endio and eb freeing 2021-06-21 15:19:10 +02:00
subpage.h btrfs: subpage: fix a rare race between metadata endio and eb freeing 2021-06-21 15:19:10 +02:00
super.c btrfs: always abort the transaction if we abort a trans handle 2021-06-21 15:19:06 +02:00
sysfs.c btrfs: sysfs: fix format string for some discard stats 2021-06-21 15:19:06 +02:00
sysfs.h btrfs: split and refactor btrfs_sysfs_remove_devices_dir 2020-10-07 12:12:21 +02:00
transaction.c btrfs: clear defrag status of a root if starting transaction fails 2021-06-21 15:19:06 +02:00
transaction.h btrfs: always abort the transaction if we abort a trans handle 2021-06-21 15:19:06 +02:00
tree-checker.c btrfs: tree-checker: check for BTRFS_BLOCK_FLAG_FULL_BACKREF being set improperly 2021-04-19 17:25:21 +02:00
tree-checker.h
tree-defrag.c btrfs: locking: remove all the blocking helpers 2020-12-08 15:54:01 +01:00
tree-log.c btrfs: avoid unnecessary logging of xattrs during fast fsyncs 2021-06-21 15:19:07 +02:00
tree-log.h btrfs: make fast fsyncs wait only for writeback 2020-10-07 12:06:56 +02:00
tree-mod-log.c btrfs: fix race when picking most recent mod log operation for an old root 2021-04-20 19:27:17 +02:00
tree-mod-log.h btrfs: add and use helper to get lowest sequence number for the tree mod log 2021-04-19 17:25:17 +02:00
ulist.c
ulist.h
uuid-tree.c btrfs: remove unnecessary casts in printk 2020-12-08 15:53:52 +01:00
volumes.c btrfs: remove the unused parameter @len for btrfs_bio_fits_in_stripe() 2021-06-21 15:19:08 +02:00
volumes.h btrfs: remove the unused parameter @len for btrfs_bio_fits_in_stripe() 2021-06-21 15:19:08 +02:00
xattr.c for-5.12-rc1-tag 2021-03-05 12:21:14 -08:00
xattr.h
zlib.c btrfs: use memzero_page() instead of open coded kmap pattern 2021-05-05 11:27:27 -07:00
zoned.c btrfs: zoned: factor out zoned device lookup 2021-06-21 15:19:05 +02:00
zoned.h btrfs: zoned: factor out zoned device lookup 2021-06-21 15:19:05 +02:00
zstd.c btrfs: use memzero_page() instead of open coded kmap pattern 2021-05-05 11:27:27 -07:00