linux/fs/btrfs
Josef Bacik f4b1363cae btrfs: do not do delalloc reservation under page lock
We ran into a deadlock in production with the fixup worker.  The stack
traces were as follows:

Thread responsible for the writeout, waiting on the page lock

  [<0>] io_schedule+0x12/0x40
  [<0>] __lock_page+0x109/0x1e0
  [<0>] extent_write_cache_pages+0x206/0x360
  [<0>] extent_writepages+0x40/0x60
  [<0>] do_writepages+0x31/0xb0
  [<0>] __writeback_single_inode+0x3d/0x350
  [<0>] writeback_sb_inodes+0x19d/0x3c0
  [<0>] __writeback_inodes_wb+0x5d/0xb0
  [<0>] wb_writeback+0x231/0x2c0
  [<0>] wb_workfn+0x308/0x3c0
  [<0>] process_one_work+0x1e0/0x390
  [<0>] worker_thread+0x2b/0x3c0
  [<0>] kthread+0x113/0x130
  [<0>] ret_from_fork+0x35/0x40
  [<0>] 0xffffffffffffffff

Thread of the fixup worker who is holding the page lock

  [<0>] start_delalloc_inodes+0x241/0x2d0
  [<0>] btrfs_start_delalloc_roots+0x179/0x230
  [<0>] btrfs_alloc_data_chunk_ondemand+0x11b/0x2e0
  [<0>] btrfs_check_data_free_space+0x53/0xa0
  [<0>] btrfs_delalloc_reserve_space+0x20/0x70
  [<0>] btrfs_writepage_fixup_worker+0x1fc/0x2a0
  [<0>] normal_work_helper+0x11c/0x360
  [<0>] process_one_work+0x1e0/0x390
  [<0>] worker_thread+0x2b/0x3c0
  [<0>] kthread+0x113/0x130
  [<0>] ret_from_fork+0x35/0x40
  [<0>] 0xffffffffffffffff

Thankfully the stars have to align just right to hit this.  First you
have to end up in the fixup worker, which is tricky by itself (my
reproducer does DIO reads into a MMAP'ed region, so not a common
operation).  Then you have to have less than a page size of free data
space and 0 unallocated space so you go down the "commit the transaction
to free up pinned space" path.  This was accomplished by a random
balance that was running on the host.  Then you get this deadlock.

I'm still in the process of trying to force the deadlock to happen on
demand, but I've hit other issues.  I can still trigger the fixup worker
path itself so this patch has been tested in that regard, so the normal
case is fine.

Fixes: 87826df0ec ("btrfs: delalloc for page dirtied out-of-band in fixup worker")
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-01-31 14:02:15 +01:00
..
tests btrfs: Correctly handle empty trees in find_first_clear_extent_bit 2020-01-31 14:01:29 +01:00
acl.c btrfs: cleanup btrfs_setxattr_trans and drop transaction parameter 2019-04-29 19:02:44 +02:00
async-thread.c btrfs: add __pure attribute to functions 2019-11-18 12:46:52 +01:00
async-thread.h btrfs: add __pure attribute to functions 2019-11-18 12:46:52 +01:00
backref.c Btrfs: fix deadlock between fiemap and transaction commits 2019-07-30 18:25:12 +02:00
backref.h btrfs: fiemap: preallocate ulists for btrfs_check_shared 2019-07-01 13:34:53 +02:00
block-group.c btrfs: take overcommit into account in inc_block_group_ro 2020-01-31 14:02:01 +01:00
block-group.h btrfs: Move and unexport btrfs_rmap_block 2020-01-23 17:24:34 +01:00
block-rsv.c btrfs: use btrfs_try_granting_tickets in update_global_rsv 2019-09-09 14:59:19 +02:00
block-rsv.h btrfs: migrate the global_block_rsv helpers to block-rsv.c 2019-07-02 12:30:55 +02:00
btrfs_inode.h Btrfs: remove unnecessary delalloc mutex for inodes 2019-11-18 17:51:46 +01:00
check-integrity.c btrfs: remove superfluous BUG_ON() in integrity checks 2020-01-20 16:40:52 +01:00
check-integrity.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
compression.c btrfs: get rid of at_offset parameter to btrfs_lookup_bio_sums() 2020-01-20 16:40:54 +01:00
compression.h btrfs: compression: remove ops pointer from workspace_manager 2019-11-18 12:46:59 +01:00
ctree.c Btrfs: fix race between adding and putting tree mod seq elements and nodes 2020-01-31 14:01:20 +01:00
ctree.h Btrfs: fix race between adding and putting tree mod seq elements and nodes 2020-01-31 14:01:20 +01:00
delalloc-space.c Btrfs: remove unnecessary delalloc mutex for inodes 2019-11-18 17:51:46 +01:00
delalloc-space.h btrfs: migrate the delalloc space stuff to it's own home 2019-07-04 17:26:17 +02:00
delayed-inode.c btrfs: use refcount_inc_not_zero in kill_all_nodes 2019-11-18 12:46:51 +01:00
delayed-inode.h Btrfs: delayed-inode: use rb_first_cached for ins_root and del_root 2018-10-15 17:23:33 +02:00
delayed-ref.c Btrfs: fix race between adding and putting tree mod seq elements and nodes 2020-01-31 14:01:20 +01:00
delayed-ref.h btrfs: migrate the delayed refs rsv code 2019-07-04 17:26:17 +02:00
dev-replace.c btrfs: sysfs, add devid/dev_state kobject and device attributes 2020-01-23 17:24:36 +01:00
dev-replace.h btrfs: add __pure attribute to functions 2019-11-18 12:46:52 +01:00
dir-item.c btrfs: remove unused parameter fs_info from btrfs_extend_item 2019-04-29 19:02:50 +02:00
discard.c btrfs: add correction to handle -1 edge case in async discard 2020-01-20 16:41:01 +01:00
discard.h btrfs: have multiple discard lists 2020-01-20 16:41:00 +01:00
disk-io.c Btrfs: fix race between adding and putting tree mod seq elements and nodes 2020-01-31 14:01:20 +01:00
disk-io.h btrfs: drop create parameter to btrfs_get_extent() 2020-01-20 16:40:55 +01:00
export.c btrfs: drop unused parameter is_new from btrfs_iget 2019-11-18 12:46:52 +01:00
export.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
extent_io.c btrfs: drop the -EBUSY case in __extent_writepage_io 2020-01-31 14:02:11 +01:00
extent_io.h btrfs: drop create parameter to btrfs_get_extent() 2020-01-20 16:40:55 +01:00
extent_map.c btrfs: remove extent_map::bdev 2019-11-18 23:43:44 +01:00
extent_map.h btrfs: remove extent_map::bdev 2019-11-18 23:43:44 +01:00
extent-io-tree.h btrfs: move the failrec tree stuff into extent-io-tree.h 2019-11-18 12:46:47 +01:00
extent-tree.c btrfs: calculate discard delay based on number of extents 2020-01-20 16:40:59 +01:00
file-item.c btrfs: safely advance counter when looking up bio csums 2020-01-20 16:41:01 +01:00
file.c btrfs: drop create parameter to btrfs_get_extent() 2020-01-20 16:40:55 +01:00
free-space-cache.c btrfs: ensure removal of discardable_* in free_bitmap() 2020-01-20 16:41:01 +01:00
free-space-cache.h btrfs: have multiple discard lists 2020-01-20 16:41:00 +01:00
free-space-tree.c btrfs: rename btrfs_block_group_cache 2019-11-18 17:51:51 +01:00
free-space-tree.h btrfs: rename btrfs_block_group_cache 2019-11-18 17:51:51 +01:00
inode-item.c btrfs: Make btrfs_find_name_in_ext_backref return struct btrfs_inode_extref 2019-09-09 14:59:16 +02:00
inode-map.c btrfs: keep track of which extents have been discarded 2020-01-20 16:40:57 +01:00
inode-map.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
inode.c btrfs: do not do delalloc reservation under page lock 2020-01-31 14:02:15 +01:00
ioctl.c btrfs: drop create parameter to btrfs_get_extent() 2020-01-20 16:40:55 +01:00
Kconfig btrfs: add Kconfig dependency for BLAKE2B 2019-12-09 17:56:06 +01:00
locking.c btrfs: document extent buffer locking 2019-11-18 17:51:50 +01:00
locking.h btrfs: move btrfs_unlock_up_safe to other locking functions 2019-11-18 12:46:49 +01:00
lzo.c btrfs: compression: inline free_workspace 2019-11-18 12:46:59 +01:00
Makefile btrfs: add the beginning of async discard, discard workqueue 2020-01-20 16:40:57 +01:00
misc.h btrfs: add 64bit safe helper for power of two checks 2019-11-18 12:46:50 +01:00
ordered-data.c btrfs: make btrfs_ordered_extent naming consistent with btrfs_file_extent_item 2020-01-20 16:40:54 +01:00
ordered-data.h btrfs: make btrfs_ordered_extent naming consistent with btrfs_file_extent_item 2020-01-20 16:40:54 +01:00
orphan.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
print-tree.c btrfs: Remove unneeded semicolon 2020-01-20 16:40:55 +01:00
print-tree.h btrfs: print-tree: debugging output enhancement 2018-04-20 19:18:16 +02:00
props.c btrfs: props: remove unnecessary hash_init() 2019-11-18 12:46:55 +01:00
props.h btrfs: delete unused function btrfs_set_prop_trans 2019-04-29 19:02:54 +02:00
qgroup.c btrfs: qgroup: return ENOTCONN instead of EINVAL when quotas are not enabled 2020-01-20 16:40:50 +01:00
qgroup.h btrfs: rename btrfs_block_group_cache 2019-11-18 17:51:51 +01:00
raid56.c btrfs: remove pointless local variable in lock_stripe_add() 2019-11-18 12:47:00 +01:00
raid56.h btrfs: constify map parameter for nr_parity_stripes and nr_data_stripes 2019-07-01 13:34:58 +02:00
rcu-string.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
reada.c btrfs: rename btrfs_block_group_cache 2019-11-18 17:51:51 +01:00
ref-verify.c btrfs: fix uninitialized ret in ref-verify 2019-10-03 15:00:56 +02:00
ref-verify.h btrfs: ref-verify: Use btrfs_ref to refactor btrfs_ref_tree_mod() 2019-04-29 19:02:49 +02:00
relocation.c btrfs: make btrfs_ordered_extent naming consistent with btrfs_file_extent_item 2020-01-20 16:40:54 +01:00
root-tree.c btrfs: do not delete mismatched root refs 2020-01-08 14:44:24 +01:00
scrub.c btrfs: handle empty block_group removal for async discard 2020-01-20 16:40:57 +01:00
send.c btrfs: send: remove WARN_ON for readonly mount 2019-12-13 14:10:46 +01:00
send.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
space-info.c btrfs: take overcommit into account in inc_block_group_ro 2020-01-31 14:02:01 +01:00
space-info.h btrfs: take overcommit into account in inc_block_group_ro 2020-01-31 14:02:01 +01:00
struct-funcs.c btrfs: tie extent buffer and it's token together 2019-09-09 14:59:16 +02:00
super.c btrfs: add the beginning of async discard, discard workqueue 2020-01-20 16:40:57 +01:00
sysfs.c btrfs: sysfs, add devid/dev_state kobject and device attributes 2020-01-23 17:24:36 +01:00
sysfs.h btrfs: sysfs, add devid/dev_state kobject and device attributes 2020-01-23 17:24:36 +01:00
transaction.c btrfs: set trans->drity in btrfs_commit_transaction 2020-01-23 17:24:37 +01:00
transaction.h btrfs: Rename btrfs_join_transaction_nolock 2019-11-18 12:46:54 +01:00
tree-checker.c btrfs: tree-checker: Verify location key for DIR_ITEM/DIR_INDEX 2020-01-20 16:40:56 +01:00
tree-checker.h btrfs: get fs_info from eb in btrfs_check_chunk_valid 2019-04-29 19:02:39 +02:00
tree-defrag.c btrfs: open code now trivial btrfs_set_lock_blocking 2019-02-25 14:13:27 +01:00
tree-log.c Btrfs: fix infinite loop during fsync after rename operations 2020-01-23 17:24:37 +01:00
tree-log.h btrfs: get fs_info from trans in btrfs_set_log_full_commit 2019-04-29 19:02:41 +02:00
ulist.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
ulist.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
uuid-tree.c btrfs: handle ENOENT in btrfs_uuid_tree_iterate 2019-12-13 14:10:45 +01:00
volumes.c btrfs: Fix split-brain handling when changing FSID to metadata uuid 2020-01-23 17:24:39 +01:00
volumes.h btrfs: sysfs, add devid/dev_state kobject and device attributes 2020-01-23 17:24:36 +01:00
xattr.c Btrfs: fix failure to persist compression property xattr deletion on fsync 2019-06-17 16:37:17 +02:00
xattr.h btrfs: cleanup btrfs_setxattr_trans and drop transaction parameter 2019-04-29 19:02:44 +02:00
zlib.c btrfs: compression: inline free_workspace 2019-11-18 12:46:59 +01:00
zstd.c btrfs: compression: inline free_workspace 2019-11-18 12:46:59 +01:00