linux

Josef Bacik ca10845a56 btrfs: sysfs: init devices outside of the chunk_mutex		2020-10-07 12:12:19 +02:00

While running btrfs/061, btrfs/073, btrfs/078, or btrfs/178 we hit the
following lockdep splat:

  ======================================================
  WARNING: possible circular locking dependency detected
  5.9.0-rc3+ #4 Not tainted
  ------------------------------------------------------
  kswapd0/100 is trying to acquire lock:
  ffff96ecc22ef4a0 (&delayed_node->mutex){+.+.}-{3:3}, at: __btrfs_release_delayed_node.part.0+0x3f/0x330

  but task is already holding lock:
  ffffffff8dd74700 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #3 (fs_reclaim){+.+.}-{0:0}:
         fs_reclaim_acquire+0x65/0x80
         slab_pre_alloc_hook.constprop.0+0x20/0x200
         kmem_cache_alloc+0x37/0x270
         alloc_inode+0x82/0xb0
         iget_locked+0x10d/0x2c0
         kernfs_get_inode+0x1b/0x130
         kernfs_get_tree+0x136/0x240
         sysfs_get_tree+0x16/0x40
         vfs_get_tree+0x28/0xc0
         path_mount+0x434/0xc00
         __x64_sys_mount+0xe3/0x120
         do_syscall_64+0x33/0x40
         entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #2 (kernfs_mutex){+.+.}-{3:3}:
         __mutex_lock+0x7e/0x7e0
         kernfs_add_one+0x23/0x150
         kernfs_create_link+0x63/0xa0
         sysfs_do_create_link_sd+0x5e/0xd0
         btrfs_sysfs_add_devices_dir+0x81/0x130
         btrfs_init_new_device+0x67f/0x1250
         btrfs_ioctl+0x1ef/0x2e20
         __x64_sys_ioctl+0x83/0xb0
         do_syscall_64+0x33/0x40
         entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #1 (&fs_info->chunk_mutex){+.+.}-{3:3}:
         __mutex_lock+0x7e/0x7e0
         btrfs_chunk_alloc+0x125/0x3a0
         find_free_extent+0xdf6/0x1210
         btrfs_reserve_extent+0xb3/0x1b0
         btrfs_alloc_tree_block+0xb0/0x310
         alloc_tree_block_no_bg_flush+0x4a/0x60
         __btrfs_cow_block+0x11a/0x530
         btrfs_cow_block+0x104/0x220
         btrfs_search_slot+0x52e/0x9d0
         btrfs_insert_empty_items+0x64/0xb0
         btrfs_insert_delayed_items+0x90/0x4f0
         btrfs_commit_inode_delayed_items+0x93/0x140
         btrfs_log_inode+0x5de/0x2020
         btrfs_log_inode_parent+0x429/0xc90
         btrfs_log_new_name+0x95/0x9b
         btrfs_rename2+0xbb9/0x1800
         vfs_rename+0x64f/0x9f0
         do_renameat2+0x320/0x4e0
         __x64_sys_rename+0x1f/0x30
         do_syscall_64+0x33/0x40
         entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #0 (&delayed_node->mutex){+.+.}-{3:3}:
         __lock_acquire+0x119c/0x1fc0
         lock_acquire+0xa7/0x3d0
         __mutex_lock+0x7e/0x7e0
         __btrfs_release_delayed_node.part.0+0x3f/0x330
         btrfs_evict_inode+0x24c/0x500
         evict+0xcf/0x1f0
         dispose_list+0x48/0x70
         prune_icache_sb+0x44/0x50
         super_cache_scan+0x161/0x1e0
         do_shrink_slab+0x178/0x3c0
         shrink_slab+0x17c/0x290
         shrink_node+0x2b2/0x6d0
         balance_pgdat+0x30a/0x670
         kswapd+0x213/0x4c0
         kthread+0x138/0x160
         ret_from_fork+0x1f/0x30

  other info that might help us debug this:

  Chain exists of:
    &delayed_node->mutex --> kernfs_mutex --> fs_reclaim

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(fs_reclaim);
                                 lock(kernfs_mutex);
                                 lock(fs_reclaim);
    lock(&delayed_node->mutex);

   *** DEADLOCK ***

  3 locks held by kswapd0/100:
   #0: ffffffff8dd74700 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
   #1: ffffffff8dd65c50 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x115/0x290
   #2: ffff96ed2ade30e0 (&type->s_umount_key#36){++++}-{3:3}, at: super_cache_scan+0x38/0x1e0

  stack backtrace:
  CPU: 0 PID: 100 Comm: kswapd0 Not tainted 5.9.0-rc3+ #4
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
  Call Trace:
   dump_stack+0x8b/0xb8
   check_noncircular+0x12d/0x150
   __lock_acquire+0x119c/0x1fc0
   lock_acquire+0xa7/0x3d0
   ? __btrfs_release_delayed_node.part.0+0x3f/0x330
   __mutex_lock+0x7e/0x7e0
   ? __btrfs_release_delayed_node.part.0+0x3f/0x330
   ? __btrfs_release_delayed_node.part.0+0x3f/0x330
   ? lock_acquire+0xa7/0x3d0
   ? find_held_lock+0x2b/0x80
   __btrfs_release_delayed_node.part.0+0x3f/0x330
   btrfs_evict_inode+0x24c/0x500
   evict+0xcf/0x1f0
   dispose_list+0x48/0x70
   prune_icache_sb+0x44/0x50
   super_cache_scan+0x161/0x1e0
   do_shrink_slab+0x178/0x3c0
   shrink_slab+0x17c/0x290
   shrink_node+0x2b2/0x6d0
   balance_pgdat+0x30a/0x670
   kswapd+0x213/0x4c0
   ? _raw_spin_unlock_irqrestore+0x41/0x50
   ? add_wait_queue_exclusive+0x70/0x70
   ? balance_pgdat+0x670/0x670
   kthread+0x138/0x160
   ? kthread_create_worker_on_cpu+0x40/0x40
   ret_from_fork+0x1f/0x30

This happens because we are holding the chunk_mutex at the time of
adding in a new device.  However we only need to hold the
device_list_mutex, as we're going to iterate over the fs_devices
devices.  Move the sysfs init stuff outside of the chunk_mutex to get
rid of this lockdep splat.

CC: stable@vger.kernel.org # 4.4.x: `f3cd2c5811`: btrfs: sysfs, rename device_link add/remove functions
CC: stable@vger.kernel.org # 4.4.x
Reported-by: David Sterba <dsterba@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
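
The reordering described in the commit message can be illustrated with a small userspace sketch. This is not the kernel change itself: the pthread mutexes only borrow the lock names from the message, and `sysfs_add_device()`, `add_device_before_fix()`, `add_device_after_fix()`, `chunk_mutex_held` and `nested_count` are hypothetical names made up for this example. It is single-threaded and only counts how often the stand-in for kernfs_mutex is taken while the stand-in for chunk_mutex is held, which is the dependency lockdep reported.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-ins; only the names come from the commit message. */
static pthread_mutex_t device_list_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t chunk_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t kernfs_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical bookkeeping for this single-threaded sketch. */
static bool chunk_mutex_held;
static int nested_count; /* kernfs_mutex acquisitions made under chunk_mutex */

static void lock_chunk(void)
{
	pthread_mutex_lock(&chunk_mutex);
	chunk_mutex_held = true;
}

static void unlock_chunk(void)
{
	chunk_mutex_held = false;
	pthread_mutex_unlock(&chunk_mutex);
}

/* Stand-in for the sysfs init step, which needs kernfs_mutex. */
static void sysfs_add_device(void)
{
	pthread_mutex_lock(&kernfs_mutex);
	if (chunk_mutex_held)
		nested_count++; /* records a chunk_mutex -> kernfs_mutex edge */
	pthread_mutex_unlock(&kernfs_mutex);
}

/* Old ordering: sysfs init runs while chunk_mutex is still held. */
int add_device_before_fix(void)
{
	int start = nested_count;

	pthread_mutex_lock(&device_list_mutex);
	lock_chunk();
	/* device and chunk accounting would happen here */
	sysfs_add_device();
	unlock_chunk();
	pthread_mutex_unlock(&device_list_mutex);
	return nested_count - start;
}

/* New ordering: drop chunk_mutex first, keep only device_list_mutex. */
int add_device_after_fix(void)
{
	int start = nested_count;

	pthread_mutex_lock(&device_list_mutex);
	lock_chunk();
	/* device and chunk accounting would happen here */
	unlock_chunk();
	assert(!chunk_mutex_held);
	sysfs_add_device();
	pthread_mutex_unlock(&device_list_mutex);
	return nested_count - start;
}
```

Each function returns the number of nested acquisitions it caused, so the old ordering reports one chunk_mutex -> kernfs_mutex edge and the new ordering reports none. Built with `-pthread`, the two can be called back to back from a small `main()` to compare.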
..
tests	btrfs: make btrfs_set_extent_delalloc take btrfs_inode	2020-07-27 12:55:35 +02:00
acl.c
async-thread.c	Btrfs: fix crash during unmount due to race with delayed inode workers	2020-03-23 17:01:51 +01:00
async-thread.h	Btrfs: fix crash during unmount due to race with delayed inode workers	2020-03-23 17:01:51 +01:00
backref.c	btrfs: remove unnecessarily shadowed variables	2020-10-07 12:06:55 +02:00
backref.h	btrfs: rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE	2020-05-25 11:25:35 +02:00
block-group.c	btrfs: make read_block_group_item return void	2020-10-07 12:06:56 +02:00
block-group.h	btrfs: convert block group refcount to refcount_t	2020-07-27 12:55:42 +02:00
block-rsv.c	btrfs: rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE	2020-05-25 11:25:35 +02:00
block-rsv.h	btrfs: Remove __ prefix from btrfs_block_rsv_release	2020-03-23 17:01:55 +01:00
btrfs_inode.h	btrfs: convert btrfs_inode_sectorsize to take btrfs_inode	2020-10-07 12:12:18 +02:00
check-integrity.c	btrfs: check-integrity: remove unnecessary failure messages during memory allocation	2020-07-27 12:55:21 +02:00
check-integrity.h	btrfs: remove btrfsic_submit_bh()	2020-03-23 17:01:39 +01:00
compression.c	btrfs: compression: move declarations to header	2020-10-07 12:06:55 +02:00
compression.h	btrfs: compression: move declarations to header	2020-10-07 12:06:55 +02:00
ctree.c	btrfs: use BTRFS_NESTED_NEW_ROOT for double splits	2020-10-07 12:12:17 +02:00
ctree.h	btrfs: convert btrfs_inode_sectorsize to take btrfs_inode	2020-10-07 12:12:18 +02:00
delalloc-space.c	btrfs: add btrfs_reserve_data_bytes and use it	2020-10-07 12:06:52 +02:00
delalloc-space.h	btrfs: make btrfs_delalloc_reserve_space take btrfs_inode	2020-07-27 12:55:36 +02:00
delayed-inode.c	btrfs: qgroup: fix wrong qgroup metadata reserve for delayed inode	2020-10-07 12:12:13 +02:00
delayed-inode.h	btrfs: delayed-inode: Replace zero-length array with flexible-array member	2020-03-23 17:01:53 +01:00
delayed-ref.c	btrfs: Remove __ prefix from btrfs_block_rsv_release	2020-03-23 17:01:55 +01:00
delayed-ref.h
dev-replace.c	btrfs: change nr to u64 in btrfs_start_delalloc_roots	2020-10-07 12:06:50 +02:00
dev-replace.h
dir-item.c
discard.c	btrfs: discard: add missing put when grabbing block group from unused list	2020-07-07 16:06:28 +02:00
discard.h	btrfs: discard: Use the correct style for SPDX License Identifier	2020-04-20 17:43:42 +02:00
disk-io.c	btrfs: introduce BTRFS_NESTING_COW for cow'ing blocks	2020-10-07 12:12:16 +02:00
disk-io.h	btrfs: preallocate anon block device at first phase of snapshot creation	2020-07-27 12:55:38 +02:00
export.c	btrfs: simplify iget helpers	2020-05-25 11:25:37 +02:00
export.h	btrfs: export helpers for subvolume name/id resolution	2020-03-23 17:01:42 +01:00
extent_io.c	btrfs: make extent_fiemap take btrfs_inode	2020-10-07 12:12:19 +02:00
extent_io.h	btrfs: make extent_fiemap take btrfs_inode	2020-10-07 12:12:19 +02:00
extent_map.c	Btrfs: fix race between using extent maps and merging them	2020-02-12 17:16:46 +01:00
extent_map.h	btrfs: remove extent_map::bdev	2019-11-18 23:43:44 +01:00
extent-io-tree.h	btrfs: add owner and fs_info to alloc_state io_tree	2020-10-07 12:06:56 +02:00
extent-tree.c	btrfs: introduce BTRFS_NESTING_COW for cow'ing blocks	2020-10-07 12:12:16 +02:00
file-item.c	btrfs: make btrfs_find_ordered_sum take btrfs_inode	2020-10-07 12:12:19 +02:00
file.c	btrfs: make btrfs_zero_range_check_range_boundary take btrfs_inode	2020-10-07 12:12:19 +02:00
free-space-cache.c	btrfs: delete duplicated words + other fixes in comments	2020-10-07 12:06:50 +02:00
free-space-cache.h	btrfs: let btrfs_return_cluster_to_free_space() return void	2020-07-27 12:55:21 +02:00
free-space-tree.c	btrfs: block-group: fix free-space bitmap threshold	2020-08-27 13:37:54 +02:00
free-space-tree.h	btrfs: rename btrfs_block_group_cache	2019-11-18 17:51:51 +01:00
inode-item.c
inode-map.c	btrfs: make btrfs_delalloc_reserve_space take btrfs_inode	2020-07-27 12:55:36 +02:00
inode-map.h
inode.c	btrfs: make extent_fiemap take btrfs_inode	2020-10-07 12:12:19 +02:00
ioctl.c	btrfs: introduce BTRFS_NESTING_COW for cow'ing blocks	2020-10-07 12:12:16 +02:00
Kconfig	btrfs: switch to iomap for direct IO	2020-10-07 12:06:57 +02:00
locking.c	btrfs: add nesting tags to the locking helpers	2020-10-07 12:12:16 +02:00
locking.h	btrfs: introduce BTRFS_NESTING_NEW_ROOT for adding new roots	2020-10-07 12:12:17 +02:00
lzo.c	btrfs: compression: inline free_workspace	2019-11-18 12:46:59 +01:00
Makefile	Btrfs: move all reflink implementation code into its own file	2020-03-23 17:01:54 +01:00
misc.h	btrfs: rename tree_entry to rb_simple_node and export it	2020-05-25 11:25:19 +02:00
ordered-data.c	btrfs: make btrfs_find_ordered_sum take btrfs_inode	2020-10-07 12:12:19 +02:00
ordered-data.h	btrfs: make btrfs_find_ordered_sum take btrfs_inode	2020-10-07 12:12:19 +02:00
orphan.c
print-tree.c	btrfs: require only sector size alignment for parent eb bytenr	2020-09-07 14:51:05 +02:00
print-tree.h
props.c	btrfs: simplify iget helpers	2020-05-25 11:25:37 +02:00
props.h
qgroup.c	btrfs: delete duplicated words + other fixes in comments	2020-10-07 12:06:50 +02:00
qgroup.h	btrfs: qgroup: export qgroups in sysfs	2020-07-27 12:55:37 +02:00
raid56.c	btrfs: raid56: remove out label in __raid56_parity_recover	2020-07-27 12:55:44 +02:00
raid56.h
rcu-string.h	btrfs: rcu-string: Replace zero-length array with flexible-array member	2020-03-23 17:01:53 +01:00
reada.c	btrfs: switch seed device to list api	2020-10-07 12:06:58 +02:00
ref-verify.c	btrfs: ref-verify: fix memory leak in add_block_entry	2020-07-27 12:55:43 +02:00
ref-verify.h
reflink.c	btrfs: make copy_inline_to_page take btrfs_inode	2020-10-07 12:12:19 +02:00
reflink.h	Btrfs: move all reflink implementation code into its own file	2020-03-23 17:01:54 +01:00
relocation.c	btrfs: introduce BTRFS_NESTING_COW for cow'ing blocks	2020-10-07 12:12:16 +02:00
root-tree.c	btrfs: qgroup: fix qgroup meta rsv leak for subvolume operations	2020-10-07 12:12:13 +02:00
scrub.c	btrfs: scrub: rename ratelimit state varaible to avoid shadowing	2020-10-07 12:06:55 +02:00
send.c	btrfs: send: remove indirect callback parameter for changed_cb	2020-10-07 12:06:55 +02:00
send.h
space-info.c	btrfs: fix possible infinite loop in data async reclaim	2020-10-07 12:06:54 +02:00
space-info.h	btrfs: add btrfs_reserve_data_bytes and use it	2020-10-07 12:06:52 +02:00
struct-funcs.c	btrfs: update documentation of set/get helpers	2020-05-25 11:25:35 +02:00
super.c	btrfs: do async reclaim for data reservations	2020-10-07 12:06:54 +02:00
sysfs.c	btrfs: simplify setting/clearing fs_info to btrfs_fs_devices	2020-10-07 12:06:58 +02:00
sysfs.h	btrfs: remove const from btrfs_feature_set_name	2020-10-07 12:06:55 +02:00
transaction.c	btrfs: introduce BTRFS_NESTING_COW for cow'ing blocks	2020-10-07 12:12:16 +02:00
transaction.h	btrfs: dio iomap DSYNC workaround	2020-10-07 12:06:57 +02:00
tree-checker.c	btrfs: tree-checker: fix the error message for transid error	2020-08-27 14:16:05 +02:00
tree-checker.h
tree-defrag.c	btrfs: remove unused btrfs_root::defrag_trans_start	2020-07-27 12:55:28 +02:00
tree-log.c	btrfs: make fast fsyncs wait only for writeback	2020-10-07 12:06:56 +02:00
tree-log.h	btrfs: make fast fsyncs wait only for writeback	2020-10-07 12:06:56 +02:00
ulist.c
ulist.h
uuid-tree.c	btrfs: simplify root lookup by id	2020-05-25 11:25:36 +02:00
volumes.c	btrfs: sysfs: init devices outside of the chunk_mutex	2020-10-07 12:12:19 +02:00
volumes.h	btrfs: switch seed device to list api	2020-10-07 12:06:58 +02:00
xattr.c
xattr.h
zlib.c	btrfs: use larger zlib buffer for s390 hardware compression	2020-01-31 10:30:40 -08:00
zstd.c	btrfs: compression: inline free_workspace	2019-11-18 12:46:59 +01:00