linux/fs/btrfs
Josef Bacik e076ab2a2c btrfs: shrink delalloc pages instead of full inodes
Commit 38d715f494 ("btrfs: use btrfs_start_delalloc_roots in
shrink_delalloc") cleaned up how we do delalloc shrinking by utilizing
some infrastructure we have in place to flush inodes that we use for
device replace and snapshot.  However this introduced a pretty serious
performance regression.  To reproduce the user untarred the source
tarball of Firefox (360MiB xz compressed/1.5GiB uncompressed), and would
see it take anywhere from 5 to 20 times as long to untar in 5.10
compared to 5.9. This was observed on fast devices (SSD and better) and
not on HDD.

The root cause is because before we would generally use the normal
writeback path to reclaim delalloc space, and for this we would provide
it with the number of pages we wanted to flush.  The referenced commit
changed this to flush that many inodes, which drastically increased the
amount of space we were flushing in certain cases, which severely
affected performance.

We cannot revert this patch unfortunately because of 3d45f221ce
("btrfs: fix deadlock when cloning inline extent and low on free
metadata space") which requires the ability to skip flushing inodes that
are being cloned in certain scenarios, which means we need to keep using
our flushing infrastructure or risk re-introducing the deadlock.

Instead to fix this problem we can go back to providing
btrfs_start_delalloc_roots with a number of pages to flush, and then set
up a writeback_control and utilize sync_inode() to handle the flushing
for us.  This gives us the same behavior we had prior to the fix, while
still allowing us to avoid the deadlock that was fixed by Filipe.  I
redid the users original test and got the following results on one of
our test machines (256GiB of ram, 56 cores, 2TiB Intel NVMe drive)

  5.9		0m54.258s
  5.10		1m26.212s
  5.10+patch	0m38.800s

5.10+patch is significantly faster than plain 5.9 because of my patch
series "Change data reservations to use the ticketing infra" which
contained the patch that introduced the regression, but generally
improved the overall ENOSPC flushing mechanisms.

Additional testing on consumer-grade SSD (8GiB ram, 8 CPU) confirm
the results:

  5.10.5            4m00s
  5.10.5+patch      1m08s
  5.11-rc2	    5m14s
  5.11-rc2+patch    1m30s

Reported-by: René Rebe <rene@exactcode.de>
Fixes: 38d715f494 ("btrfs: use btrfs_start_delalloc_roots in shrink_delalloc")
CC: stable@vger.kernel.org # 5.10
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Tested-by: David Sterba <dsterba@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add my test results ]
Signed-off-by: David Sterba <dsterba@suse.com>
2021-01-08 16:36:44 +01:00
..
tests btrfs: tests: initialize test inodes location 2020-12-18 14:59:49 +01:00
acl.c
async-thread.c Btrfs: fix crash during unmount due to race with delayed inode workers 2020-03-23 17:01:51 +01:00
async-thread.h Btrfs: fix crash during unmount due to race with delayed inode workers 2020-03-23 17:01:51 +01:00
backref.c btrfs: pass root owner to read_tree_block 2020-12-08 15:54:07 +01:00
backref.h btrfs: rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE 2020-05-25 11:25:35 +02:00
block-group.c btrfs: skip space_cache v1 setup when not using it 2020-12-09 19:16:09 +01:00
block-group.h btrfs: load free space cache asynchronously 2020-12-08 15:54:03 +01:00
block-rsv.c btrfs: introduce mount option rescue=ignorebadroots 2020-12-08 15:53:41 +01:00
block-rsv.h btrfs: Remove __ prefix from btrfs_block_rsv_release 2020-03-23 17:01:55 +01:00
btrfs_inode.h btrfs: fix deadlock when cloning inline extent and low on free metadata space 2020-12-18 14:49:50 +01:00
check-integrity.c btrfs: drop casts of bio bi_sector 2020-12-09 19:16:05 +01:00
check-integrity.h btrfs: remove btrfsic_submit_bh() 2020-03-23 17:01:39 +01:00
compression.c btrfs: refactor btrfs_lookup_bio_sums to handle out-of-order bvecs 2020-12-09 19:16:11 +01:00
compression.h btrfs: compression: move declarations to header 2020-10-07 12:06:55 +02:00
ctree.c btrfs: correctly calculate item size used when item key collision happens 2020-12-18 14:50:00 +01:00
ctree.h btrfs: fix race between RO remount and the cleaner task 2020-12-18 15:00:02 +01:00
delalloc-space.c btrfs: add btrfs_reserve_data_bytes and use it 2020-10-07 12:06:52 +02:00
delalloc-space.h btrfs: make btrfs_delalloc_reserve_space take btrfs_inode 2020-07-27 12:55:36 +02:00
delayed-inode.c btrfs: make btrfs_delayed_update_inode take btrfs_inode 2020-12-08 15:54:10 +01:00
delayed-inode.h btrfs: make btrfs_delayed_update_inode take btrfs_inode 2020-12-08 15:54:10 +01:00
delayed-ref.c btrfs: Remove __ prefix from btrfs_block_rsv_release 2020-03-23 17:01:55 +01:00
delayed-ref.h btrfs: migrate the delayed refs rsv code 2019-07-04 17:26:17 +02:00
dev-replace.c btrfs: fix deadlock when cloning inline extent and low on free metadata space 2020-12-18 14:49:50 +01:00
dev-replace.h btrfs: add __pure attribute to functions 2019-11-18 12:46:52 +01:00
dir-item.c btrfs: locking: rip out path->leave_spinning 2020-12-08 15:54:02 +01:00
discard.c btrfs: merge critical sections of discard lock in workfn 2020-12-18 14:59:54 +01:00
discard.h btrfs: cleanup btrfs_discard_update_discardable usage 2020-12-08 15:54:02 +01:00
disk-io.c btrfs: print the actual offset in btrfs_root_name 2021-01-07 17:25:05 +01:00
disk-io.h btrfs: rename bio_offset of extent_submit_bio_start_t to dio_file_offset 2020-12-09 19:16:09 +01:00
export.c btrfs: locking: rip out path->leave_spinning 2020-12-08 15:54:02 +01:00
export.h btrfs: export helpers for subvolume name/id resolution 2020-03-23 17:01:42 +01:00
extent_io.c btrfs: prevent NULL pointer dereference in extent_io_tree_panic 2021-01-07 17:25:05 +01:00
extent_io.h btrfs: update num_extent_pages to support subpage sized extent buffer 2020-12-09 19:16:10 +01:00
extent_map.c Btrfs: fix race between using extent maps and merging them 2020-02-12 17:16:46 +01:00
extent_map.h btrfs: remove extent_map::bdev 2019-11-18 23:43:44 +01:00
extent-io-tree.h btrfs: use fixed width int type for extent_state::state 2020-12-08 15:54:13 +01:00
extent-tree.c btrfs: correctly calculate item size used when item key collision happens 2020-12-18 14:50:00 +01:00
file-item.c btrfs: correctly calculate item size used when item key collision happens 2020-12-18 14:50:00 +01:00
file.c btrfs: disable fallocate in ZONED mode 2020-12-09 19:16:04 +01:00
free-space-cache.c btrfs: remove free space items when disabling space cache v1 2020-12-09 19:16:09 +01:00
free-space-cache.h btrfs: remove free space items when disabling space cache v1 2020-12-09 19:16:09 +01:00
free-space-tree.c btrfs: locking: rip out path->leave_spinning 2020-12-08 15:54:02 +01:00
free-space-tree.h btrfs: rename btrfs_block_group_cache 2019-11-18 17:51:51 +01:00
inode-item.c btrfs: locking: rip out path->leave_spinning 2020-12-08 15:54:02 +01:00
inode.c btrfs: shrink delalloc pages instead of full inodes 2021-01-08 16:36:44 +01:00
ioctl.c btrfs: fix deadlock when cloning inline extent and low on free metadata space 2020-12-18 14:49:50 +01:00
Kconfig btrfs: switch to iomap for direct IO 2020-10-07 12:06:57 +02:00
locking.c btrfs: remove the recurse parameter from __btrfs_tree_read_lock 2020-12-08 15:54:09 +01:00
locking.h btrfs: remove the recurse parameter from __btrfs_tree_read_lock 2020-12-08 15:54:09 +01:00
lzo.c btrfs: compression: inline free_workspace 2019-11-18 12:46:59 +01:00
Makefile btrfs: remove inode number cache feature 2020-12-09 19:16:05 +01:00
misc.h btrfs: rename tree_entry to rb_simple_node and export it 2020-05-25 11:25:19 +02:00
ordered-data.c btrfs: remove btrfs_find_ordered_sum call from btrfs_lookup_bio_sums 2020-12-09 19:16:10 +01:00
ordered-data.h btrfs: remove btrfs_find_ordered_sum call from btrfs_lookup_bio_sums 2020-12-09 19:16:10 +01:00
orphan.c
print-tree.c btrfs: print the actual offset in btrfs_root_name 2021-01-07 17:25:05 +01:00
print-tree.h btrfs: print the actual offset in btrfs_root_name 2021-01-07 17:25:05 +01:00
props.c btrfs: simplify iget helpers 2020-05-25 11:25:37 +02:00
props.h
qgroup.c btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan 2020-12-18 14:59:57 +01:00
qgroup.h btrfs: qgroup: export qgroups in sysfs 2020-07-27 12:55:37 +02:00
raid56.c btrfs: drop casts of bio bi_sector 2020-12-09 19:16:05 +01:00
raid56.h btrfs: constify map parameter for nr_parity_stripes and nr_data_stripes 2019-07-01 13:34:58 +02:00
rcu-string.h btrfs: rcu-string: Replace zero-length array with flexible-array member 2020-03-23 17:01:53 +01:00
reada.c btrfs: pass the owner_root and level to alloc_extent_buffer 2020-12-08 15:54:07 +01:00
ref-verify.c btrfs: use btrfs_read_node_slot in walk_down_tree 2020-12-08 15:54:06 +01:00
ref-verify.h
reflink.c btrfs: fix deadlock when cloning inline extent and low on free metadata space 2020-12-18 14:49:50 +01:00
reflink.h Btrfs: move all reflink implementation code into its own file 2020-03-23 17:01:54 +01:00
relocation.c btrfs: reloc: fix wrong file extent type check to avoid false ENOENT 2021-01-07 17:25:05 +01:00
root-tree.c btrfs: qgroup: fix qgroup meta rsv leak for subvolume operations 2020-10-07 12:12:13 +02:00
scrub.c btrfs: scrub: allow scrub to work with subpage sectorsize 2020-12-09 19:16:11 +01:00
send.c btrfs: send: fix wrong file path when there is an inode with a pending rmdir 2020-12-18 14:50:16 +01:00
send.h btrfs: send: avoid copying file data 2020-10-07 12:13:17 +02:00
space-info.c btrfs: shrink delalloc pages instead of full inodes 2021-01-08 16:36:44 +01:00
space-info.h btrfs: add btrfs_reserve_data_bytes and use it 2020-10-07 12:06:52 +02:00
struct-funcs.c btrfs: handle sectorsize < PAGE_SIZE case for extent buffer accessors 2020-12-09 19:16:10 +01:00
super.c btrfs: run delayed iputs when remounting RO to avoid leaking them 2020-12-18 15:00:08 +01:00
sysfs.c btrfs: introduce ZONED feature flag 2020-12-08 15:54:16 +01:00
sysfs.h btrfs: split and refactor btrfs_sysfs_remove_devices_dir 2020-10-07 12:12:21 +02:00
transaction.c btrfs: keep sb cache_generation consistent with space_cache 2020-12-09 19:16:08 +01:00
transaction.h btrfs: return bool from btrfs_should_end_transaction 2020-12-08 15:54:16 +01:00
tree-checker.c btrfs: tree-checker: check if chunk item end overflows 2021-01-07 17:25:05 +01:00
tree-checker.h
tree-defrag.c btrfs: locking: remove all the blocking helpers 2020-12-08 15:54:01 +01:00
tree-log.c btrfs: do not block inode logging for so long during transaction commit 2020-12-09 19:16:07 +01:00
tree-log.h btrfs: make fast fsyncs wait only for writeback 2020-10-07 12:06:56 +02:00
ulist.c
ulist.h
uuid-tree.c btrfs: remove unnecessary casts in printk 2020-12-08 15:53:52 +01:00
volumes.c btrfs: fix race between RO remount and the cleaner task 2020-12-18 15:00:02 +01:00
volumes.h btrfs: get zone information of zoned block devices 2020-12-09 19:15:57 +01:00
xattr.c btrfs: skip unnecessary searches for xattrs when logging an inode 2020-12-08 15:54:12 +01:00
xattr.h
zlib.c btrfs: use larger zlib buffer for s390 hardware compression 2020-01-31 10:30:40 -08:00
zoned.c btrfs: implement log-structured superblock for ZONED mode 2020-12-09 19:16:04 +01:00
zoned.h btrfs: implement log-structured superblock for ZONED mode 2020-12-09 19:16:04 +01:00
zstd.c btrfs: compression: inline free_workspace 2019-11-18 12:46:59 +01:00