linux/fs/xfs
Darrick J. Wong b0ffe661fa xfs: fix performance problems when fstrimming a subset of a fragmented AG
On a 10TB filesystem where the free space in each AG is heavily
fragmented, I noticed some very high runtimes on a FITRIM call for the
entire filesystem.  xfs_scrub likes to report progress information on
each phase of the scrub, which means that a strace for the entire
filesystem:

ioctl(3, FITRIM, {start=0x0, len=10995116277760, minlen=0}) = 0 <686.209839>

shows that scrub is uncommunicative for the entire duration.  Reducing
the size of the FITRIM requests to a single AG at a time produces lower
times for each individual call, but even this isn't quite acceptable,
because the time between progress reports is still very high:

Strace for the first 4x 1TB AGs looks like:
ioctl(3, FITRIM, {start=0x0, len=1099511627776, minlen=0}) = 0 <68.352033>
ioctl(3, FITRIM, {start=0x10000000000, len=1099511627776, minlen=0}) = 0 <68.760323>
ioctl(3, FITRIM, {start=0x20000000000, len=1099511627776, minlen=0}) = 0 <67.235226>
ioctl(3, FITRIM, {start=0x30000000000, len=1099511627776, minlen=0}) = 0 <69.465744>

I then had the idea to limit the length parameter of each call to a
smallish amount (~11GB) so that we could report progress relatively
quickly, but much to my surprise, each FITRIM call still took ~68
seconds!
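
As a rough illustration of the chunking idea above, the sketch below
cuts a byte range into fixed-size FITRIM requests.  The names trim_req
and trim_req_at are hypothetical and are not taken from xfs_scrub; the
struct only mirrors the start/len fields seen in the strace output.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical request descriptor; mirrors the start/len of a FITRIM call. */
struct trim_req {
	uint64_t start;	/* byte offset where the request begins */
	uint64_t len;	/* number of bytes covered by the request */
};

/*
 * Compute the idx'th request when the range [0, total) is cut into
 * pieces of at most @chunk bytes.  Returns 1 and fills @out if such a
 * request exists, or 0 once idx is past the last piece.
 */
int trim_req_at(uint64_t total, uint64_t chunk, uint64_t idx,
		struct trim_req *out)
{
	uint64_t nr;

	assert(chunk > 0);

	/* Number of pieces, rounding up for a short final piece. */
	nr = total / chunk + (total % chunk != 0);
	if (idx >= nr)
		return 0;

	out->start = idx * chunk;
	out->len = total - out->start < chunk ? total - out->start : chunk;
	return 1;
}
```

A caller would loop over idx, copy each result into a struct
fstrim_range (from <linux/fs.h>) with minlen set to 0, issue
ioctl(fd, FITRIM, &range), and report progress between calls.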

Unfortunately, the by-length fstrim implementation handles this poorly
because it walks the entire free space by length index (cntbt), which is
a very inefficient way to walk a subset of the blocks of an AG.

Therefore, create a second implementation that will walk the bnobt and
perform the trims in block number order.  This implementation avoids the
worst problems of the original code, though it lacks the desirable
attribute of freeing the biggest chunks first.

On the other hand, this second implementation makes it much easier to
constrain the system call latency and to report fstrim progress to
anyone who's running xfs_scrub.
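
The difference between the two walks can be shown with a toy model.
This is not the kernel code: it uses plain sorted arrays in place of
the cntbt and bnobt, and the names free_extent, visit_by_length and
visit_by_bno are made up for the example.  It assumes the free extents
do not overlap and that the trim range is the half-open [start, end).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy free space record: starting block and length in blocks. */
struct free_extent {
	uint64_t bno;
	uint64_t len;
};

/*
 * Walk an array sorted by ascending length, largest extent first.
 * Length order says nothing about position, so every record has to
 * be examined.  Returns the number of records examined.
 */
size_t visit_by_length(const struct free_extent *by_len, size_t n,
		       uint64_t start, uint64_t end, size_t *trimmed)
{
	size_t visited = 0;

	*trimmed = 0;
	for (size_t i = n; i-- > 0; ) {
		visited++;
		if (by_len[i].bno + by_len[i].len <= start ||
		    by_len[i].bno >= end)
			continue;	/* outside the requested range */
		(*trimmed)++;
	}
	return visited;
}

/*
 * Walk an array sorted by block number.  Binary search for the first
 * extent that ends after @start, then stop at the first extent that
 * begins at or after @end.  Returns the number of records examined
 * after the lookup.
 */
size_t visit_by_bno(const struct free_extent *by_bno, size_t n,
		    uint64_t start, uint64_t end, size_t *trimmed)
{
	size_t lo = 0, hi = n, visited = 0;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (by_bno[mid].bno + by_bno[mid].len <= start)
			lo = mid + 1;
		else
			hi = mid;
	}

	*trimmed = 0;
	for (size_t i = lo; i < n; i++) {
		visited++;
		if (by_bno[i].bno >= end)
			break;		/* past the requested range */
		(*trimmed)++;
	}
	return visited;
}
```

In this model the by-length walk costs the same no matter how small
the requested range is, while the by-block walk only examines the
records inside the range plus one.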

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2024-04-15 14:59:00 -07:00
..
libxfs xfs: pin inodes that would otherwise overflow link count 2024-04-15 14:58:59 -07:00
scrub xfs: create subordinate scrub contexts for xchk_metadata_inode_subtype 2024-04-15 14:59:00 -07:00
Kconfig xfs: support in-memory btrees 2024-02-22 12:43:35 -08:00
Makefile xfs: online repair of symbolic links 2024-04-15 14:58:58 -07:00
xfs_acl.c xfs: convert kmem_free() for kvmalloc users to kvfree() 2024-02-13 18:07:34 +05:30
xfs_acl.h
xfs_aops.c xfs: don't use current->journal_info 2024-03-25 10:21:01 +05:30
xfs_aops.h
xfs_attr_inactive.c xfs: report dir/attr block corruption errors to the health system 2024-02-22 12:32:18 -08:00
xfs_attr_item.c xfs: add an explicit owner field to xfs_da_args 2024-04-15 14:58:50 -07:00
xfs_attr_item.h
xfs_attr_list.c xfs: validate dabtree node buffer owners 2024-04-15 14:58:51 -07:00
xfs_bio_io.c
xfs_bmap_item.c xfs: support recovering bmap intent items targetting realtime extents 2024-02-22 12:44:24 -08:00
xfs_bmap_item.h xfs: move xfs_bmap_defer_add to xfs_bmap_item.c 2024-02-22 12:44:21 -08:00
xfs_bmap_util.c xfs: hoist multi-fsb allocation unit detection to a helper 2024-04-15 14:54:11 -07:00
xfs_bmap_util.h xfs: indicate if xfs_bmap_adjacent changed ap->blkno 2023-12-22 11:18:11 +05:30
xfs_buf_item_recover.c xfs: convert remaining kmem_free() to kfree() 2024-02-13 18:07:34 +05:30
xfs_buf_item.c xfs: convert remaining kmem_free() to kfree() 2024-02-13 18:07:34 +05:30
xfs_buf_item.h
xfs_buf_mem.c xfs: fix dev_t usage in xmbuf tracepoints 2024-03-15 10:30:23 +05:30
xfs_buf_mem.h xfs: launder in-memory btree buffers before transaction commit 2024-02-22 12:43:36 -08:00
xfs_buf.c xfs: repair extended attributes 2024-04-15 14:58:53 -07:00
xfs_buf.h New code for 6.9: 2024-03-13 13:52:24 -07:00
xfs_dahash_test.c
xfs_dahash_test.h
xfs_dir2_readdir.c xfs: validate explicit directory block buffer owners 2024-04-15 14:58:52 -07:00
xfs_discard.c xfs: fix performance problems when fstrimming a subset of a fragmented AG 2024-04-15 14:59:00 -07:00
xfs_discard.h xfs: move log discard work to xfs_discard.c 2023-10-04 09:24:02 +11:00
xfs_dquot_item_recover.c xfs: dquot recovery does not validate the recovered dquot 2023-11-22 23:39:36 +05:30
xfs_dquot_item.c
xfs_dquot_item.h
xfs_dquot.c xfs: quota radix tree allocations need to be NOFS on insert 2024-03-15 10:30:23 +05:30
xfs_dquot.h xfs: repair quotas 2023-12-15 10:03:45 -08:00
xfs_drain.c
xfs_drain.h
xfs_error.c xfs: add error injection to test file mapping exchange recovery 2024-04-15 14:54:19 -07:00
xfs_error.h
xfs_exchmaps_item.c xfs: capture inode generation numbers in the ondisk exchmaps log item 2024-04-15 14:54:24 -07:00
xfs_exchmaps_item.h xfs: create deferred log items for file mapping exchanges 2024-04-15 14:54:17 -07:00
xfs_exchrange.c xfs: support non-power-of-two rtextsize with exchange-range 2024-04-15 14:54:23 -07:00
xfs_exchrange.h xfs: create deferred log items for file mapping exchanges 2024-04-15 14:54:17 -07:00
xfs_export.c xfs: hide private inodes from bulkstat and handle functions 2024-04-15 14:58:48 -07:00
xfs_export.h
xfs_extent_busy.c xfs: convert remaining kmem_free() to kfree() 2024-02-13 18:07:34 +05:30
xfs_extent_busy.h xfs: repair free space btrees 2023-12-15 10:03:32 -08:00
xfs_extfree_item.c xfs: convert remaining kmem_free() to kfree() 2024-02-13 18:07:34 +05:30
xfs_extfree_item.h
xfs_file.c xfs: refactor non-power-of-two alignment checks 2024-04-15 14:54:12 -07:00
xfs_file.h xfs: create a new helper to return a file's allocation unit 2024-04-15 14:54:10 -07:00
xfs_filestream.c xfs: convert remaining kmem_free() to kfree() 2024-02-13 18:07:34 +05:30
xfs_filestream.h
xfs_fsmap.c xfs: split xfs_allocbt_init_cursor 2024-02-22 12:40:12 -08:00
xfs_fsmap.h
xfs_fsops.c New code for 6.8: 2024-01-10 08:45:22 -08:00
xfs_fsops.h xfs: clean up xfs_fsops.h 2023-12-07 14:51:07 +05:30
xfs_globals.c xfs: add debug knobs to control btree bulk load slack factors 2023-12-15 10:03:28 -08:00
xfs_health.c xfs: support in-memory btrees 2024-02-22 12:43:35 -08:00
xfs_hooks.c xfs: allow scrub to hook metadata updates in other writers 2024-02-22 12:30:45 -08:00
xfs_hooks.h xfs: allow scrub to hook metadata updates in other writers 2024-02-22 12:30:45 -08:00
xfs_icache.c xfs: don't use current->journal_info 2024-03-25 10:21:01 +05:30
xfs_icache.h xfs: use per-mount cpumask to track nonempty percpu inodegc lists 2023-09-11 08:39:03 -07:00
xfs_icreate_item.c xfs: convert kmem_free() for kvmalloc users to kvfree() 2024-02-13 18:07:34 +05:30
xfs_icreate_item.h
xfs_inode_item_recover.c xfs: convert remaining kmem_free() to kfree() 2024-02-13 18:07:34 +05:30
xfs_inode_item.c xfs: Replace xfs_isilocked with xfs_assert_ilocked 2024-02-19 21:19:33 +05:30
xfs_inode_item.h xfs: fix AGF vs inode cluster buffer deadlock 2023-06-05 04:08:27 +10:00
xfs_inode.c xfs: pin inodes that would otherwise overflow link count 2024-04-15 14:58:59 -07:00
xfs_inode.h xfs: check AGI unlinked inode buckets 2024-04-15 14:58:58 -07:00
xfs_ioctl32.c
xfs_ioctl32.h arch: Remove Itanium (IA-64) architecture 2023-09-11 08:13:17 +00:00
xfs_ioctl.c xfs: introduce new file range exchange ioctl 2024-04-15 14:54:14 -07:00
xfs_ioctl.h
xfs_iomap.c xfs: report block map corruption errors to the health tracking system 2024-02-22 12:31:51 -08:00
xfs_iomap.h
xfs_iops.c xfs: hide private inodes from bulkstat and handle functions 2024-04-15 14:58:48 -07:00
xfs_iops.h xfs: declare xfs_file.c symbols in xfs_file.h 2024-04-15 14:54:09 -07:00
xfs_itable.c xfs: hide private inodes from bulkstat and handle functions 2024-04-15 14:58:48 -07:00
xfs_itable.h
xfs_iunlink_item.c
xfs_iunlink_item.h
xfs_iwalk.c xfs: pass xfs_buf lookup flags to xfs_*read_agi 2024-04-15 14:54:03 -07:00
xfs_iwalk.h
xfs_linux.h xfs: refactor non-power-of-two alignment checks 2024-04-15 14:54:12 -07:00
xfs_log_cil.c xfs: use kvfree() in xlog_cil_free_logvec() 2024-02-28 14:04:45 +05:30
xfs_log_priv.h xfs: only clear log incompat flags at clean unmount 2024-04-15 14:54:06 -07:00
xfs_log_recover.c xfs: capture inode generation numbers in the ondisk exchmaps log item 2024-04-15 14:54:24 -07:00
xfs_log.c xfs: only clear log incompat flags at clean unmount 2024-04-15 14:54:06 -07:00
xfs_log.h xfs: only clear log incompat flags at clean unmount 2024-04-15 14:54:06 -07:00
xfs_message.c
xfs_message.h
xfs_mount.c xfs: only clear log incompat flags at clean unmount 2024-04-15 14:54:06 -07:00
xfs_mount.h xfs: create a incompat flag for atomic file mapping exchanges 2024-04-15 14:54:15 -07:00
xfs_mru_cache.c xfs: use GFP_KERNEL in pure transaction contexts 2024-02-13 18:07:35 +05:30
xfs_mru_cache.h
xfs_notify_failure.c mm, pmem, xfs: Introduce MF_MEM_PRE_REMOVE for unbind 2023-12-07 14:34:26 +05:30
xfs_pnfs.c
xfs_pnfs.h
xfs_pwork.c
xfs_pwork.h
xfs_qm_bhv.c xfs: track quota updates during live quotacheck 2024-02-22 12:30:55 -08:00
xfs_qm_syscalls.c
xfs_qm.c xfs: report quota block corruption errors to the health system 2024-02-22 12:32:44 -08:00
xfs_qm.h xfs: track quota updates during live quotacheck 2024-02-22 12:30:55 -08:00
xfs_quota.h xfs: track quota updates during live quotacheck 2024-02-22 12:30:55 -08:00
xfs_quotaops.c
xfs_refcount_item.c xfs: place intent recovery under NOFS allocation context 2024-02-13 18:07:35 +05:30
xfs_refcount_item.h
xfs_reflink.c xfs: support deferred bmap updates on the attr fork 2024-02-22 12:44:32 -08:00
xfs_reflink.h
xfs_rmap_item.c xfs: place intent recovery under NOFS allocation context 2024-02-13 18:07:35 +05:30
xfs_rmap_item.h
xfs_rtalloc.c xfs: report realtime metadata corruption errors to the health system 2024-02-22 12:32:44 -08:00
xfs_rtalloc.h xfs: move xfs_bmap_rtalloc to xfs_rtalloc.c 2023-12-22 11:18:11 +05:30
xfs_stats.c xfs: define an in-memory btree for storing refcount bag info during repairs 2024-02-22 12:43:40 -08:00
xfs_stats.h xfs: define an in-memory btree for storing refcount bag info during repairs 2024-02-22 12:43:40 -08:00
xfs_super.c xfs: introduce a file mapping exchange log intent item 2024-04-15 14:54:16 -07:00
xfs_super.h xfs: create scaffolding for creating debugfs entries 2023-08-10 07:48:07 -07:00
xfs_symlink.c xfs: pass the owner to xfs_symlink_write_target 2024-04-15 14:58:57 -07:00
xfs_symlink.h xfs: move remote symlink target read function to libxfs 2024-02-22 12:45:17 -08:00
xfs_sysctl.c fs: Remove the now superfluous sentinel elements from ctl_table array 2023-12-28 04:57:57 -08:00
xfs_sysctl.h xfs: add debug knobs to control btree bulk load slack factors 2023-12-15 10:03:28 -08:00
xfs_sysfs.c xfs: remove duplicate ifdefs 2024-02-17 09:32:32 +05:30
xfs_sysfs.h
xfs_trace.c xfs: bind together the front and back ends of the file range exchange code 2024-04-15 14:54:18 -07:00
xfs_trace.h xfs: repair extended attributes 2024-04-15 14:58:53 -07:00
xfs_trans_ail.c xfs: convert remaining kmem_free() to kfree() 2024-02-13 18:07:34 +05:30
xfs_trans_buf.c xfs: launder in-memory btree buffers before transaction commit 2024-02-22 12:43:36 -08:00
xfs_trans_dquot.c xfs: track quota updates during live quotacheck 2024-02-22 12:30:55 -08:00
xfs_trans_priv.h
xfs_trans.c xfs: Replace xfs_isilocked with xfs_assert_ilocked 2024-02-19 21:19:33 +05:30
xfs_trans.h xfs: don't use current->journal_info 2024-03-25 10:21:01 +05:30
xfs_xattr.c xfs: only clear log incompat flags at clean unmount 2024-04-15 14:54:06 -07:00
xfs_xattr.h xfs: move xfs_xattr_handlers to .rodata 2023-10-10 13:49:20 +02:00
xfs.h