linux/fs/xfs
Dave Chinner 0020a190cf xfs: AIL needs asynchronous CIL forcing
The AIL pushing is stalling on log forces when it comes across
pinned items. This is happening on removal workloads where the AIL
is dominated by stale items that are removed from AIL when the
checkpoint that marks the items stale is committed to the journal.
This results is relatively few items in the AIL, but those that are
are often pinned as directories items are being removed from are
still being logged.

As a result, many push cycles through the CIL will first issue a
blocking log force to unpin the items. This can take some time to
complete, with tracing regularly showing push delays of half a
second and sometimes up into the range of several seconds. Sequences
like this aren't uncommon:

....
 399.829437:  xfsaild: last lsn 0x11002dd000 count 101 stuck 101 flushing 0 tout 20
<wanted 20ms, got 270ms delay>
 400.099622:  xfsaild: target 0x11002f3600, prev 0x11002f3600, last lsn 0x0
 400.099623:  xfsaild: first lsn 0x11002f3600
 400.099679:  xfsaild: last lsn 0x1100305000 count 16 stuck 11 flushing 0 tout 50
<wanted 50ms, got 500ms delay>
 400.589348:  xfsaild: target 0x110032e600, prev 0x11002f3600, last lsn 0x0
 400.589349:  xfsaild: first lsn 0x1100305000
 400.589595:  xfsaild: last lsn 0x110032e600 count 156 stuck 101 flushing 30 tout 50
<wanted 50ms, got 460ms delay>
 400.950341:  xfsaild: target 0x1100353000, prev 0x110032e600, last lsn 0x0
 400.950343:  xfsaild: first lsn 0x1100317c00
 400.950436:  xfsaild: last lsn 0x110033d200 count 105 stuck 101 flushing 0 tout 20
<wanted 20ms, got 200ms delay>
 401.142333:  xfsaild: target 0x1100361600, prev 0x1100353000, last lsn 0x0
 401.142334:  xfsaild: first lsn 0x110032e600
 401.142535:  xfsaild: last lsn 0x1100353000 count 122 stuck 101 flushing 8 tout 10
<wanted 10ms, got 10ms delay>
 401.154323:  xfsaild: target 0x1100361600, prev 0x1100361600, last lsn 0x1100353000
 401.154328:  xfsaild: first lsn 0x1100353000
 401.154389:  xfsaild: last lsn 0x1100353000 count 101 stuck 101 flushing 0 tout 20
<wanted 20ms, got 300ms delay>
 401.451525:  xfsaild: target 0x1100361600, prev 0x1100361600, last lsn 0x0
 401.451526:  xfsaild: first lsn 0x1100353000
 401.451804:  xfsaild: last lsn 0x1100377200 count 170 stuck 22 flushing 122 tout 50
<wanted 50ms, got 500ms delay>
 401.933581:  xfsaild: target 0x1100361600, prev 0x1100361600, last lsn 0x0
....

In each of these cases, every AIL pass saw 101 log items stuck on
the AIL (pinned) with very few other items being found. Each pass, a
log force was issued, and delay between last/first is the sleep time
+ the sync log force time.

Some of these 101 items pinned the tail of the log. The tail of the
log does slowly creep forward (first lsn), but the problem is that
the log is actually out of reservation space because it's been
running so many transactions that stale items that never reach the
AIL but consume log space. Hence we have a largely empty AIL, with
long term pins on items that pin the tail of the log that don't get
pushed frequently enough to keep log space available.

The problem is the hundreds of milliseconds that we block in the log
force pushing the CIL out to disk. The AIL should not be stalled
like this - it needs to run and flush items that are at the tail of
the log with minimal latency. What we really need to do is trigger a
log flush, but then not wait for it at all - we've already done our
waiting for stuff to complete when we backed off prior to the log
force being issued.

Even if we remove the XFS_LOG_SYNC from the xfs_log_force() call, we
still do a blocking flush of the CIL and that is what is causing the
issue. Hence we need a new interface for the CIL to trigger an
immediate background push of the CIL to get it moving faster but not
to wait on that to occur. While the CIL is pushing, the AIL can also
be pushing.

We already have an internal interface to do this -
xlog_cil_push_now() - but we need a wrapper for it to be used
externally. xlog_cil_force_seq() can easily be extended to do what
we need as it already implements the synchronous CIL push via
xlog_cil_push_now(). Add the necessary flags and "push current
sequence" semantics to xlog_cil_force_seq() and convert the AIL
pushing to use it.

One of the complexities here is that the CIL push does not guarantee
that the commit record for the CIL checkpoint is written to disk.
The current log force ensures this by submitting the current ACTIVE
iclog that the commit record was written to. We need the CIL to
actually write this commit record to disk for an async push to
ensure that the checkpoint actually makes it to disk and unpins the
pinned items in the checkpoint on completion. Hence we need to pass
down to the CIL push that we are doing an async flush so that it can
switch out the commit_iclog if necessary to get written to disk when
the commit iclog is finally released.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-08-16 12:09:30 -07:00
..
libxfs xfs: Rename __xfs_attr_rmtval_remove 2021-08-11 09:12:45 -07:00
scrub xfs: replace kmem_alloc_large() with kvmalloc() 2021-08-09 15:57:43 -07:00
Kconfig xfs: fix Kconfig asking about XFS_SUPPORT_V4 when XFS_FS=n 2020-10-16 15:34:28 -07:00
kmem.c xfs: replace kmem_alloc_large() with kvmalloc() 2021-08-09 15:57:43 -07:00
kmem.h xfs: replace kmem_alloc_large() with kvmalloc() 2021-08-09 15:57:43 -07:00
Makefile
mrlock.h
xfs_acl.c xfs: support idmapped mounts 2021-01-24 14:43:46 +01:00
xfs_acl.h fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
xfs_aops.c fs: remove noop_set_page_dirty() 2021-06-29 10:53:48 -07:00
xfs_aops.h
xfs_attr_inactive.c xfs: Add delay ready attr remove routines 2021-06-01 10:49:47 -07:00
xfs_attr_list.c xfs: rename and simplify xfs_bmap_one_block 2021-04-15 09:35:50 -07:00
xfs_bio_io.c xfs: async blkdev cache flush 2021-06-21 10:05:51 -07:00
xfs_bmap_item.c xfs: refactor xfs_iget calls from log intent recovery 2021-08-09 15:57:59 -07:00
xfs_bmap_item.h
xfs_bmap_util.c New code for 5.14: 2021-07-02 14:30:27 -07:00
xfs_bmap_util.h
xfs_buf_item_recover.c xfs: prevent spoofing of rtbitmap blocks when recovering buffers 2021-07-29 09:27:29 -07:00
xfs_buf_item.c xfs: remove dead stale buf unpin handling code 2021-06-21 10:14:24 -07:00
xfs_buf_item.h xfs: move the buffer retry logic to xfs_buf.c 2020-09-15 20:52:38 -07:00
xfs_buf.c xfs: remove kmem_alloc_io() 2021-08-09 15:57:43 -07:00
xfs_buf.h xfs: remove kmem_alloc_io() 2021-08-09 15:57:43 -07:00
xfs_dir2_readdir.c xfs: remove XFS_IFINLINE 2021-04-15 09:35:51 -07:00
xfs_discard.c xfs: convert allocbt cursors to use perags 2021-06-02 10:48:24 +10:00
xfs_discard.h
xfs_dquot_item_recover.c xfs: rename the ondisk dquot d_flags to d_type 2020-07-28 20:24:14 -07:00
xfs_dquot_item.c xfs: remove support for disabling quota accounting on a mounted file system 2021-08-06 11:05:36 -07:00
xfs_dquot_item.h xfs: remove support for disabling quota accounting on a mounted file system 2021-08-06 11:05:36 -07:00
xfs_dquot.c xfs: remove the active vs running quota differentiation 2021-08-06 11:05:37 -07:00
xfs_dquot.h xfs: queue inactivation immediately when quota is nearing enforcement 2021-08-09 10:52:18 -07:00
xfs_error.c xfs: add error injection for per-AG resv failure 2021-03-25 16:47:53 -07:00
xfs_error.h
xfs_export.c xfs: Fix fall-through warnings for Clang 2021-05-26 14:51:26 -05:00
xfs_export.h
xfs_extent_busy.c xfs: pass perags through to the busy extent code 2021-06-02 10:48:24 +10:00
xfs_extent_busy.h xfs: pass perags through to the busy extent code 2021-06-02 10:48:24 +10:00
xfs_extfree_item.c xfs: dump log intent items that cannot be recovered due to corruption 2021-08-09 11:13:17 -07:00
xfs_extfree_item.h
xfs_file.c New code for 5.14: 2021-07-02 14:30:27 -07:00
xfs_filestream.c xfs: move xfs_perag_get/put to xfs_ag.[ch] 2021-06-02 10:48:24 +10:00
xfs_filestream.h xfs: move the di_flags field to struct xfs_inode 2021-04-07 14:37:05 -07:00
xfs_fsmap.c xfs: remove agno from btree cursor 2021-06-02 10:48:24 +10:00
xfs_fsmap.h xfs: fix deadlock and streamline xfs_getfsmap performance 2020-10-07 08:40:29 -07:00
xfs_fsops.c xfs: make forced shutdown processing atomic 2021-08-16 12:09:28 -07:00
xfs_fsops.h xfs: get rid of xfs_growfs_{data,log}_t 2021-02-03 09:18:50 -08:00
xfs_globals.c xfs: consolidate the eofblocks and cowblocks workers 2021-02-03 09:18:49 -08:00
xfs_health.c xfs: drop IDONTCACHE on inodes when we mark them sick 2021-06-08 09:30:20 -07:00
xfs_icache.c xfs: throttle inode inactivation queuing on memory reclaim 2021-08-09 11:13:17 -07:00
xfs_icache.h xfs: throttle inode inactivation queuing on memory reclaim 2021-08-09 11:13:17 -07:00
xfs_icreate_item.c xfs: cleanup __FUNCTION__ usage 2021-08-11 09:13:12 -07:00
xfs_icreate_item.h
xfs_inode_item_recover.c xfs: logging the on disk inode LSN can make it go backwards 2021-07-29 09:27:29 -07:00
xfs_inode_item.c xfs: xfs_log_force_lsn isn't passed a LSN 2021-06-21 10:12:33 -07:00
xfs_inode_item.h xfs: xfs_log_force_lsn isn't passed a LSN 2021-06-21 10:12:33 -07:00
xfs_inode.c xfs: detach dquots from inode if we don't need to inactivate it 2021-08-06 11:05:39 -07:00
xfs_inode.h xfs: per-cpu deferred inode inactivation queues 2021-08-06 11:05:39 -07:00
xfs_ioctl32.c xfs: convert to fileattr 2021-04-12 15:04:29 +02:00
xfs_ioctl32.h xfs: convert to fileattr 2021-04-12 15:04:29 +02:00
xfs_ioctl.c xfs: remove the active vs running quota differentiation 2021-08-06 11:05:37 -07:00
xfs_ioctl.h xfs: convert to fileattr 2021-04-12 15:04:29 +02:00
xfs_iomap.c xfs: Fix fall-through warnings for Clang 2021-05-26 14:51:26 -05:00
xfs_iomap.h
xfs_iops.c xfs: remove the active vs running quota differentiation 2021-08-06 11:05:37 -07:00
xfs_iops.h xfs: support idmapped mounts 2021-01-24 14:43:46 +01:00
xfs_itable.c xfs: avoid buffer deadlocks when walking fs inodes 2021-08-09 11:13:16 -07:00
xfs_itable.h xfs: support idmapped mounts 2021-01-24 14:43:46 +01:00
xfs_iwalk.c xfs: avoid buffer deadlocks when walking fs inodes 2021-08-09 11:13:16 -07:00
xfs_iwalk.h
xfs_linux.h xfs: async blkdev cache flush 2021-06-21 10:05:51 -07:00
xfs_log_cil.c xfs: AIL needs asynchronous CIL forcing 2021-08-16 12:09:30 -07:00
xfs_log_priv.h xfs: AIL needs asynchronous CIL forcing 2021-08-16 12:09:30 -07:00
xfs_log_recover.c xfs: convert log flags to an operational state field 2021-08-16 12:09:28 -07:00
xfs_log.c xfs: AIL needs asynchronous CIL forcing 2021-08-16 12:09:30 -07:00
xfs_log.h xfs: AIL needs asynchronous CIL forcing 2021-08-16 12:09:30 -07:00
xfs_message.c
xfs_message.h once: implement DO_ONCE_LITE for non-fast-path "do once" functionality 2021-06-28 15:54:57 -07:00
xfs_mount.c xfs: allow setting and clearing of log incompat feature flags 2021-08-09 15:57:59 -07:00
xfs_mount.h xfs: allow setting and clearing of log incompat feature flags 2021-08-09 15:57:59 -07:00
xfs_mru_cache.c xfs: set WQ_SYSFS on all workqueues in debug mode 2021-02-03 09:18:49 -08:00
xfs_mru_cache.h
xfs_ondisk.h xfs: rename struct xfs_legacy_ictimestamp 2021-04-22 18:29:25 -07:00
xfs_pnfs.c xfs: move the di_size field to struct xfs_inode 2021-04-07 14:37:03 -07:00
xfs_pnfs.h
xfs_pwork.c xfs: increase the default parallelism levels of pwork clients 2021-02-03 09:18:49 -08:00
xfs_pwork.h xfs: increase the default parallelism levels of pwork clients 2021-02-03 09:18:49 -08:00
xfs_qm_bhv.c xfs: move the di_projid field to struct xfs_inode 2021-04-07 14:37:03 -07:00
xfs_qm_syscalls.c xfs: flush inode inactivation work when compiling usage statistics 2021-08-09 10:52:18 -07:00
xfs_qm.c xfs: queue inactivation immediately when quota is nearing enforcement 2021-08-09 10:52:18 -07:00
xfs_qm.h xfs: remove support for disabling quota accounting on a mounted file system 2021-08-06 11:05:36 -07:00
xfs_quota.h xfs: queue inactivation immediately when quota is nearing enforcement 2021-08-09 10:52:18 -07:00
xfs_quotaops.c xfs: remove the active vs running quota differentiation 2021-08-06 11:05:37 -07:00
xfs_refcount_item.c xfs: dump log intent items that cannot be recovered due to corruption 2021-08-09 11:13:17 -07:00
xfs_refcount_item.h
xfs_reflink.c xfs: convert refcount btree cursor to use perags 2021-06-02 10:48:24 +10:00
xfs_reflink.h xfs: move helpers that lock and unlock two inodes against userspace IO 2020-07-06 10:46:57 -07:00
xfs_rmap_item.c xfs: dump log intent items that cannot be recovered due to corruption 2021-08-09 11:13:17 -07:00
xfs_rmap_item.h
xfs_rtalloc.c xfs: fix an integer overflow error in xfs_growfs_rt 2021-07-15 09:58:42 -07:00
xfs_rtalloc.h xfs: remove xfs_buf_t typedef 2020-12-16 16:07:34 -08:00
xfs_stats.c xfs: periodically relog deferred intent items 2020-10-07 08:40:28 -07:00
xfs_stats.h xfs: periodically relog deferred intent items 2020-10-07 08:40:28 -07:00
xfs_super.c xfs: convert log flags to an operational state field 2021-08-16 12:09:28 -07:00
xfs_super.h xfs: remove xfs_blkdev_issue_flush 2021-06-21 10:05:46 -07:00
xfs_symlink.c xfs: get rid of xfs_dir_ialloc() 2021-06-02 10:48:24 +10:00
xfs_symlink.h xfs: support idmapped mounts 2021-01-24 14:43:46 +01:00
xfs_sysctl.c xfs: restore speculative_cow_prealloc_lifetime sysctl 2021-02-24 10:16:08 -08:00
xfs_sysctl.h xfs: consolidate the eofblocks and cowblocks workers 2021-02-03 09:18:49 -08:00
xfs_sysfs.c xfs: AIL needs asynchronous CIL forcing 2021-08-16 12:09:30 -07:00
xfs_sysfs.h xfs: Fix UBSAN null-ptr-deref in xfs_sysfs_init 2020-08-07 11:50:17 -07:00
xfs_trace.c xfs: AIL needs asynchronous CIL forcing 2021-08-16 12:09:30 -07:00
xfs_trace.h xfs: XLOG_STATE_IOERROR must die 2021-08-16 12:09:27 -07:00
xfs_trans_ail.c xfs: AIL needs asynchronous CIL forcing 2021-08-16 12:09:30 -07:00
xfs_trans_buf.c xfs: Fix fall-through warnings for Clang 2021-05-26 14:51:26 -05:00
xfs_trans_dquot.c xfs: remove the active vs running quota differentiation 2021-08-06 11:05:37 -07:00
xfs_trans_priv.h
xfs_trans.c xfs: AIL needs asynchronous CIL forcing 2021-08-16 12:09:30 -07:00
xfs_trans.h xfs: xfs_log_force_lsn isn't passed a LSN 2021-06-21 10:12:33 -07:00
xfs_xattr.c xfs: prevent metadata files from being inactivated 2021-03-25 16:47:50 -07:00
xfs.h