xfs: fix potential AGI <-> ILOCK ABBA deadlock in xrep_dinode_findmode_walk_directory

xfs/399 found the following deadlock when fuzzing core.mode = ones:

/proc/20506/task/20558/stack :
[<0>] xfs_ilock+0xa0/0x240 [xfs]
[<0>] xfs_ilock_data_map_shared+0x1b/0x20 [xfs]
[<0>] xrep_dinode_findmode_walk_directory+0x69/0xe0 [xfs]
[<0>] xrep_dinode_find_mode+0x103/0x2a0 [xfs]
[<0>] xrep_dinode_mode+0x7c/0x120 [xfs]
[<0>] xrep_dinode_core+0xed/0x2b0 [xfs]
[<0>] xrep_dinode_problems+0x10/0x80 [xfs]
[<0>] xrep_inode+0x6c/0xc0 [xfs]
[<0>] xrep_attempt+0x64/0x1d0 [xfs]
[<0>] xfs_scrub_metadata+0x365/0x840 [xfs]
[<0>] xfs_scrubv_metadata+0x282/0x430 [xfs]
[<0>] xfs_ioc_scrubv_metadata+0x149/0x1a0 [xfs]
[<0>] xfs_file_ioctl+0xc68/0x1780 [xfs]
/proc/20506/task/20559/stack :
[<0>] xfs_buf_lock+0x3b/0x110 [xfs]
[<0>] xfs_buf_find_lock+0x66/0x1c0 [xfs]
[<0>] xfs_buf_get_map+0x208/0xc00 [xfs]
[<0>] xfs_buf_read_map+0x5d/0x2c0 [xfs]
[<0>] xfs_trans_read_buf_map+0x1b0/0x4c0 [xfs]
[<0>] xfs_read_agi+0xbd/0x190 [xfs]
[<0>] xfs_ialloc_read_agi+0x47/0x160 [xfs]
[<0>] xfs_imap_lookup+0x69/0x1f0 [xfs]
[<0>] xfs_imap+0x1fc/0x3d0 [xfs]
[<0>] xfs_iget+0x357/0xd50 [xfs]
[<0>] xchk_dir_actor+0x16e/0x330 [xfs]
[<0>] xchk_dir_walk_block+0x164/0x1e0 [xfs]
[<0>] xchk_dir_walk+0x13a/0x190 [xfs]
[<0>] xchk_directory+0x1a2/0x2b0 [xfs]
[<0>] xfs_scrub_metadata+0x2f4/0x840 [xfs]
[<0>] xfs_scrubv_metadata+0x282/0x430 [xfs]
[<0>] xfs_ioc_scrubv_metadata+0x149/0x1a0 [xfs]
[<0>] xfs_file_ioctl+0xc68/0x1780 [xfs]

Thread 20558 holds an AGI buffer and is trying to grab the ILOCK of the
root directory.  Thread 20559 holds the root directory ILOCK and is
trying to grab the AGI of an inode that is one of the root directory's
children.  The AGI held by 20558 is the same buffer that 20559 is trying
to acquire.  In other words, this is an ABBA deadlock.

In general, the lock order is ILOCK and then AGI -- rename does this
while preparing for an operation involving whiteouts or renaming files
out of existence; and unlink does this when moving an inode to the
unlinked list.  The only place where we do it in the opposite order is
on the child during an icreate, but at that point the child is marked
INEW and is not visible to other threads.

Work around this deadlock by replacing the blocking ilock attempt with a
nonblocking loop that aborts after 30 seconds.  Relax for a jiffy after
a failed lock attempt.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
This commit is contained in:
Darrick J. Wong 2024-04-15 14:54:05 -07:00
parent 2afd5276d3
commit 21ad2d0364

View File

@ -282,6 +282,51 @@ xrep_dinode_findmode_dirent(
return 0;
}
/* Try to lock a directory, or wait a jiffy. */
static inline int
xrep_dinode_ilock_nowait(
struct xfs_inode *dp,
unsigned int lock_mode)
{
if (xfs_ilock_nowait(dp, lock_mode))
return true;
schedule_timeout_killable(1);
return false;
}
/*
* Try to lock a directory to look for ftype hints. Since we already hold the
* AGI buffer, we cannot block waiting for the ILOCK because rename can take
* the ILOCK and then try to lock AGIs.
*/
STATIC int
xrep_dinode_trylock_directory(
struct xrep_inode *ri,
struct xfs_inode *dp,
unsigned int *lock_modep)
{
unsigned long deadline = jiffies + msecs_to_jiffies(30000);
unsigned int lock_mode;
int error = 0;
do {
if (xchk_should_terminate(ri->sc, &error))
return error;
if (xfs_need_iread_extents(&dp->i_df))
lock_mode = XFS_ILOCK_EXCL;
else
lock_mode = XFS_ILOCK_SHARED;
if (xrep_dinode_ilock_nowait(dp, lock_mode)) {
*lock_modep = lock_mode;
return 0;
}
} while (!time_is_before_jiffies(deadline));
return -EBUSY;
}
/*
* If this is a directory, walk the dirents looking for any that point to the
* scrub target inode.
@ -299,7 +344,9 @@ xrep_dinode_findmode_walk_directory(
* Scan the directory to see if there it contains an entry pointing to
* the directory that we are repairing.
*/
lock_mode = xfs_ilock_data_map_shared(dp);
error = xrep_dinode_trylock_directory(ri, dp, &lock_mode);
if (error)
return error;
/*
* If this directory is known to be sick, we cannot scan it reliably