linux/fs/xfs/libxfs
Dave Chinner 0fe0bbe00a xfs: bunmapi has unnecessary AG lock ordering issues
Large directory block size operations are assert failing because
xfs_bunmapi() is not completely removing fragmented directory blocks,
like so:

XFS: Assertion failed: done, file: fs/xfs/libxfs/xfs_dir2.c, line: 677
....
Call Trace:
 xfs_dir2_shrink_inode+0x1a8/0x210
 xfs_dir2_block_to_sf+0x2ae/0x410
 xfs_dir2_block_removename+0x21a/0x280
 xfs_dir_removename+0x195/0x1d0
 xfs_rename+0xb79/0xc50
 ? avc_has_perm+0x8d/0x1a0
 ? avc_has_perm_noaudit+0x9a/0x120
 xfs_vn_rename+0xdb/0x150
 vfs_rename+0x719/0xb50
 ? __lookup_hash+0x6a/0xa0
 do_renameat2+0x413/0x5e0
 __x64_sys_rename+0x45/0x50
 do_syscall_64+0x3a/0x70
 entry_SYSCALL_64_after_hwframe+0x44/0xae

We are aborting the bunmapi() pass because of this specific chunk of
code:

                /*
                 * Make sure we don't touch multiple AGF headers out of order
                 * in a single transaction, as that could cause AB-BA deadlocks.
                 */
                if (!wasdel && !isrt) {
                        agno = XFS_FSB_TO_AGNO(mp, del.br_startblock);
                        if (prev_agno != NULLAGNUMBER && prev_agno > agno)
                                break;
                        prev_agno = agno;
                }

This is designed to prevent deadlocks in AGF locking when freeing
multiple extents by ensuring that we only ever lock in increasing
AG number order. Unfortunately, this also violates the "bunmapi will
always succeed" semantic that some high level callers depend on,
such as xfs_dir2_shrink_inode(), xfs_da_shrink_inode() and
xfs_inactive_symlink_rmt().
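
As a rough illustration of why the check above leaves work undone, here is
a minimal userspace sketch. It is not the kernel code: toy_bunmapi() and
its flat array of AG numbers are hypothetical stand-ins for the extent
walk, and only the ordering test itself is taken from the snippet above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NULLAGNUMBER	((uint32_t)-1)	/* "no AG seen yet" sentinel */

/*
 * Toy model of the unmap loop (hypothetical helper, not a kernel API).
 * ags[i] is the AG number of the i-th extent in the order the loop
 * visits it. Returns how many extents were unmapped and sets *done to
 * 1 only if every extent was removed.
 */
static size_t toy_bunmapi(const uint32_t *ags, size_t n, int *done)
{
	uint32_t prev_agno = NULLAGNUMBER;
	size_t i;

	assert(done != NULL);

	for (i = 0; i < n; i++) {
		uint32_t agno = ags[i];

		/* Same test as the removed code: never go back to a lower AG. */
		if (prev_agno != NULLAGNUMBER && prev_agno > agno)
			break;
		prev_agno = agno;
	}

	*done = (i == n);
	return i;
}
```

With AG numbers visited in ascending order the loop finishes and done is
set. With a fragmented block whose extents are visited as AG 3, then AG 1,
the loop stops after the first extent and done stays clear, which is the
condition the assert in the trace above trips over.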

This AG lock ordering was introduced back in 2017 to fix deadlocks
triggered by generic/299 as reported here:

https://lore.kernel.org/linux-xfs/800468eb-3ded-9166-20a4-047de8018582@gmail.com/

This code is old enough that it predates the deferral of all AG based
extent freeing from within xfs_bunmapi(). That is, we never actually
lock AGs in xfs_bunmapi() any more - every non-rt based extent free is
added to the defer ops list, as is all BMBT block freeing. And RT
extents are not AG based, so there are no lock ordering issues
associated with them.
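
The effect of deferring the frees can be sketched in the same toy style.
Everything here is hypothetical (struct toy_defer, toy_defer_add() and
toy_defer_finish() are not kernel interfaces), and sorting the queued
items by AG number before processing is an assumption of this sketch
rather than something stated above. The point is only that the unmap
phase records extents without locking anything, so the order in which it
visits AGs no longer matters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_MAX_DEFERRED	16

/* Hypothetical stand-in for a list of deferred extent frees. */
struct toy_defer {
	uint32_t	agno[TOY_MAX_DEFERRED];
	size_t		count;
};

/* Unmap phase: just remember which AG the extent lives in, no locking. */
static void toy_defer_add(struct toy_defer *d, uint32_t agno)
{
	assert(d->count < TOY_MAX_DEFERRED);
	d->agno[d->count++] = agno;
}

static int toy_cmp_agno(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/*
 * Finish phase: sort the queued frees by AG number (assumption of this
 * sketch), then walk them. Returns 1 if the walk would take AG locks in
 * ascending order, 0 otherwise.
 */
static int toy_defer_finish(struct toy_defer *d)
{
	size_t i;

	qsort(d->agno, d->count, sizeof(d->agno[0]), toy_cmp_agno);

	for (i = 1; i < d->count; i++)
		if (d->agno[i - 1] > d->agno[i])
			return 0;
	return 1;
}
```

Queuing extents from AG 3, AG 1 and AG 2 in that order and then calling
toy_defer_finish() processes them as 1, 2, 3, so nothing is left behind
and no ordering check is needed in the unmap loop itself.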

Hence this AGF lock ordering code is both broken and dead. Let's
just remove it so that the large directory block code works reliably
again.

Tested against xfs/538 and generic/299, the latter being the original
test that exposed the deadlocks that this code fixed.

Fixes: 5b094d6dac ("xfs: fix multi-AG deadlock in xfs_bunmapi")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-05-27 08:11:24 -07:00
..
xfs_ag_resv.c xfs: check free AG space when making per-AG reservations 2021-05-24 18:01:04 -07:00
xfs_ag_resv.h xfs: get rid of unnecessary xfs_perag_{get,put} pairs 2020-07-14 08:47:33 -07:00
xfs_ag.c xfs: introduce xfs_ag_shrink_space() 2021-03-25 16:47:52 -07:00
xfs_ag.h xfs: introduce xfs_ag_shrink_space() 2021-03-25 16:47:52 -07:00
xfs_alloc_btree.c xfs: introduce in-core global counter of allocbt blocks 2021-04-29 07:45:44 -07:00
xfs_alloc_btree.h xfs: Use the correct style for SPDX License Identifier 2020-05-13 15:32:45 -07:00
xfs_alloc.c xfs: introduce in-core global counter of allocbt blocks 2021-04-29 07:45:44 -07:00
xfs_alloc.h xfs: Introduce error injection to allocate only minlen size extents for files 2021-01-22 16:54:49 -08:00
xfs_attr_leaf.c xfs: remove XFS_IFEXTENTS 2021-04-15 09:35:51 -07:00
xfs_attr_leaf.h xfs: Add xfs_has_attr and subroutines 2020-07-28 20:24:14 -07:00
xfs_attr_remote.c xfs: remove the redundant crc feature check in xfs_attr3_rmt_verify 2020-09-25 11:34:07 -07:00
xfs_attr_remote.h xfs: Refactor xfs_attr_rmtval_remove 2020-07-28 20:28:11 -07:00
xfs_attr_sf.h xfs: Convert xfs_attr_sf macros to inline functions 2020-09-15 20:52:42 -07:00
xfs_attr.c xfs: remove XFS_IFINLINE 2021-04-15 09:35:51 -07:00
xfs_attr.h xfs: rename and simplify xfs_bmap_one_block 2021-04-15 09:35:50 -07:00
xfs_bit.c xfs: fix missing header includes 2019-11-07 13:00:53 -08:00
xfs_bit.h xfs: Use the correct style for SPDX License Identifier 2020-05-13 15:32:45 -07:00
xfs_bmap_btree.c xfs: move the di_flags field to struct xfs_inode 2021-04-07 14:37:05 -07:00
xfs_bmap_btree.h xfs: Use the correct style for SPDX License Identifier 2020-05-13 15:32:45 -07:00
xfs_bmap.c xfs: bunmapi has unnecessary AG lock ordering issues 2021-05-27 08:11:24 -07:00
xfs_bmap.h xfs: rename and simplify xfs_bmap_one_block 2021-04-15 09:35:50 -07:00
xfs_btree_staging.c xfs: remove XFS_IFBROOT 2021-04-15 09:35:51 -07:00
xfs_btree_staging.h xfs: xfs_btree_staging.h: delete duplicated words 2020-07-28 20:24:14 -07:00
xfs_btree.c xfs: use current->journal_info for detecting transaction recursion 2021-02-25 08:07:04 -08:00
xfs_btree.h xfs: Use the correct style for SPDX License Identifier 2020-05-13 15:32:45 -07:00
xfs_cksum.h
xfs_da_btree.c xfs: move the di_nblocks field to struct xfs_inode 2021-04-07 14:37:03 -07:00
xfs_da_btree.h xfs: Refactor xfs_da_state_alloc() helper 2020-07-28 20:24:14 -07:00
xfs_da_format.h xfs: code cleanup in xfs_attr_leaf_entsize_{remote,local} 2020-09-25 11:34:08 -07:00
xfs_defer.c xfs: only relog deferred intent items if free space in the log gets low 2020-10-07 08:40:29 -07:00
xfs_defer.h xfs: fix an incore inode UAF in xfs_bui_recover 2020-10-07 08:40:28 -07:00
xfs_dir2_block.c xfs: remove XFS_IFINLINE 2021-04-15 09:35:51 -07:00
xfs_dir2_data.c xfs: No need for inode number error injection in __xfs_dir3_data_check 2021-03-25 16:47:51 -07:00
xfs_dir2_leaf.c xfs: move the di_size field to struct xfs_inode 2021-04-07 14:37:03 -07:00
xfs_dir2_node.c xfs: move the di_size field to struct xfs_inode 2021-04-07 14:37:03 -07:00
xfs_dir2_priv.h xfs: reduce debug overhead of dir leaf/node checks 2021-03-25 16:47:51 -07:00
xfs_dir2_sf.c xfs: remove XFS_IFEXTENTS 2021-04-15 09:35:51 -07:00
xfs_dir2.c xfs: move the di_size field to struct xfs_inode 2021-04-07 14:37:03 -07:00
xfs_dir2.h xfs: fix an ABBA deadlock in xfs_rename 2021-01-22 16:54:43 -08:00
xfs_dquot_buf.c xfs: widen ondisk quota expiration timestamps to handle y2038+ 2020-09-15 20:52:41 -07:00
xfs_errortag.h xfs: add error injection for per-AG resv failure 2021-03-25 16:47:53 -07:00
xfs_format.h xfs: move the di_crtime field to struct xfs_inode 2021-04-07 14:37:05 -07:00
xfs_fs.h xfs: restore old ioctl definitions 2021-05-20 08:31:22 -07:00
xfs_health.h xfs: Use the correct style for SPDX License Identifier 2020-05-13 15:32:45 -07:00
xfs_ialloc_btree.c xfs: remove unneeded return value check for *init_cursor() 2020-12-09 09:49:38 -08:00
xfs_ialloc_btree.h xfs: add support for inode btree staging cursors 2020-03-18 08:12:23 -07:00
xfs_ialloc.c xfs: validate ag btree levels using the precomputed values 2021-03-25 16:47:50 -07:00
xfs_ialloc.h xfs: spilt xfs_dialloc() into 2 functions 2020-12-12 10:48:25 -08:00
xfs_iext_tree.c xfs: prevent metadata files from being inactivated 2021-03-25 16:47:50 -07:00
xfs_inode_buf.c xfs: validate extsz hints against rt extent size when rtinherit is set 2021-05-24 18:01:04 -07:00
xfs_inode_buf.h xfs: move the di_crtime field to struct xfs_inode 2021-04-07 14:37:05 -07:00
xfs_inode_fork.c xfs: remove XFS_IFEXTENTS 2021-04-15 09:35:51 -07:00
xfs_inode_fork.h xfs: remove XFS_IFEXTENTS 2021-04-15 09:35:51 -07:00
xfs_log_format.h xfs: rename struct xfs_legacy_ictimestamp 2021-04-22 18:29:25 -07:00
xfs_log_recover.h xfs: remove xlog_recover_iodone 2020-09-15 20:52:39 -07:00
xfs_log_rlimit.c xfs: remove unused header files 2019-06-28 19:30:43 -07:00
xfs_quota_defs.h xfs: widen ondisk quota expiration timestamps to handle y2038+ 2020-09-15 20:52:41 -07:00
xfs_refcount_btree.c xfs: Remove kmem_zone_zalloc() usage 2020-07-28 20:24:14 -07:00
xfs_refcount_btree.h xfs: add support for refcount btree staging cursors 2020-03-18 08:12:23 -07:00
xfs_refcount.c xfs: remove unneeded return value check for *init_cursor() 2020-12-09 09:49:38 -08:00
xfs_refcount.h xfs: remove unnecessary int returns from deferred refcount functions 2019-08-28 08:31:02 -07:00
xfs_rmap_btree.c xfs: remove obsolete AGF counter debugging 2021-04-29 07:44:18 -07:00
xfs_rmap_btree.h xfs: add support for rmap btree staging cursors 2020-03-18 08:12:23 -07:00
xfs_rmap.c xfs: remove unneeded return value check for *init_cursor() 2020-12-09 09:49:38 -08:00
xfs_rmap.h xfs: reinitialize rm_flags when unpacking an offset into an rmap irec 2019-08-28 08:31:02 -07:00
xfs_rtbitmap.c xfs: move the di_flags field to struct xfs_inode 2021-04-07 14:37:05 -07:00
xfs_sb.c xfs: update superblock counters correctly for !lazysbcount 2021-04-29 07:44:18 -07:00
xfs_sb.h xfs: introduce xfs_validate_stripe_geometry() 2020-12-09 09:49:38 -08:00
xfs_shared.h xfs: precalculate default inode attribute offset 2021-04-07 14:37:07 -07:00
xfs_symlink_remote.c xfs: move the fork format fields into struct xfs_ifork 2020-05-19 09:40:58 -07:00
xfs_trans_inode.c xfs: validate extsz hints against rt extent size when rtinherit is set 2021-05-24 18:01:04 -07:00
xfs_trans_resv.c xfs: add a new xfs_sb_version_has_v3inode helper 2020-03-19 08:47:34 -07:00
xfs_trans_resv.h xfs: convert to SPDX license tags 2018-06-06 14:17:53 -07:00
xfs_trans_space.h xfs: fix off-by-one in inode alloc block reservation calculation 2020-08-26 14:13:21 -07:00
xfs_types.c xfs: type verification is expensive 2021-03-25 16:47:51 -07:00
xfs_types.h xfs: refactor file range validation 2020-12-09 09:49:38 -08:00