linux

Filipe Manana e2e58d0f8d btrfs: try to unlock parent nodes earlier when inserting a key

When inserting a new key, we release the write lock on the leaf's parent only after doing the binary search on the leaf. This is because if the key ends up at slot 0, we will have to update the key at slot 0 of the parent node. The same reasoning applies to any other upper level nodes when their slot is 0. We also need to keep the parent locked in case the leaf does not have enough free space to insert the new key/item, because in that case we will split the leaf and we will need to add a new key to the parent due to a new leaf resulting from the split operation.

However, if the leaf has enough space for the new key and the key does not end up at slot 0 of the leaf, we could release our write lock on the parent before doing the binary search on the leaf to figure out the destination slot. That reduces the amount of time other tasks are blocked waiting to lock the parent, therefore increasing parallelism when there are other tasks that are trying to access other leaves accessible through the same parent. This also applies to other upper nodes besides the immediate parent, when their slot is 0, since we keep locks on them until we figure out if the leaf slot is slot 0 or not.

In fact, having the key end up at slot 0 is rare. Typically it only happens when the key is less than or equal to the smallest, the "left most", key of the entire btree, during a split attempt when we try to push to the right sibling leaf, or when the caller just wants to update the item of an existing key. It's also very common that a leaf has enough space to insert a new key, since after a split we move about half of the keys from one leaf into the new leaf.

So unlock the parent, and any other upper level nodes, when during a key insertion we notice the key is greater than the first key in the leaf and the leaf has enough free space. After unlocking the upper level nodes, do the binary search using a low boundary of slot 1 and not slot 0, to figure out the slot where the key will be inserted (or where the key already is in case it exists and the caller wants to modify its item data).

This extra comparison, with the first key, is cheap and the key is very likely already in a cache line because it immediately follows the header of the extent buffer and we have recently read the level field of the header (which in fact is the last field of the header).

The following fs_mark test was run on a non-debug kernel (Debian's default kernel config), with a 12-core Intel CPU, and using an NVMe device:

  $ cat run-fsmark.sh
  #!/bin/bash

  DEV=/dev/nvme0n1
  MNT=/mnt/nvme0n1
  MOUNT_OPTIONS="-o ssd"
  MKFS_OPTIONS="-O no-holes -R free-space-tree"
  FILES=100000
  THREADS=$(nproc --all)
  FILE_SIZE=0

  echo "performance" | \
      tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

  mkfs.btrfs -f $MKFS_OPTIONS $DEV
  mount $MOUNT_OPTIONS $DEV $MNT

  OPTS="-S 0 -L 10 -n $FILES -s $FILE_SIZE -t $THREADS -k"
  for ((i = 1; i <= $THREADS; i++)); do
      OPTS="$OPTS -d $MNT/d$i"
  done

  fs_mark $OPTS

  umount $MNT

Before this change:

  FSUse%        Count         Size    Files/sec     App Overhead
       0      1200000            0     165273.6          5958381
       0      2400000            0     190938.3          6284477
       0      3600000            0     181429.1          6044059
       0      4800000            0     173979.2          6223418
       0      6000000            0     139288.0          6384560
       0      7200000            0     163000.4          6520083
       1      8400000            0      57799.2          5388544
       1      9600000            0      66461.6          5552969
       2     10800000            0      49593.5          5163675
       2     12000000            0      57672.1          4889398

After this change:

  FSUse%        Count         Size    Files/sec               App Overhead
       0      1200000            0     167987.3  (+1.6%)           6272730
       0      2400000            0     198563.9  (+4.0%)           6048847
       0      3600000            0     197436.6  (+8.8%)           6163637
       0      4800000            0     202880.7  (+16.6%)          6371771
       1      6000000            0     167275.9  (+20.1%)          6556733
       1      7200000            0     204051.2  (+25.2%)          6817091
       1      8400000            0      69622.8  (+20.5%)          5525675
       1      9600000            0      69384.5  (+4.4%)           5700723
       1     10800000            0      61454.1  (+23.9%)          5363754
       3     12000000            0      61908.7  (+7.3%)           5370196

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

2022-01-07 14:18:23 +01:00
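The idea in the commit message above can be illustrated with a toy model. The sketch below is not the kernel code: `toy_leaf`, `toy_parent`, `toy_bin_search` and `toy_find_slot` are hypothetical names invented for this illustration, keys are plain integers, and the parent's write lock is modeled as a boolean flag. It only shows the decision described in the message: if the leaf has free space and the key is greater than the leaf's first key, the parent can be released before the binary search, and the search can start at slot 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LEAF_CAPACITY 8

/* Hypothetical, simplified leaf: a sorted array of integer keys. */
struct toy_leaf {
    int keys[LEAF_CAPACITY];
    size_t nritems;
};

/* Hypothetical stand-in for the write lock held on the parent node. */
struct toy_parent {
    bool write_locked;
};

/*
 * Binary search over keys[low .. nritems). Returns the slot holding the key
 * if it exists, otherwise the slot where the key would be inserted.
 */
static size_t toy_bin_search(const struct toy_leaf *leaf, size_t low, int key)
{
    size_t high = leaf->nritems;

    assert(low <= high);
    while (low < high) {
        size_t mid = low + (high - low) / 2;

        if (leaf->keys[mid] < key)
            low = mid + 1;
        else
            high = mid;
    }
    return low;
}

/*
 * Find the destination slot for key. When the leaf has room and the key is
 * greater than the first key, the key cannot land at slot 0 and no split is
 * needed, so the parent is "unlocked" before searching from slot 1.
 * Otherwise the parent stays locked and the search starts from slot 0.
 */
static size_t toy_find_slot(const struct toy_leaf *leaf,
                            struct toy_parent *parent, int key)
{
    size_t low = 0;

    if (leaf->nritems > 0 && leaf->nritems < LEAF_CAPACITY &&
        key > leaf->keys[0]) {
        parent->write_locked = false; /* early unlock */
        low = 1;
    }
    return toy_bin_search(leaf, low, key);
}
```

In this model, for a leaf holding keys 10, 20, 30, 40 with spare capacity, looking up key 25 releases the parent and returns slot 2, while looking up key 5 keeps the parent locked and returns slot 0, because the parent's key for this leaf would then need updating.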
9p	netfs, 9p, afs, ceph: Use folios	2021-11-10 21:16:56 +00:00
adfs	mm: require ->set_page_dirty to be explicitly wired up	2021-06-29 10:53:48 -07:00
affs	affs: use bdev_nr_sectors instead of open coding it	2021-10-18 14:43:22 -06:00
afs	afs: Fix mmap	2021-12-16 09:10:13 -08:00
autofs	autofs: fix wait name hash calculation in autofs_wait()	2021-10-20 21:09:02 -04:00
befs	isystem: ship and use stdarg.h	2021-08-19 09:02:55 +09:00
bfs	mm: require ->set_page_dirty to be explicitly wired up	2021-06-29 10:53:48 -07:00
btrfs	btrfs: try to unlock parent nodes earlier when inserting a key	2022-01-07 14:18:23 +01:00
cachefiles	for-5.16/ki_complete-2021-10-29	2021-11-01 10:17:11 -07:00
ceph	ceph: fix up non-directory creation in SGID directories	2021-12-01 17:08:27 +01:00
cifs	cifs: sanitize multiple delimiters in prepath	2021-12-17 19:16:49 -06:00
coda	coda: bump module version to 7.2	2021-11-09 10:02:51 -08:00
configfs	configfs: fix a race in configfs_lookup()	2021-08-25 07:58:49 +02:00
cramfs	cramfs: use bdev_nr_bytes instead of open coding it	2021-10-18 14:43:22 -06:00
crypto	fscrypt: improve a few comments	2021-10-25 19:11:50 -07:00
debugfs	debugfs: debugfs_create_file_size(): use IS_ERR to check for error	2021-09-21 09:09:06 +02:00
devpts
dlm	fs: dlm: avoid comms shutdown delay in release_lockspace	2021-09-01 11:29:14 -05:00
ecryptfs	mm: require ->set_page_dirty to be explicitly wired up	2021-06-29 10:53:48 -07:00
efivarfs	efivars: convert to fileattr	2021-04-12 15:04:29 +02:00
efs
erofs	erofs: fix deadlock when shrink erofs slab	2021-11-23 14:58:16 +08:00
exfat	exfat: fix incorrect loading of i_blocks for large files	2021-11-01 07:49:21 +09:00
exportfs
ext2	ext2: fix sleeping in atomic bugs on error	2021-09-22 13:05:23 +02:00
ext4	Only bug fixes and cleanups for ext4 this merge window. Of note are	2021-11-10 17:05:37 -08:00
f2fs	Update to zstd-1.4.10	2021-11-13 15:32:30 -08:00
fat	for-5.16/inode-sync-2021-10-29	2021-11-01 10:25:27 -07:00
freevxfs
fscache	fscache: Remove an unused static variable	2021-10-04 22:13:12 +01:00
fuse	fuse: release pipe buf after last use	2021-11-25 14:05:18 +01:00
gfs2	gfs2: gfs2_create_inode rework	2021-12-02 12:41:10 +01:00
hfs	Merge branch 'akpm' (patches from Andrew)	2021-11-09 10:11:53 -08:00
hfsplus	Merge branch 'akpm' (patches from Andrew)	2021-11-09 10:11:53 -08:00
hostfs	hostfs: support splice_write	2021-08-26 22:28:02 +02:00
hpfs	treewide: Replace open-coded flex arrays in unions	2021-10-18 12:28:53 -07:00
hugetlbfs	mm,hugetlb: remove mlock ulimit for SHM_HUGETLB	2021-11-09 10:02:48 -08:00
iomap	iomap: iomap_read_inline_data cleanup	2021-11-24 10:15:47 -08:00
isofs	isofs: Fix out of bound access for corrupted isofs image	2021-10-19 12:51:02 +02:00
jbd2	jbd2: add sparse annotations for add_transaction_credits()	2021-08-30 23:36:50 -04:00
jffs2	vfs: add rcu argument to ->get_acl() callback	2021-08-18 22:08:24 +02:00
jfs	Just one JFS patch	2021-11-03 09:23:25 -07:00
kernfs	Merge 5.15-rc6 into driver-core-next	2021-10-18 09:43:37 +02:00
ksmbd	ksmbd: disable SMB2_GLOBAL_CAP_ENCRYPTION for SMB 3.1.1	2021-12-17 19:19:45 -06:00
lockd	A slow cycle for nfsd: mainly cleanup, including Neil's patch dropping	2021-11-10 16:45:54 -08:00
minix	mm: require ->set_page_dirty to be explicitly wired up	2021-06-29 10:53:48 -07:00
netfs	netfs: fix parameter of cleanup()	2021-12-07 15:47:09 +00:00
nfs	NFS client bugfixes for Linux 5.16	2021-11-27 10:33:55 -08:00
nfs_common	nfs: Fix kerneldoc warning shown up by W=1	2021-10-04 22:02:17 +01:00
nfsd	NFSD: Fix READDIR buffer overflow	2021-12-18 17:11:06 -05:00
nilfs2	Merge branch 'akpm' (patches from Andrew)	2021-11-09 10:11:53 -08:00
nls
notify	fanotify: Allow users to request FAN_FS_ERROR events	2021-10-27 12:53:45 +02:00
ntfs	fs: ntfs: Limit NTFS_RW to page sizes smaller than 64k	2021-11-27 14:34:41 -08:00
ntfs3	gfs2: Fix mmap + page fault deadlocks	2021-11-02 12:25:03 -07:00
ocfs2	Merge branch 'exit-cleanups-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace	2021-11-10 16:15:54 -08:00
omfs	mm: require ->set_page_dirty to be explicitly wired up	2021-06-29 10:53:48 -07:00
openpromfs
orangefs	orangefs: three fixes from other folks...	2021-11-09 10:34:06 -08:00
overlayfs	overlayfs update for 5.16	2021-11-09 10:51:12 -08:00
proc	proc/vmcore: fix clearing user buffer by properly using clear_user()	2021-11-20 10:35:55 -08:00
pstore	pstore/blk: Use "%lu" to format unsigned long	2021-11-21 09:44:19 -08:00
qnx4	qnx4: work around gcc false positive warning bug	2021-09-21 08:36:48 -07:00
qnx6
quota	\n	2021-11-06 16:40:48 -07:00
ramfs	Merge branch 'akpm' (patches from Andrew)	2021-11-09 10:11:53 -08:00
reiserfs	\n	2021-11-06 16:40:48 -07:00
romfs
smbfs_common	cifs: Fix crash on unload of cifs_arc4.ko	2021-12-07 22:38:03 -06:00
squashfs	lib: zstd: Add kernel-specific API	2021-11-08 16:55:21 -08:00
sysfs	fs/sysfs/dir.c: replace S_IRWXU\|S_IRUGO\|S_IXUGO with 0755 sysfs_create_dir_ns()	2021-10-05 16:35:05 +02:00
sysv	sysv: use BUILD_BUG_ON instead of runtime check	2021-11-09 10:02:52 -08:00
tracefs	tracefs: Set all files to the same group ownership as the mount option	2021-12-08 08:06:40 -05:00
ubifs	fscrypt: remove fscrypt_operations::max_namelen	2021-09-20 19:32:33 -07:00
udf	udf: Fix crash after seekdir	2021-11-09 12:53:58 +01:00
ufs	isystem: ship and use stdarg.h	2021-08-19 09:02:55 +09:00
unicode	.gitignore: prefix local generated files with a slash	2021-05-02 00:43:35 +09:00
vboxsf	vboxfs: fix broken legacy mount signature checking	2021-09-27 11:26:21 -07:00
verity	fs-verity: fix signed integer overflow with i_size near S64_MAX	2021-09-22 10:56:34 -07:00
xfs	xfs: remove all COW fork extents when remounting readonly	2021-12-07 10:17:29 -08:00
zonefs	zonefs: add MODULE_ALIAS_FS	2021-12-17 16:56:35 +09:00
aio.c	aio: Fix incorrect usage of eventfd_signal_allowed()	2021-12-09 10:52:55 -08:00
anon_inodes.c	fs: add anon_inode_getfile_secure() similar to anon_inode_getfd_secure()	2021-09-19 22:35:37 -04:00
attr.c	fs: handle circular mappings correctly	2021-11-17 09:26:09 +01:00
bad_inode.c	vfs: add rcu argument to ->get_acl() callback	2021-08-18 22:08:24 +02:00
binfmt_aout.c	binfmt: a.out: Fix bogus semicolon	2021-09-05 10:15:05 -07:00
binfmt_elf_fdpic.c	coredump: Limit coredumps to a single thread group	2021-10-08 12:06:02 -05:00
binfmt_elf.c	Merge branch 'akpm' (patches from Andrew)	2021-11-09 10:11:53 -08:00
binfmt_flat.c	binfmt: remove in-tree usage of MAP_EXECUTABLE	2021-06-29 10:53:50 -07:00
binfmt_misc.c
binfmt_script.c
buffer.c	fs: simplify init_page_buffers	2021-10-18 14:43:22 -06:00
char_dev.c
compat_binfmt_elf.c
coredump.c	coredump: Limit coredumps to a single thread group	2021-10-08 12:06:02 -05:00
d_path.c	d_path: fix Kernel doc validator complaining	2021-11-06 13:30:32 -07:00
dax.c	New code for 5.15:	2021-08-31 11:13:35 -07:00
dcache.c	useful constants: struct qstr for ".."	2021-04-15 22:36:45 -04:00
direct-io.c	fs: get rid of the res2 iocb->ki_complete argument	2021-10-25 10:36:24 -06:00
drop_caches.c	fs: drop_caches: fix skipping over shadow cache inodes	2021-09-03 09:58:10 -07:00
eventfd.c	eventfd: Export eventfd_wake_count to modules	2021-09-06 07:20:56 -04:00
eventpoll.c	ARM development updates for 5.15:	2021-09-09 13:25:49 -07:00
exec.c	Merge branch 'exit-cleanups-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace	2021-11-10 16:15:54 -08:00
fcntl.c	Merge branch 'akpm' (patches from Andrew)	2021-09-03 10:08:28 -07:00
fhandle.c	switch file_open_root() to struct path	2021-04-07 13:56:43 -04:00
file_table.c
file.c	fget: clarify and improve __fget_files() implementation	2021-12-13 10:55:30 -08:00
filesystems.c	fs: simplify get_filesystem_list / get_all_fs_names	2021-08-23 01:25:40 -04:00
fs_context.c	memcg: charge fs_context and legacy_fs_context	2021-09-03 09:58:12 -07:00
fs_parser.c	namei: Standardize callers of filename_lookup()	2021-09-07 16:07:47 -04:00
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c	Various hardening fixes and cleanups for 5.16-rc1	2021-11-01 17:29:10 -07:00
fsopen.c
init.c
inode.c	fs: Remove FS_THP_SUPPORT	2021-11-17 10:36:35 -05:00
internal.h	Merge branch 'akpm' (patches from Andrew)	2021-11-09 10:11:53 -08:00
io_uring.c	io_uring: zero iocb->ki_pos for stream file types	2021-12-22 20:34:32 -07:00
io-wq.c	io-wq: drop wqe lock before creating new worker	2021-12-13 09:04:01 -07:00
io-wq.h	io_uring: optimise INIT_WQ_LIST	2021-10-19 05:49:54 -06:00
ioctl.c	New code for 5.15:	2021-08-31 11:06:32 -07:00
Kconfig	4 cifs/smb3 fixes, one for DFS reconnect, and one to begin creating common headers for server and client and the other two to rename the cifs_common directory to smbfs_common to be more consistent ie change use of the name cifs to smb which is more accurate	2021-09-12 10:10:21 -07:00
Kconfig.binfmt	binfmt: remove support for em86 (alpha only)	2021-07-25 22:33:03 -07:00
kernel_read_file.c	vfs: check fd has read access in kernel_read_file_from_fd()	2021-10-18 20:22:03 -10:00
libfs.c	libfs: Support RENAME_EXCHANGE in simple_rename()	2021-11-03 15:43:08 +01:00
locks.c	locks: remove changelog comments	2021-10-19 14:11:39 -04:00
Makefile	4 cifs/smb3 fixes, one for DFS reconnect, and one to begin creating common headers for server and client and the other two to rename the cifs_common directory to smbfs_common to be more consistent ie change use of the name cifs to smb which is more accurate	2021-09-12 10:10:21 -07:00
mbcache.c
mount.h
mpage.c
namei.c	File locking changes for v5.16	2021-11-01 09:06:53 -07:00
namespace.c	fs/mount_setattr: always cleanup mount_kattr	2021-12-30 15:12:13 -08:00
no-block.c
nsfs.c
open.c	Merge branch 'akpm' (patches from Andrew)	2021-11-06 14:08:17 -07:00
pipe.c	Revert "mm/gup: remove try_get_page(), call try_get_compound_head() directly"	2021-09-07 11:03:45 -07:00
pnode.c
pnode.h
posix_acl.c	fs/posix_acl.c: avoid -Wempty-body warning	2021-11-06 13:30:32 -07:00
proc_namespace.c
read_write.c	fs: remove leftover comments from mandatory locking removal	2021-10-26 12:20:50 -04:00
readdir.c	readdir: make sure to verify directory entry for legacy interfaces too	2021-04-17 11:39:49 -07:00
remap_range.c	fs: remove mandatory file locking support	2021-08-23 06:15:36 -04:00
select.c	Revert "memcg: enable accounting for pollfd and select bits arrays"	2021-09-07 11:26:23 -07:00
seq_file.c	seq_file: move seq_escape() to a header	2021-11-09 10:02:52 -08:00
signalfd.c	signalfd: use wake_up_pollfree()	2021-12-09 10:49:56 -08:00
splice.c
stack.c
stat.c	fs: add generic helper for filling statx attribute flags	2021-08-17 11:47:43 +02:00
statfs.c
super.c	fs: explicitly unregister per-superblock BDIs	2021-11-06 13:30:34 -07:00
sync.c	block: simplify the block device syncing code	2021-10-22 08:36:55 -06:00
timerfd.c	timerfd: Provide timerfd_resume()	2021-08-10 17:57:22 +02:00
userfaultfd.c	userfaultfd: fix a race between writeprotect and exit_mmap()	2021-10-18 20:22:02 -10:00
utimes.c
xattr.c	xattr: fix kernel-doc for mnt_userns and vfs xattr helpers	2021-03-23 11:20:26 +01:00