mainlining shenanigans
Go to file
Filipe Manana db21370bff btrfs: drop extent map range more efficiently
Currently when dropping extent maps for a file range, through
btrfs_drop_extent_map_range(), we do the following non-optimal things:

1) We lookup for extent maps one by one, always starting the search from
   the root of the extent map tree. This is not efficient if we have
   multiple extent maps in the range;

2) We check on every iteration if we have the 'split' and 'split2' spare
   extent maps in case we need to split an extent map that intersects our
   range but also crosses its boundaries (to the left, to the right or
   both cases). If our target range is for example:

       [2M, 8M)

   And we have 3 extents maps in the range:

       [1M, 3M) [3M, 6M) [6M, 10M[

   The on the first iteration we allocate two extent maps for 'split' and
   'split2', and use the 'split' to split the first extent map, so after
   the split we set 'split' to 'split2' and then set 'split2' to NULL.

   On the second iteration, we don't need to split the second extent map,
   but because 'split2' is now NULL, we allocate a new extent map for
   'split2'.

   On the third iteration we need to split the third extent map, so we
   use the extent map pointed by 'split'.

   So we ended up allocating 3 extent maps for splitting, but all we
   needed was 2 extent maps. We never need to allocate more than 2,
   because extent maps that need to be split are always the first one
   and the last one in the target range.

Improve on this by:

1) Using rb_next() to move on to the next extent map. This results in
   iterating over less nodes of the tree and it does not require comparing
   the ranges of nodes to our start/end offset;

2) Allocate the 2 extent maps for splitting before entering the loop and
   never allocate more than 2. In practice it's very rare to have the
   combination of both extent map allocations fail, since we have a
   dedicated slab for extent maps, and also have the need to split two
   extent maps.

This patch is part of a patchset comprised of the following patches:

   btrfs: fix missed extent on fsync after dropping extent maps
   btrfs: move btrfs_drop_extent_cache() to extent_map.c
   btrfs: use extent_map_end() at btrfs_drop_extent_map_range()
   btrfs: use cond_resched_rwlock_write() during inode eviction
   btrfs: move open coded extent map tree deletion out of inode eviction
   btrfs: add helper to replace extent map range with a new extent map
   btrfs: remove the refcount warning/check at free_extent_map()
   btrfs: remove unnecessary extent map initializations
   btrfs: assert tree is locked when clearing extent map from logging
   btrfs: remove unnecessary NULL pointer checks when searching extent maps
   btrfs: remove unnecessary next extent map search
   btrfs: avoid pointless extent map tree search when flushing delalloc
   btrfs: drop extent map range more efficiently

And the following fio test was done before and after applying the whole
patchset, on a non-debug kernel (Debian's default kernel config) on a 12
cores Intel box with 64G of ram:

   $ cat test.sh
   #!/bin/bash

   DEV=/dev/nvme0n1
   MNT=/mnt/nvme0n1
   MOUNT_OPTIONS="-o ssd"
   MKFS_OPTIONS="-R free-space-tree -O no-holes"

   cat <<EOF > /tmp/fio-job.ini
   [writers]
   rw=randwrite
   fsync=8
   fallocate=none
   group_reporting=1
   direct=0
   bssplit=4k/20:8k/20:16k/20:32k/10:64k/10:128k/5:256k/5:512k/5:1m/5
   ioengine=psync
   filesize=2G
   runtime=300
   time_based
   directory=$MNT
   numjobs=8
   thread
   EOF

   echo performance | \
       tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

   echo
   echo "Using config:"
   echo
   cat /tmp/fio-job.ini
   echo

   umount $MNT &> /dev/null
   mkfs.btrfs -f $MKFS_OPTIONS $DEV
   mount $MOUNT_OPTIONS $DEV $MNT

   fio /tmp/fio-job.ini

   umount $MNT

Result before applying the patchset:

   WRITE: bw=197MiB/s (206MB/s), 197MiB/s-197MiB/s (206MB/s-206MB/s), io=57.7GiB (61.9GB), run=300188-300188msec

Result after applying the patchset:

   WRITE: bw=203MiB/s (213MB/s), 203MiB/s-203MiB/s (213MB/s-213MB/s), io=59.5GiB (63.9GB), run=300019-300019msec

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-09-29 17:08:31 +02:00
arch arm64 fixes for -rc7 2022-09-23 15:28:51 -07:00
block block-6.0-2022-09-22 2022-09-24 08:22:53 -07:00
certs certs: make system keyring depend on built-in x509 parser 2022-09-24 04:31:18 +09:00
crypto crypto: blake2b: effectively disable frame size warning 2022-08-10 17:59:11 -07:00
Documentation I2C driver bugfixes for mlxbf and imx, a few documentation fixes after 2022-09-25 08:44:46 -07:00
drivers dax-and-nvdimm-fixes-v6.0-final 2022-09-25 08:53:52 -07:00
fs btrfs: drop extent map range more efficiently 2022-09-29 17:08:31 +02:00
include btrfs: stop tracking failed reads in the I/O tree 2022-09-26 12:28:05 +02:00
init arm64 fixes for -rc3 2022-08-26 11:32:53 -07:00
io_uring io_uring-6.0-2022-09-23 2022-09-24 08:27:08 -07:00
ipc Updates to various subsystems which I help look after. lib, ocfs2, 2022-08-07 10:03:24 -07:00
kernel Cgroup fixes for v6.0-rc6 2022-09-24 08:36:10 -07:00
lib Makefile.debug: re-enable debug info for .S files 2022-09-24 11:19:19 +09:00
LICENSES LICENSES/LGPL-2.1: Add LGPL-2.1-or-later as valid identifiers 2021-12-16 14:33:10 +01:00
mm mm: export balance_dirty_pages_ratelimited_flags() 2022-09-26 12:28:07 +02:00
net Including fixes from wifi, netfilter and can. 2022-09-22 10:58:13 -07:00
samples Tracing updates for 5.20 / 6.0 2022-08-05 09:41:12 -07:00
scripts Makefile.debug: re-enable debug info for .S files 2022-09-24 11:19:19 +09:00
security Landlock fix for v6.0-rc4 2022-09-02 15:24:08 -07:00
sound Revert "ALSA: usb-audio: Split endpoint setups for hw_params and prepare" 2022-09-20 13:40:18 +02:00
tools dax-and-nvdimm-fixes-v6.0-final 2022-09-25 08:53:52 -07:00
usr Not a lot of material this cycle. Many singleton patches against various 2022-05-27 11:22:03 -07:00
virt KVM: Drop unnecessary initialization of "ops" in kvm_ioctl_create_device() 2022-08-19 04:05:43 -04:00
.clang-format PCI/DOE: Add DOE mailbox support functions 2022-07-19 15:38:04 -07:00
.cocciconfig
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes
.gitignore kbuild: split the second line of *.mod into *.usyms 2022-05-08 03:16:59 +09:00
.mailmap Devicetree fixes for v6.0, take 2: 2022-09-14 10:22:39 +01:00
COPYING
CREDITS drm for 5.20/6.0 2022-08-03 19:52:08 -07:00
Kbuild
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS I2C driver bugfixes for mlxbf and imx, a few documentation fixes after 2022-09-25 08:44:46 -07:00
Makefile Linux 6.0-rc7 2022-09-25 14:01:02 -07:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.