A mirror of the official Linux kernel repository just in case
Go to file
Yury Norov e79864f316 lib/find_bit: optimize find_next_bit() functions
Over the past couple years, the function _find_next_bit() was extended
with parameters that modify its behavior to implement and- zero- and le-
flavors. The parameters are passed at compile time, but current design
prevents a compiler from optimizing out the conditionals.

As find_next_bit() API grows, I expect that more parameters will be added.
Current design would require more conditional code in _find_next_bit(),
which would bloat the helper even more and make it barely readable.

This patch replaces _find_next_bit() with a macro FIND_NEXT_BIT, and adds
a set of wrappers, so that the compile-time optimizations become possible.

The common logic is moved to the new macro, and all flavors may be
generated by providing a FETCH macro parameter, like in this example:

  #define FIND_NEXT_BIT(FETCH, MUNGE, size, start) ...

  find_next_xornot_and_bit(addr1, addr2, addr3, size, start)
  {
	return FIND_NEXT_BIT(addr1[idx] ^ ~addr2[idx] & addr3[idx],
				/* nop */, size, start);
  }

The FETCH may be of any complexity, as soon as it only refers the bitmap(s)
and an iterator idx.

MUNGE is here to support _le code generation for BE builds. May be
empty.

I ran find_bit_benchmark 16 times on top of 6.0-rc2 and 16 times on top
of 6.0-rc2 + this series. The results for kvm/x86_64 are:

                      v6.0-rc2  Optimized       Difference  Z-score
Random dense bitmap         ns         ns        ns      %
find_next_bit:          787735     670546    117189   14.9     3.97
find_next_zero_bit:     777492     664208    113284   14.6    10.51
find_last_bit:          830925     687573    143352   17.3     2.35
find_first_bit:        3874366    3306635    567731   14.7     1.84
find_first_and_bit:   40677125   37739887   2937238    7.2     1.36
find_next_and_bit:      347865     304456     43409   12.5     1.35

Random sparse bitmap
find_next_bit:           19816      14021      5795   29.2     6.10
find_next_zero_bit:    1318901    1223794     95107    7.2     1.41
find_last_bit:           14573      13514      1059    7.3     6.92
find_first_bit:        1313321    1249024     64297    4.9     1.53
find_first_and_bit:       8921       8098       823    9.2     4.56
find_next_and_bit:        9796       7176      2620   26.7     5.39

Where the statistics is significant (z-score > 3), the improvement
is ~15%.

According to the bloat-o-meter, the Image size is 10-11K less:

x86_64/defconfig:
add/remove: 32/14 grow/shrink: 61/782 up/down: 6344/-16521 (-10177)

arm64/defconfig:
add/remove: 3/2 grow/shrink: 50/714 up/down: 608/-11556 (-10948)

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-09-21 12:21:32 -07:00
arch powerpc/64: don't refer nr_cpu_ids in asm code when it's undefined 2022-09-20 16:11:44 -07:00
block block-6.0-2022-08-26 2022-08-26 11:05:54 -07:00
certs Kbuild updates for v5.20 2022-08-10 10:40:41 -07:00
crypto crypto: blake2b: effectively disable frame size warning 2022-08-10 17:59:11 -07:00
Documentation Input updates for v6.0-rc3 2022-09-03 13:09:46 -07:00
drivers s390: 2022-09-04 11:27:14 -07:00
fs 5 cifs/smb3 fixes, all also for stable 2022-09-02 16:20:24 -07:00
include lib/find_bit: optimize find_next_bit() functions 2022-09-21 12:21:32 -07:00
init arm64 fixes for -rc3 2022-08-26 11:32:53 -07:00
io_uring io_uring-6.0-2022-09-02 2022-09-02 16:37:01 -07:00
ipc Updates to various subsystems which I help look after. lib, ocfs2, 2022-08-07 10:03:24 -07:00
kernel lib/cpumask: add FORCE_NR_CPUS config option 2022-09-20 16:11:44 -07:00
lib lib/find_bit: optimize find_next_bit() functions 2022-09-21 12:21:32 -07:00
LICENSES LICENSES/LGPL-2.1: Add LGPL-2.1-or-later as valid identifiers 2021-12-16 14:33:10 +01:00
mm mm: pagewalk: Fix race between unmap and page walker 2022-09-03 10:13:13 -07:00
net net/smc: Remove redundant refcount increase 2022-09-01 10:04:45 +02:00
samples Tracing updates for 5.20 / 6.0 2022-08-05 09:41:12 -07:00
scripts Makefile.extrawarn: re-enable -Wformat for clang; take 2 2022-09-04 11:15:50 -07:00
security Landlock fix for v6.0-rc4 2022-09-02 15:24:08 -07:00
sound ALSA: usb-audio: Add quirk for LH Labs Geek Out HD Audio 1V5 2022-08-28 09:42:14 +02:00
tools s390: 2022-09-04 11:27:14 -07:00
usr Not a lot of material this cycle. Many singleton patches against various 2022-05-27 11:22:03 -07:00
virt KVM: Drop unnecessary initialization of "ops" in kvm_ioctl_create_device() 2022-08-19 04:05:43 -04:00
.clang-format PCI/DOE: Add DOE mailbox support functions 2022-07-19 15:38:04 -07:00
.cocciconfig
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes
.gitignore kbuild: split the second line of *.mod into *.usyms 2022-05-08 03:16:59 +09:00
.mailmap .mailmap: update Luca Ceresoli's e-mail address 2022-08-28 14:02:46 -07:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS drm for 5.20/6.0 2022-08-03 19:52:08 -07:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS gpio fixes for v6.0-rc4 2022-09-03 21:27:27 -07:00
Makefile Linux 6.0-rc4 2022-09-04 13:10:01 -07:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.