linux/include/asm-generic
Yury Norov 277a20a498 lib: add fast path for find_next_*_bit()
Similarly to bitmap functions, find_next_*_bit() users will benefit if
we'll handle a case of bitmaps that fit into a single word inline.  In the
very best case, the compiler may replace a function call with a few
instructions.

This is the quite typical find_next_bit() user:

	unsigned int cpumask_next(int n, const struct cpumask *srcp)
	{
		/* -1 is a legal arg here. */
		if (n != -1)
			cpumask_check(n);
		return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
	}
	EXPORT_SYMBOL(cpumask_next);

Currently, on ARM64 the generated code looks like this:
	0000000000000000 <cpumask_next>:
	   0:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
	   4:   11000402        add     w2, w0, #0x1
	   8:   aa0103e0        mov     x0, x1
	   c:   d2800401        mov     x1, #0x40                       // #64
	  10:   910003fd        mov     x29, sp
	  14:   93407c42        sxtw    x2, w2
	  18:   94000000        bl      0 <find_next_bit>
	  1c:   a8c17bfd        ldp     x29, x30, [sp], #16
	  20:   d65f03c0        ret
	  24:   d503201f        nop

After applying this patch:
	0000000000000140 <cpumask_next>:
	 140:   11000400        add     w0, w0, #0x1
	 144:   93407c00        sxtw    x0, w0
	 148:   f100fc1f        cmp     x0, #0x3f
	 14c:   54000168        b.hi    178 <cpumask_next+0x38>  // b.pmore
	 150:   f9400023        ldr     x3, [x1]
	 154:   92800001        mov     x1, #0xffffffffffffffff         // #-1
	 158:   9ac02020        lsl     x0, x1, x0
	 15c:   52800802        mov     w2, #0x40                       // #64
	 160:   8a030001        and     x1, x0, x3
	 164:   dac00020        rbit    x0, x1
	 168:   f100003f        cmp     x1, #0x0
	 16c:   dac01000        clz     x0, x0
	 170:   1a800040        csel    w0, w2, w0, eq  // eq = none
	 174:   d65f03c0        ret
	 178:   52800800        mov     w0, #0x40                       // #64
	 17c:   d65f03c0        ret

find_next_bit() call is replaced with 6 instructions.  find_next_bit()
itself is 41 instructions plus function call overhead.

Despite inlining, the scripts/bloat-o-meter report smaller .text size
after applying the series:
	add/remove: 11/9 grow/shrink: 233/176 up/down: 5780/-6768 (-988)

Link: https://lkml.kernel.org/r/20210401003153.97325-10-yury.norov@gmail.com
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Alexey Klimov <aklimov@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Sterba <dsterba@suse.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jianpeng Ma <jianpeng.ma@intel.com>
Cc: Joe Perches <joe@perches.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-06 19:24:12 -07:00
..
bitops lib: add fast path for find_next_*_bit() 2021-05-06 19:24:12 -07:00
vdso lib/vdso: Avoid highres update if clocksource is not VDSO capable 2020-02-17 20:12:17 +01:00
asm-offsets.h
asm-prototypes.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
atomic64.h locking/atomic: Use s64 for atomic64 2019-06-03 12:32:56 +02:00
atomic-instrumented.h locking/atomics: Regenerate the atomics-check SHA1's 2020-11-07 13:20:41 +01:00
atomic-long.h asm-generic/atomic: Use __always_inline for pure wrappers 2020-01-07 07:47:23 -08:00
atomic.h locking/atomic: Move ATOMIC_INIT into linux/types.h 2020-07-29 16:14:18 +02:00
audit_change_attr.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
audit_dir_write.h audit: Avoid build failures on systems without renameat 2018-01-30 19:07:54 -08:00
audit_read.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
audit_signal.h
audit_write.h audit/stable-4.15 PR 20171113 2017-11-15 13:28:48 -08:00
barrier.h compiler.h: fix barrier_data() on clang 2020-11-14 11:26:03 -08:00
bitops.h asm-generic/bitops: Update stale comment 2020-03-06 11:06:19 +01:00
bitsperlong.h lib: extend the scope of small_const_nbits() macro 2021-05-06 19:24:11 -07:00
bug.h add support for Clang CFI 2021-04-08 16:04:20 -07:00
bugs.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
cache.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
cacheflush.h make asm-generic/cacheflush.h more standalone 2020-06-26 00:27:37 -07:00
checksum.h unify generic instances of csum_partial_copy_nocheck() 2020-08-20 15:45:14 -04:00
cmpxchg-local.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
cmpxchg.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
compat.h compat: lift compat_s64 and compat_u64 to <asm-generic/compat.h> 2020-09-17 13:00:46 -04:00
current.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
delay.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
device.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428 2019-06-05 17:37:16 +02:00
div64.h div64: Correct inline documentation for `do_div' 2021-04-21 13:45:36 +02:00
dma-mapping.h dma-mapping: bypass indirect calls for dma-direct 2018-12-13 21:06:18 +01:00
dma.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
early_ioremap.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
emergency-restart.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
error-injection.h treewide: Convert macro and uses of __section(foo) to __section("foo") 2020-10-25 14:51:49 -07:00
exec.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
export.h asm-generic: export: Stub EXPORT_SYMBOL with __DISABLE_EXPORTS 2021-02-03 16:42:57 +00:00
extable.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
fb.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
fixmap.h mm: introduce common STRUCT_PAGE_MAX_SHIFT define 2018-12-14 15:05:45 -08:00
flat.h binfmt_flat: remove the persistent argument from flat_get_addr_from_rp 2019-06-24 09:16:47 +10:00
ftrace.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
futex.h generic arch_futex_atomic_op_inuser() doesn't need access_ok() 2020-03-27 23:58:55 -04:00
getorder.h asm-generic: force inlining of get_order() to work around gcc10 poor decision 2020-12-15 22:46:15 -08:00
gpio.h gpio: Avoid kernel.h inclusion where it's possible 2020-02-10 12:58:36 +01:00
hardirq.h irqstat: Move declaration into asm-generic/hardirq.h 2020-11-23 10:31:06 +01:00
hugetlb.h mm: Allow arches to provide ptep_get() 2020-06-20 22:14:53 +10:00
hw_irq.h
hyperv-tlfs.h x86/Hyper-V: Support for free page reporting 2021-03-24 11:35:24 +00:00
ide_iops.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
int-ll64.h int-ll64.h: define u{8,16,32,64} and s{8,16,32,64} based on uapi header 2018-06-07 17:34:38 -07:00
io.h asm-generic/io.h: Unbork ioremap_np() declaration 2021-04-09 08:48:27 +02:00
ioctl.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
iomap.h asm-generic/io.h: Add a non-posted variant of ioremap() 2021-04-08 20:18:38 +09:00
irq_regs.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
irq_work.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
irq.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
irqflags.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
Kbuild Rework of the X86 irq stack handling: 2021-02-24 16:32:23 -08:00
kdebug.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
kmap_size.h mm/highmem: Provide and use CONFIG_DEBUG_KMAP_LOCAL 2020-11-24 14:42:08 +01:00
kprobes.h treewide: Convert macro and uses of __section(foo) to __section("foo") 2020-10-25 14:51:49 -07:00
kvm_para.h KVM: Introduce paravirtualization hints and KVM_HINTS_DEDICATED 2018-03-06 18:40:44 +01:00
kvm_types.h KVM: Move x86's version of struct kvm_mmu_memory_cache to common code 2020-07-09 13:29:42 -04:00
linkage.h
local64.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
local.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mcs_spinlock.h
memory_model.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mm_hooks.h mm: remove arch_bprm_mm_init() hook 2020-01-23 10:41:16 -08:00
mmiowb_types.h asm-generic/mmiowb: Add generic implementation of mmiowb() tracking 2019-04-08 11:59:39 +01:00
mmiowb.h asm-generic/mmiowb: Allow mmiowb_set_pending() when preemptible() 2020-07-17 10:02:03 +01:00
mmu_context.h asm-generic: add generic MMU versions of mmu context functions 2020-10-26 16:45:03 +01:00
mmu.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
module.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
module.lds.h kbuild: preprocess module linker script 2020-09-25 00:36:41 +09:00
mshyperv.h drivers: hv: Create a consistent pattern for checking Hyper-V hypercall status 2021-04-21 09:49:19 +00:00
msi.h Generic interrupt and irqchips subsystem: 2020-12-15 15:03:31 -08:00
nommu_context.h asm-generic: add generic MMU versions of mmu context functions 2020-10-26 16:45:03 +01:00
numa.h numa: Move numa implementation to common code 2021-01-14 15:08:55 -08:00
page.h c6x: remove architecture 2021-01-20 09:30:45 +01:00
param.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
parport.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
pci_iomap.h mn10300: Remove the architecture 2018-03-09 23:19:56 +01:00
pci.h PCI: remove PCI_DMA_BUS_IS_PHYS 2018-05-07 07:15:41 +02:00
percpu.h asm-generic: percpu: avoid Wshadow warning 2020-10-26 23:54:48 +00:00
pgalloc.h asm-generic: pgalloc.h: use correct #ifdef to enable pud_alloc_one() 2020-08-14 19:56:55 -07:00
pgtable_uffd.h userfaultfd: wp: add pmd_swp_*uffd_wp() helpers 2020-04-07 10:43:39 -07:00
pgtable-nop4d.h asm-generic/tlb: stub out p4d_free_tlb() if nop4d ... 2019-12-01 06:29:19 -08:00
pgtable-nopmd.h mm: consolidate pte_index() and pte_offset_*() definitions 2020-06-09 09:39:14 -07:00
pgtable-nopud.h mm: consolidate pte_index() and pte_offset_*() definitions 2020-06-09 09:39:14 -07:00
preempt.h sched/preempt: Use CONFIG_PREEMPTION where appropriate 2019-07-31 19:03:34 +02:00
qrwlock_types.h locking/qrwlock: include asm/byteorder.h as needed 2018-02-06 10:28:58 +01:00
qrwlock.h locking/arch: Move qrwlock.h include after qspinlock.h 2021-02-11 07:59:54 -05:00
qspinlock_types.h locking/qspinlock: Do not include atomic.h from qspinlock_types.h 2020-07-29 16:14:19 +02:00
qspinlock.h qspinlock: use signed temporaries for cmpxchg 2020-10-26 20:19:48 +01:00
resource.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
rwonce.h asm/rwonce: Don't pull <asm/barrier.h> into 'asm-generic/rwonce.h' 2020-07-21 10:50:36 +01:00
seccomp.h seccomp: Use -1 marker for end of mode 1 syscall list 2020-07-10 16:01:52 -07:00
sections.h sections.h: dereference_function_descriptor() returns void pointer 2020-08-11 12:06:15 +02:00
serial.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
set_memory.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
shmparam.h treewide: remove SPDX "WITH Linux-syscall-note" from kernel-space headers 2019-05-14 19:52:48 -07:00
signal.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
simd.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
softirq_stack.h softirq: Move do_softirq_own_stack() to generic asm header 2021-02-10 23:34:16 +01:00
spinlock.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
statfs.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
string.h
switch_to.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
syscall.h audit: Migrate to use SYSCALL_WORK flag 2020-11-16 21:53:16 +01:00
syscalls.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
termios-base.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
termios.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
timex.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
tlb.h tlb: mmu_gather: Introduce tlb_gather_mmu_fullmm() 2021-01-29 20:02:29 +01:00
tlbflush.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
topology.h include/asm-generic/topology.h: guard cpumask_of_node() macro argument 2020-05-28 11:35:41 -07:00
trace_clock.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
uaccess.h asm-generic: mark __{get,put}_user_fn as __always_inline 2020-10-27 16:13:09 +01:00
unaligned.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
user.h
vermagic.h arch: split MODULE_ARCH_VERMAGIC definitions out to <asm/vermagic.h> 2020-04-23 10:50:26 +09:00
vga.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
vmlinux.lds.h add support for Clang CFI 2021-04-08 16:04:20 -07:00
vtime.h
word-at-a-time.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
xor.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 47 2019-05-24 17:27:13 +02:00