linux/arch/arm/include/asm
Russell King 87067a935a ARM: Optimize multi-CPU tlb flushing a little more
The compiler does not conditionalize the assembly instructions for
the tlb operations, which leads to sub-optimal code being generated
when building a kernel for multiple CPUs.

We can tweak things fairly simply as the code fragment below shows:

    17f8:       e3120001        tst     r2, #1  ; 0x1
...
    1800:       0a000000        beq     1808 <handle_pte_fault+0x194>
    1804:       ee061f10        mcr     15, 0, r1, cr6, cr0, {0}
    1808:       e3120004        tst     r2, #4  ; 0x4
    180c:       0a000000        beq     1814 <handle_pte_fault+0x1a0>
    1810:       ee081f36        mcr     15, 0, r1, cr8, cr6, {1}
becomes:
    17f0:       e3120001        tst     r2, #1  ; 0x1
    17f4:       1e063f10        mcrne   15, 0, r3, cr6, cr0, {0}
    17f8:       e3120004        tst     r2, #4  ; 0x4
    17fc:       1e083f36        mcrne   15, 0, r3, cr8, cr6, {1}

Overall, for Realview with V6 and V7 CPUs configured:

   text    data     bss     dec     hex filename
4153998  207340 5371036 9732374  948116 ../build/realview/vmlinux.before
4153366  207332 5371036 9731734  947e96 ../build/realview/vmlinux.after

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-24 09:38:52 +00:00
..
hardware Merge branch 'restart' into for-linus 2012-01-05 13:25:27 +00:00
mach Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci 2012-01-11 18:50:26 -08:00
a.out-core.h ARM: 6798/1: aout-core: zero thread debug registers in a.out core dump 2011-03-10 15:16:29 +00:00
a.out.h headers_check fix: arm, a.out.h 2009-02-01 11:01:22 +05:30
asm-offsets.h kbuild: move asm-offsets.h to include/generated 2009-12-12 13:08:14 +01:00
assembler.h ARM: LPAE: add ISBs around MMU enabling code 2011-12-08 10:30:38 +00:00
atomic.h atomic: cleanup asm-generic atomic*.h inclusion 2011-07-26 16:49:47 -07:00
bitops.h asm-generic: add another generic ext2 atomic bitops 2011-07-26 16:49:46 -07:00
bug.h ARM: 7267/1: Remove BUILD_BUG_ON from asm/bug.h 2012-01-05 13:23:22 +00:00
bugs.h
byteorder.h byteorder: make swab.h include asm/swab.h like a regular header 2009-01-14 19:56:50 -08:00
cache.h ARM: implement support for read-mostly sections 2010-12-05 08:39:36 +00:00
cacheflush.h Merge branch 'v6v7' into devel 2011-03-16 23:35:26 +00:00
cachetype.h ARM: 7062/1: cache: detect PIPT I-cache using CTR 2011-10-17 09:13:41 +01:00
checksum.h
clkdev.h ARM: Consolidate the clkdev header files 2011-07-19 18:09:45 +02:00
cp15.h ARM: move CP15 definitions to separate header file 2012-03-24 09:38:51 +00:00
cpu.h ARM: 5872/1: ARM: include needed linux/cpu.h in asm/cpu.h 2010-01-10 13:03:52 +00:00
cputype.h ARM: 7011/1: Add ARM cpu topology definition 2011-10-17 09:02:43 +01:00
cti.h arm: introduce cross trigger interface helpers 2011-12-02 15:16:33 +00:00
current.h
delay.h
device.h Merge branch 'next/cleanup' of git://git.linaro.org/people/arnd/arm-soc 2011-11-01 20:11:00 -07:00
div64.h
dma-mapping.h Merge branch 'devel-stable' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm 2011-10-28 12:02:27 -07:00
dma.h locking, ARM: Annotate low level hw locks as raw 2011-09-13 11:12:14 +02:00
domain.h ARM: 6384/1: Remove the domain switching on ARMv6k/v7 CPUs 2010-11-04 15:44:31 +00:00
ecard.h ARM: io: ecard: move ioaddr() inside __ecard_address 2011-08-17 08:44:16 +01:00
edac.h ARM: 7201/1: add EDAC atomic_scrub function 2011-12-11 08:35:50 +00:00
elf.h ARM: 7294/1: vectors: use gate_vma for vectors user mapping 2012-03-24 09:38:51 +00:00
entry-macro-multi.S ARM: gic: consolidate PPI handling 2011-10-23 13:32:29 +01:00
exception.h ARM: 7115/4: move __exception and friends to asm/exception.h 2011-10-17 09:02:44 +01:00
fb.h
fcntl.h
fiq.h ARM: 6940/1: fiq: Briefly document driver responsibilities for suspend/resume 2011-05-26 10:31:06 +01:00
fixmap.h [ARM] fixmap support 2009-03-15 21:01:20 -04:00
flat.h flat: fix data sections alignment 2009-05-29 08:40:02 -07:00
floppy.h
fncpy.h ARM: 6640/1: Thumb-2: Symbol manipulation macros for function body copying 2011-01-27 11:48:58 +00:00
fpstate.h Fix common misspellings 2011-03-31 11:26:23 -03:00
ftrace.h ARM: 6319/1: ftrace: add Thumb-2 support to dynamic ftrace 2010-09-02 15:28:43 +01:00
futex.h ARM: 7099/1: futex: preserve oldval in SMP __futex_atomic_op 2011-09-26 12:36:47 +01:00
glue-cache.h Fix common misspellings 2011-03-31 11:26:23 -03:00
glue-df.h ARM: move cache/processor/fault glue to separate include files 2011-02-12 11:52:21 +00:00
glue-pf.h ARM: move cache/processor/fault glue to separate include files 2011-02-12 11:52:21 +00:00
glue-proc.h Merge branch 'v6v7' into devel 2011-03-16 23:35:26 +00:00
glue.h Fix common misspellings 2011-03-31 11:26:23 -03:00
gpio.h ARM: 7271/1: Fix typo in conversion of ARCH_NR_GPIOS to Kconfig 2012-01-08 09:27:19 +00:00
hardirq.h ARM: 7140/1: remove NR_IRQS dependency for ARM-specific HARDIRQ_BITS definition 2011-12-06 11:14:01 +00:00
highmem.h ARM: 6639/1: allow highmem on SMP platforms without h/w TLB ops broadcast 2011-02-23 17:24:17 +00:00
hw_breakpoint.h ARM: hw_breakpoint: add support for multiple watchpoints 2011-08-31 10:42:48 +01:00
hw_irq.h arm: dove: Use proper irq accessor functions 2011-03-29 14:47:57 +02:00
hwcap.h UAPI: Split trivial #if defined(__KERNEL__) && X conditionals 2011-12-13 15:07:49 +00:00
ide.h
idmap.h ARM: SMP: use idmap_pgd for mapping MMU enable during secondary booting 2011-12-06 14:04:15 +00:00
io.h arm: switch to GENERIC_PCI_IOMAP 2011-11-28 21:13:06 +02:00
ioctls.h ioctl: Use asm-generic/ioctls.h on arm (enables termiox) 2010-10-22 10:19:59 -07:00
ipcbuf.h consolidate a bunch of ipcbuf.h instances 2012-01-03 22:55:18 -05:00
irq.h ARM: introduce handle_IRQ() not to dump exception stack 2011-07-12 19:42:40 +08:00
irqflags.h Fix IRQ flag handling naming 2010-10-07 14:08:55 +01:00
Kbuild ARM: 7006/1: Migrate to asm-generic wrapper support 2011-10-17 09:12:40 +01:00
kexec.h [ARM] add machine-specific hook to machine_kexec 2011-03-03 16:26:55 -05:00
kgdb.h kgdb,arm: fix register dump 2010-10-29 13:14:40 -05:00
kmap_types.h kdb: core for kgdb back end (2 of 2) 2010-05-20 21:04:21 -05:00
kprobes.h Kernel: Audit Support For The ARM Platform 2012-01-17 16:17:01 -05:00
leds.h
limits.h
linkage.h
localtimer.h Merge branch 'dt/gic' into next/dt 2011-10-31 14:08:10 +01:00
locks.h
mach-types.h arm: move mach-types to include/generated 2009-12-12 13:08:14 +01:00
mc146818rtc.h
memblock.h ARM: Add arm_memblock_steal() to allocate memory away from the kernel 2012-01-13 15:02:35 +00:00
memory.h ARM: switch from NO_MACH_MEMORY_H to NEED_MACH_MEMORY_H 2011-10-13 12:53:53 -04:00
mman.h arm: add arch_mmap_check(), get rid of sys_arm_mremap() 2009-12-11 06:34:09 -05:00
mmu_context.h ARM: 7294/1: vectors: use gate_vma for vectors user mapping 2012-03-24 09:38:51 +00:00
mmu.h locking, ARM: Annotate low level hw locks as raw 2011-09-13 11:12:14 +02:00
module.h ARM: 7013/1: P2V: Remove ARM_PATCH_PHYS_VIRT_16BIT 2011-08-13 11:26:40 +01:00
msgbuf.h
mtd-xip.h [ARM] move asm/xip.h's mach/hardware.h include to mach/xip.h 2008-12-14 13:22:51 +00:00
mutex.h
nwflash.h
opcodes.h ARM: 7311/1: Add generic instruction opcode manipulation helpers 2012-03-24 09:38:51 +00:00
outercache.h ARM: 7114/1: cache-l2x0: add resume entry for l2 in secure mode 2011-10-17 09:11:51 +01:00
page-nommu.h nommu: Remove the memory_start/end variables from ARM page-nommu.h 2009-07-24 12:35:01 +01:00
page.h ARM: 7294/1: vectors: use gate_vma for vectors user mapping 2012-03-24 09:38:51 +00:00
param.h
parport.h
pci.h PCI: ARM: convert pcibios_set_master() to a non-inlined function 2012-01-06 12:10:36 -08:00
perf_event.h ARM: perf: remove unused armpmu_get_max_events 2011-12-02 15:16:25 +00:00
pgalloc.h ARM: LPAE: Page table maintenance for the 3-level format 2011-12-08 10:30:39 +00:00
pgtable-2level-hwdef.h ARM: 7077/1: LPAE: Use a mask for physical addresses in page table entries 2011-10-06 15:40:06 +01:00
pgtable-2level-types.h ARM: 7076/1: LPAE: Add (pte|pmd)val_t type definitions as u32 2011-10-06 15:40:05 +01:00
pgtable-2level.h ARM: LPAE: Move page table maintenance macros to pgtable-2level.h 2011-12-08 10:30:37 +00:00
pgtable-3level-hwdef.h ARM: LPAE: Introduce the 3-level page table format definitions 2011-12-08 10:30:39 +00:00
pgtable-3level-types.h ARM: LPAE: Introduce the 3-level page table format definitions 2011-12-08 10:30:39 +00:00
pgtable-3level.h ARM: LPAE: Page table maintenance for the 3-level format 2011-12-08 10:30:39 +00:00
pgtable-hwdef.h ARM: LPAE: Introduce the 3-level page table format definitions 2011-12-08 10:30:39 +00:00
pgtable-nommu.h ARM: 5988/1: pgprot_dmacoherent() for non-mmu builds 2010-03-13 10:48:22 +00:00
pgtable.h Merge branch 'devel-stable' into for-linus 2012-01-05 13:24:33 +00:00
pmu.h arm: pmu: allow platform specific irq enable/disable handling 2011-12-02 15:16:33 +00:00
posix_types.h
proc-fns.h ARM: LPAE: Page table maintenance for the 3-level format 2011-12-08 10:30:39 +00:00
processor.h ARM: 7169/1: topdown mmap support 2011-12-06 11:15:25 +00:00
procinfo.h
prom.h ARM: prom.h: Fix build error by removing unneeded header file 2012-01-04 23:47:52 -07:00
ptrace.h Kernel: Audit Support For The ARM Platform 2012-01-17 16:17:01 -05:00
scatterlist.h ARM: Allow SoCs to enable scatterlist chaining 2011-06-02 11:16:22 +01:00
sched_clock.h ARM: 7205/2: sched_clock: allow sched_clock to be selected at runtime 2011-12-18 23:00:26 +00:00
seccomp.h ARM: SECCOMP support 2010-10-01 22:32:18 -04:00
segment.h
sembuf.h
serial.h
setup.h ARM: Allow Kconfig to control the definition of NR_BANKS 2011-12-13 08:52:02 +00:00
shmbuf.h
shmparam.h
sigcontext.h
signal.h asm-generic: rename termios.h, signal.h and mman.h 2009-06-11 21:01:52 +02:00
smp_plat.h Merge branches 'at91', 'dcache', 'ftrace', 'hwbpt', 'misc', 'mmci', 's3c', 'st-ux' and 'unwind' into devel 2010-10-18 22:34:25 +01:00
smp_scu.h ARM: pm: add function to set WFI low-power mode for SMP CPUs 2011-02-11 12:29:18 +00:00
smp_twd.h ARM: gic, local timers: use the request_percpu_irq() interface 2011-10-23 13:32:33 +01:00
smp.h ARM: gic: consolidate PPI handling 2011-10-23 13:32:29 +01:00
socket.h net: add wireless TX status socket option 2011-11-09 16:01:02 -05:00
sockios.h
sparsemem.h
spinlock_types.h locking: Convert raw_rwlock to arch_rwlock 2009-12-14 23:55:32 +01:00
spinlock.h ARM: 6939/1: fix missing 'cpu_relax()' declaration 2011-05-23 17:19:26 +01:00
stackprotector.h ARM: initial stack protector (-fstack-protector) support 2010-06-14 21:31:00 -04:00
stacktrace.h [ARM] 5382/1: unwind: Reorganise the stacktrace support 2009-02-12 13:21:17 +00:00
stat.h
statfs.h
string.h
suspend.h ARM: pm: preallocate a page table for suspend/resume 2011-09-20 23:33:36 +01:00
swab.h Merge branch 'for-next' of git://git.infradead.org/users/dhowells/linux-headers 2012-01-14 18:03:30 -08:00
system.h ARM: move CP15 definitions to separate header file 2012-03-24 09:38:51 +00:00
tcm.h ARM: 6985/1: export functions to determine the presence of I/DTCM 2011-07-06 20:49:45 +01:00
termbits.h tty: Add EXTPROC support for LINEMODE 2010-08-10 13:47:39 -07:00
termios.h
therm.h
thread_info.h Kernel: Audit Support For The ARM Platform 2012-01-17 16:17:01 -05:00
thread_notify.h ARM: 6867/1: Introduce THREAD_NOTIFY_COPY for copy_thread() hooks 2011-04-10 21:13:36 +01:00
timex.h
tlb.h ARM: LPAE: Invalidate the TLB before freeing the PMD 2011-12-08 10:30:40 +00:00
tlbflush.h ARM: Optimize multi-CPU tlb flushing a little more 2012-03-24 09:38:52 +00:00
tls.h ARM: v6k: select TLS register code according to V6 variants 2011-02-02 21:23:29 +00:00
topology.h ARM: 7182/1: ARM cpu topology: fix warning 2011-11-30 23:55:21 +00:00
traps.h ARM: earlier initialization of vectors page 2012-01-23 10:24:11 +00:00
types.h consolidate umode_t declarations 2012-01-03 22:55:17 -05:00
uaccess.h ARM: 6384/1: Remove the domain switching on ARMv6k/v7 CPUs 2010-11-04 15:44:31 +00:00
ucontext.h Fix common misspellings 2011-03-31 11:26:23 -03:00
unaligned.h
unified.h ARM: make BSYM macro assembly only 2012-01-16 08:56:25 -06:00
unistd.h UAPI: Split trivial #if defined(__KERNEL__) && X conditionals 2011-12-13 15:07:49 +00:00
unwind.h ARM: 7187/1: fix unwinding for XIP kernels 2011-12-06 11:16:13 +00:00
user.h ARM: 6798/1: aout-core: zero thread debug registers in a.out core dump 2011-03-10 15:16:29 +00:00
vfp.h
vfpmacros.h ARM: 6203/1: Make VFPv3 usable on ARMv6 2010-07-09 14:41:34 +01:00
vga.h ARM: set vga memory base at run-time 2011-07-12 11:19:29 -05:00
xor.h