linux/arch
Vineet Gupta 2e22502c08 ARC: dw2 unwind: Remove falllback linear search thru FDE entries
Fixes STAR 9000953410: "perf callgraph profiling causing RCU stalls"

| perf record -g -c 15000 -e cycles /sbin/hackbench
|
| INFO: rcu_preempt self-detected stall on CPU
| 1: (1 GPs behind) idle=609/140000000000002/0 softirq=2914/2915 fqs=603
| Task dump for CPU 1:

The in-kernel DWARF unwinder has a fast binary lookup and a fallback linear
search, which iterates through each of ~11K entries and thus takes about
three orders of magnitude longer (~3 million cycles vs. 2000). Routines
written in hand assembler lack DWARF info (as we don't support assembler
CFI pseudo-ops yet), so they fail the unwinder's binary lookup, fall
through to the linear search, and fail there as well.

However, the linear search is pointless, as the binary lookup tables are
created from the same FDE entries in the first place. It is impossible for
the binary lookup to fail while the linear search succeeds. It is a pure
waste of cycles and is thus removed by this patch.
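
To illustrate the argument above, here is a minimal sketch in plain C, not
the actual unwinder code. The types and helpers (struct fde, struct
hdr_entry, build_table(), lookup_binary(), lookup_linear()) are hypothetical
names made up for this example, and it assumes FDE address ranges do not
overlap. Since the sorted table is built from every FDE, any PC the linear
scan could resolve is also resolved by the binary search.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one FDE: covers PCs in [start, start + len). */
struct fde {
	unsigned long start;
	unsigned long len;
};

/* Hypothetical lookup-table entry: start address plus a link to its FDE. */
struct hdr_entry {
	unsigned long start;
	const struct fde *fde;
};

static int cmp_entry(const void *a, const void *b)
{
	const struct hdr_entry *x = a, *y = b;

	return (x->start > y->start) - (x->start < y->start);
}

/* Build the sorted table from every FDE; no entry is left out. */
static void build_table(const struct fde *fdes, size_t n,
			struct hdr_entry *tbl)
{
	assert(fdes && tbl);
	for (size_t i = 0; i < n; i++) {
		tbl[i].start = fdes[i].start;
		tbl[i].fde = &fdes[i];
	}
	qsort(tbl, n, sizeof(*tbl), cmp_entry);
}

/* Fast path: O(log n) search for the last entry with start <= pc. */
static const struct fde *lookup_binary(const struct hdr_entry *tbl, size_t n,
				       unsigned long pc)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tbl[mid].start <= pc)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo == 0)
		return NULL;	/* pc is below the lowest start address */

	const struct fde *f = tbl[lo - 1].fde;

	return (pc - f->start < f->len) ? f : NULL;
}

/* Slow path of the kind being removed: O(n) scan over the same FDEs. */
static const struct fde *lookup_linear(const struct fde *fdes, size_t n,
				       unsigned long pc)
{
	for (size_t i = 0; i < n; i++)
		if (pc >= fdes[i].start && pc - fdes[i].start < fdes[i].len)
			return &fdes[i];
	return NULL;
}
```

For a PC in a routine with no FDE (the hand-assembler case), both lookups
return NULL; the linear scan only costs more before reaching the same answer.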

This manifested as RCU stalls / an NMI watchdog splat when running
hackbench under perf with callgraph profiling. The triggering condition
was a perf counter overflowing in a routine lacking DWARF info (like
memset), leading to the pathetic 3 million cycle unwinder slow path. By
the time it returned, new interrupts (Timer, IPI) were already pending and
were taken right away. The original memset didn't make forward progress,
and the system kept accruing more interrupts and more unwinder delays in a
vicious feedback loop, ultimately triggering the NMI diagnostic.

Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-23 21:36:49 +05:30
..
alpha mm: mlock: add mlock flags to enable VM_LOCKONFAULT usage 2015-11-05 19:34:48 -08:00
arc ARC: dw2 unwind: Remove falllback linear search thru FDE entries 2015-11-23 21:36:49 +05:30
arm ARM: SoC defconfig updates for v4.4 2015-11-10 15:08:32 -08:00
arm64 arm64 fixes and clean-ups: 2015-11-12 15:33:11 -08:00
avr32 dmaengine updates for 4.4-rc1 2015-11-10 10:05:17 -08:00
blackfin Merge branch 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile 2015-10-04 16:31:13 +01:00
c6x irqdomain: Use irq_domain_get_of_node() instead of direct field access 2015-10-13 19:01:23 +02:00
cris cris: Drop reference to get_cmos_time() 2015-11-02 20:03:05 +01:00
frv kmap_atomic_to_page() has no users, remove it 2015-11-09 15:11:24 -08:00
h8300 h8300 update for v4.4 2015-11-12 15:26:39 -08:00
hexagon Linux 4.3-rc4 2015-10-06 17:10:28 +02:00
ia64 Power management and ACPI updates for v4.4-rc1 2015-11-04 18:10:13 -08:00
m32r Linux 4.3-rc4 2015-10-06 17:10:28 +02:00
m68k block: change ->make_request_fn() and users to return a queue cookie 2015-11-07 10:40:46 -07:00
metag Metag architecture changes for v4.4 2015-11-10 16:24:25 -08:00
microblaze kmap_atomic_to_page() has no users, remove it 2015-11-09 15:11:24 -08:00
mips Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2015-11-15 09:10:53 -08:00
mn10300 Linux 4.3-rc4 2015-10-06 17:10:28 +02:00
nios2 nios2 update for v4.4-rc1 2015-11-09 16:36:10 -08:00
openrisc
parisc Merge branch 'akpm' (patches from Andrew) 2015-11-09 21:05:13 -08:00
powerpc Four changes: 2015-11-12 14:34:06 -08:00
s390 s390: A bunch of fixes and optimizations for interrupt and time 2015-11-05 16:26:26 -08:00
score Merge branch 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile 2015-10-04 16:31:13 +01:00
sh Merge branch 'akpm' (patches from Andrew) 2015-11-07 14:32:45 -08:00
sparc sparc/sparc64: allocate sys_membarrier system call number 2015-11-09 15:11:24 -08:00
tile kmap_atomic_to_page() has no users, remove it 2015-11-09 15:11:24 -08:00
um um: Switch clocksource to hrtimers 2015-11-06 22:54:49 +01:00
unicore32 pwm: Changes for v4.4-rc1 2015-11-11 09:16:10 -08:00
x86 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-15 09:36:24 -08:00
xtensa Merge branch 'for-4.4/io-poll' of git://git.kernel.dk/linux-block 2015-11-10 17:23:49 -08:00
.gitignore
Kconfig