linux/arch
bob picco 4ccb927289 sparc64: sun4v TLB error power off events
We've witnessed a few TLB events causing the machine to power off because
of prom_halt. In one case it was some nfs related area during rmmod. Another
was an mmapper of /dev/mem. A more recent one is an ITLB issue with
a bad pagesize which could be a hardware bug. Bugs happen but we should
attempt to not power off the machine and/or hang it when possible.

This is a DTLB error from an mmapper of /dev/mem:
[root@sparcie ~]# SUN4V-DTLB: Error at TPC[fffff80100903e6c], tl 1
SUN4V-DTLB: TPC<0xfffff80100903e6c>
SUN4V-DTLB: O7[fffff801081979d0]
SUN4V-DTLB: O7<0xfffff801081979d0>
SUN4V-DTLB: vaddr[fffff80100000000] ctx[1250] pte[98000000000f0610] error[2]
.

This is recent mainline for ITLB:
[ 3708.179864] SUN4V-ITLB: TPC<0xfffffc010071cefc>
[ 3708.188866] SUN4V-ITLB: O7[fffffc010071cee8]
[ 3708.197377] SUN4V-ITLB: O7<0xfffffc010071cee8>
[ 3708.206539] SUN4V-ITLB: vaddr[e0003] ctx[1a3c] pte[2900000dcc800eeb] error[4]
.

Normally sun4v_itlb_error_report() and sun4v_dtlb_error_report() would call
prom_halt() and drop us to OF command prompt "ok". This isn't the case for
LDOMs and the machine powers off.

For the HV reported error of HV_ENORADDR for HV HV_MMU_MAP_ADDR_TRAP we cause
a SIGBUS error by qualifying it within do_sparc64_fault() for fault code mask
of FAULT_CODE_BAD_RA. This is done when trap level (%tl) is less or equal
one("1"). Otherwise, for %tl > 1,  we proceed eventually to die_if_kernel().

The logic of this patch was partially inspired by David Miller's feedback.

Power off of large sparc64 machines is painful. Plus die_if_kernel provides
more context. A reset sequence isn't a brief period on large sparc64 but
better than power-off/power-on sequence.

Cc: sparclinux@vger.kernel.org
Signed-off-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-16 17:46:44 -07:00
..
alpha alpha: io: implement relaxed accessor macros for writes 2014-08-29 11:18:45 -07:00
arc ARC: [mm] Fix compilation breakage 2014-09-03 10:08:50 -07:00
arm A smattering of bug fixes across most architectures. 2014-09-06 16:42:12 -07:00
arm64 A smattering of bug fixes across most architectures. 2014-09-06 16:42:12 -07:00
avr32 Merge branch 'signal-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc 2014-08-09 09:58:12 -07:00
blackfin Merge branch 'signal-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc 2014-08-09 09:58:12 -07:00
c6x Merge branch 'signal-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc 2014-08-09 09:58:12 -07:00
cris Merge branch 'signal-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc 2014-08-09 09:58:12 -07:00
frv frv: Define cpu_relax_lowlatency() 2014-08-19 09:40:08 -05:00
hexagon flush_icache_range: export symbol to fix build errors 2014-08-29 16:28:17 -07:00
ia64 kexec: remove CONFIG_KEXEC dependency on crypto 2014-08-29 16:28:16 -07:00
m32r Merge branch 'signal-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc 2014-08-09 09:58:12 -07:00
m68k m68k: Wire up memfd_create 2014-09-01 10:28:00 +02:00
metag Metag architecture changes for v3.17 2014-08-13 18:18:09 -06:00
microblaze microblaze: Fix number of syscalls 2014-09-09 13:14:47 +02:00
mips kexec: remove CONFIG_KEXEC dependency on crypto 2014-08-29 16:28:16 -07:00
mn10300 Merge branch 'signal-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc 2014-08-09 09:58:12 -07:00
openrisc Merge branch 'signal-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc 2014-08-09 09:58:12 -07:00
parisc Merge branch 'signal-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc 2014-08-09 09:58:12 -07:00
powerpc A smattering of bug fixes across most architectures. 2014-09-06 16:42:12 -07:00
s390 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux 2014-09-08 08:27:00 -07:00
score Merge branch 'signal-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc 2014-08-09 09:58:12 -07:00
sh flush_icache_range: export symbol to fix build errors 2014-08-29 16:28:17 -07:00
sparc sparc64: sun4v TLB error power off events 2014-09-16 17:46:44 -07:00
tile flush_icache_range: export symbol to fix build errors 2014-08-29 16:28:17 -07:00
um Merge branch 'signal-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc 2014-08-09 09:58:12 -07:00
unicore32 unicore32: Fix build error 2014-08-31 17:08:12 -07:00
x86 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-08-29 17:22:27 -07:00
xtensa Xtensa improvements for 3.17: 2014-08-31 17:08:42 -07:00
.gitignore
Kconfig seccomp: add "seccomp" syscall 2014-07-18 12:13:37 -07:00