linux/arch/powerpc/kvm
Paul Mackerras 67f8a8c115 KVM: PPC: Book3S HV: Fix bug causing host SLB to be restored incorrectly
Aneesh Kumar reported seeing host crashes when running recent kernels
on POWER8.  The symptom was an oops like this:

Unable to handle kernel paging request for data at address 0xf00000000786c620
Faulting instruction address: 0xc00000000030e1e4
Oops: Kernel access of bad area, sig: 11 [#1]
LE SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: powernv_op_panel
CPU: 24 PID: 6663 Comm: qemu-system-ppc Tainted: G        W 4.13.0-rc7-43932-gfc36c59 #2
task: c000000fdeadfe80 task.stack: c000000fdeb68000
NIP:  c00000000030e1e4 LR: c00000000030de6c CTR: c000000000103620
REGS: c000000fdeb6b450 TRAP: 0300   Tainted: G        W        (4.13.0-rc7-43932-gfc36c59)
MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 24044428  XER: 20000000
CFAR: c00000000030e134 DAR: f00000000786c620 DSISR: 40000000 SOFTE: 0
GPR00: 0000000000000000 c000000fdeb6b6d0 c0000000010bd000 000000000000e1b0
GPR04: c00000000115e168 c000001fffa6e4b0 c00000000115d000 c000001e1b180386
GPR08: f000000000000000 c000000f9a8913e0 f00000000786c600 00007fff587d0000
GPR12: c000000fdeb68000 c00000000fb0f000 0000000000000001 00007fff587cffff
GPR16: 0000000000000000 c000000000000000 00000000003fffff c000000fdebfe1f8
GPR20: 0000000000000004 c000000fdeb6b8a8 0000000000000001 0008000000000040
GPR24: 07000000000000c0 00007fff587cffff c000000fdec20bf8 00007fff587d0000
GPR28: c000000fdeca9ac0 00007fff587d0000 00007fff587c0000 00007fff587d0000
NIP [c00000000030e1e4] __get_user_pages_fast+0x434/0x1070
LR [c00000000030de6c] __get_user_pages_fast+0xbc/0x1070
Call Trace:
[c000000fdeb6b6d0] [c00000000139dab8] lock_classes+0x0/0x35fe50 (unreliable)
[c000000fdeb6b7e0] [c00000000030ef38] get_user_pages_fast+0xf8/0x120
[c000000fdeb6b830] [c000000000112318] kvmppc_book3s_hv_page_fault+0x308/0xf30
[c000000fdeb6b960] [c00000000010e10c] kvmppc_vcpu_run_hv+0xfdc/0x1f00
[c000000fdeb6bb20] [c0000000000e915c] kvmppc_vcpu_run+0x2c/0x40
[c000000fdeb6bb40] [c0000000000e5650] kvm_arch_vcpu_ioctl_run+0x110/0x300
[c000000fdeb6bbe0] [c0000000000d6468] kvm_vcpu_ioctl+0x528/0x900
[c000000fdeb6bd40] [c0000000003bc04c] do_vfs_ioctl+0xcc/0x950
[c000000fdeb6bde0] [c0000000003bc930] SyS_ioctl+0x60/0x100
[c000000fdeb6be30] [c00000000000b96c] system_call+0x58/0x6c
Instruction dump:
7ca81a14 2fa50000 41de0010 7cc8182a 68c60002 78c6ffe2 0b060000 3cc2000a
794a3664 390610d8 e9080000 7d485214 <e90a0020> 7d435378 790507e1 408202f0
---[ end trace fad4a342d0414aa2 ]---

It turns out that what has happened is that the SLB entry for the
vmmemap region hasn't been reloaded on exit from a guest, and it has
the wrong page size.  Then, when the host next accesses the vmemmap
region, it gets a page fault.

Commit a25bd72bad ("powerpc/mm/radix: Workaround prefetch issue with
KVM", 2017-07-24) modified the guest exit code so that it now only clears
out the SLB for hash guest.  The code tests the radix flag and puts the
result in a non-volatile CR field, CR2, and later branches based on CR2.

Unfortunately, the kvmppc_save_tm function, which gets called between
those two points, modifies all the user-visible registers in the case
where the guest was in transactional or suspended state, except for a
few which it restores (namely r1, r2, r9 and r13).  Thus the hash/radix indication in CR2 gets corrupted.

This fixes the problem by re-doing the comparison just before the
result is needed.  For good measure, this also adds comments next to
the call sites of kvmppc_save_tm and kvmppc_restore_tm pointing out
that non-volatile register state will be lost.

Cc: stable@vger.kernel.org # v4.13
Fixes: a25bd72bad ("powerpc/mm/radix: Workaround prefetch issue with KVM")
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-09-12 16:02:46 +10:00
..
book3s_32_mmu_host.c powerpc/mm: Move hash related mmu-*.h headers to book3s/ 2016-03-03 21:19:21 +11:00
book3s_32_mmu.c KVM: PPC: Book3S PR: Ratelimit copy data failure error messages 2017-02-17 14:03:35 +11:00
book3s_32_sr.S
book3s_64_mmu_host.c * ARM: HYP mode stub supports kexec/kdump on 32-bit; improved PMU 2017-05-08 12:37:56 -07:00
book3s_64_mmu_hv.c KVM: PPC: Book3S HV: Fix memory leak in kvm_vm_ioctl_get_htab_fd 2017-09-01 10:17:58 +10:00
book3s_64_mmu_radix.c powerpc/mm: Rename find_linux_pte_or_hugepte() 2017-08-17 23:13:46 +10:00
book3s_64_mmu.c KVM: PPC: Book3S PR: Preserve storage control bits 2017-04-20 11:38:14 +10:00
book3s_64_slb.S KVM: PPC: Book3S PR: Rework SLB switching code 2014-05-30 14:26:30 +02:00
book3s_64_vio_hv.c powerpc/mm: Rename find_linux_pte_or_hugepte() 2017-08-17 23:13:46 +10:00
book3s_64_vio.c KVM: PPC: Book3S HV: Protect updates to spapr_tce_tables list 2017-08-30 14:59:31 +10:00
book3s_emulate.c KVM: PPC: Book3S PR: Do not fail emulation with mtspr/mfspr for unknown SPRs 2017-04-20 11:39:32 +10:00
book3s_exports.c KVM: PPC: Make shared struct aka magic page guest endian 2014-05-30 14:26:21 +02:00
book3s_hv_builtin.c KVM: PPC: Book3S HV: Simplify dynamic micro-threading code 2017-07-01 18:59:01 +10:00
book3s_hv_hmi.c powerpc: move hmi.c to arch/powerpc/kvm/ 2016-09-09 16:18:07 +10:00
book3s_hv_interrupts.S KVM: PPC: Book3S HV: Close race with testing for signals on guest entry 2017-07-01 18:59:38 +10:00
book3s_hv_ras.c KVM: PPC: Book3S HV: Exit guest upon MCE when FWNMI capability is enabled 2017-06-22 11:24:57 +10:00
book3s_hv_rm_mmu.c Merge remote-tracking branch 'remotes/powerpc/topic/ppc-kvm' into kvm-ppc-next 2017-08-31 12:37:03 +10:00
book3s_hv_rm_xics.c Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD 2017-05-09 11:50:01 +02:00
book3s_hv_rm_xive.c KVM: PPC: Book3S HV: Don't access XIVE PIPR register using byte accesses 2017-09-12 16:02:07 +10:00
book3s_hv_rmhandlers.S KVM: PPC: Book3S HV: Fix bug causing host SLB to be restored incorrectly 2017-09-12 16:02:46 +10:00
book3s_hv.c KVM: PPC: Book3S HV: Hold kvm->lock around call to kvmppc_update_lpcr 2017-09-12 16:02:27 +10:00
book3s_interrupts.S powerpc: Define and use PPC64_ELF_ABI_v2/v1 2016-06-14 13:58:27 +10:00
book3s_mmu_hpte.c sched/headers: Prepare to use <linux/rcuupdate.h> instead of <linux/rculist.h> in <linux/sched.h> 2017-03-02 08:42:38 +01:00
book3s_paired_singles.c powerpc: Create disable_kernel_{fp,altivec,vsx,spe}() 2015-12-01 13:52:25 +11:00
book3s_pr_papr.c KVM: PPC: Book3S PR: Don't include SPAPR TCE code on non-pseries platforms 2017-05-12 15:32:30 +10:00
book3s_pr.c KVM: add kvm_{test,clear}_request to replace {test,clear}_bit 2017-04-27 14:12:22 +02:00
book3s_rmhandlers.S powerpc: Define and use PPC64_ELF_ABI_v2/v1 2016-06-14 13:58:27 +10:00
book3s_rtas.c KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller 2017-04-27 21:37:29 +10:00
book3s_segment.S KVM: PPC: Book3S: 64-bit CONFIG_RELOCATABLE support for interrupts 2017-01-31 19:07:39 +11:00
book3s_xics.c Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD 2017-05-09 11:50:01 +02:00
book3s_xics.h KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller 2017-04-27 21:37:29 +10:00
book3s_xive_template.c KVM: PPC: Book3S HV: Don't access XIVE PIPR register using byte accesses 2017-09-12 16:02:07 +10:00
book3s_xive.c KVM: PPC: Book3S HV: Don't access XIVE PIPR register using byte accesses 2017-09-12 16:02:07 +10:00
book3s_xive.h KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller 2017-04-27 21:37:29 +10:00
book3s.c * ARM: HYP mode stub supports kexec/kdump on 32-bit; improved PMU 2017-05-08 12:37:56 -07:00
book3s.h kvm: Fix page ageing bugs 2014-09-24 14:07:58 +02:00
booke_emulate.c KVM: PPC: BOOKE: Emulate debug registers and exception 2014-09-22 10:11:33 +02:00
booke_interrupts.S KVM: PPC: Remove 440 support 2014-07-28 15:23:15 +02:00
booke.c KVM: add kvm_request_pending 2017-06-04 16:53:00 +02:00
booke.h KVM: PPC: Book3e: Add AltiVec support 2014-09-22 10:11:32 +02:00
bookehv_interrupts.S powerpc/kvm: common sw breakpoint instr across ppc 2014-09-22 10:11:36 +02:00
e500_emulate.c KVM: PPC: e500: Emulate TMCFG0 TMRN register 2015-10-15 15:58:16 +11:00
e500_mmu_host.c powerpc/mm: Rename find_linux_pte_or_hugepte() 2017-08-17 23:13:46 +10:00
e500_mmu_host.h KVM: PPC: E500: Make clear_tlb_refs and clear_tlb1_bitmap static 2013-01-24 19:23:33 +01:00
e500_mmu.c KVM: PPC: e500: Rename jump labels in kvmppc_e500_tlb_init() 2016-09-13 14:32:47 +10:00
e500.c KVM: PPC: e500: Fix some NULL dereferences on error 2017-08-31 12:36:44 +10:00
e500.h kvm: rename pfn_t to kvm_pfn_t 2016-01-15 17:56:32 -08:00
e500mc.c KVM: PPC: e500mc: Fix a NULL dereference 2017-08-31 12:36:44 +10:00
emulate_loadstore.c KVM: PPC: Add MMIO emulation for remaining floating-point instructions 2017-04-20 11:37:44 +10:00
emulate.c KVM: PPC: Book3S HV: Enable guests to use large decrementer mode on POWER9 2017-06-19 14:02:04 +10:00
fpu.S
irq.h KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller 2017-04-27 21:37:29 +10:00
Kconfig KVM: PPC: Book3S PR: Don't include SPAPR TCE code on non-pseries platforms 2017-05-12 15:32:30 +10:00
Makefile KVM: PPC: Book3S PR: Don't include SPAPR TCE code on non-pseries platforms 2017-05-12 15:32:30 +10:00
mpic.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
powerpc.c KVM: add spinlock optimization framework 2017-08-08 10:57:43 +02:00
timing.c KVM: PPC: Remove DCR handling 2014-07-28 19:29:15 +02:00
timing.h KVM: PPC: Remove DCR handling 2014-07-28 19:29:15 +02:00
trace_book3s.h KVM: PPC: Book3S HV: Tracepoints for KVM HV guest interactions 2014-12-17 13:29:27 +01:00
trace_booke.h KVM: PPC: BookE: Improve irq inject tracepoint 2014-12-15 13:27:23 +01:00
trace_hv.h KVM: PPC: Book3S HV: Comment style and print format fixups 2016-11-28 11:48:47 +11:00
trace_pr.h kvm: rename pfn_t to kvm_pfn_t 2016-01-15 17:56:32 -08:00
trace.h kvm: powerpc: booke: Move booke related tracepoints to separate header 2013-10-17 15:37:16 +02:00