linux/arch/powerpc/kvm
Paul Mackerras ef42719814 KVM: PPC: Book3S HV: Fix host crash on changing HPT size
Commit f98a8bf9ee ("KVM: PPC: Book3S HV: Allow KVM_PPC_ALLOCATE_HTAB
ioctl() to change HPT size", 2016-12-20) changed the behaviour of
the KVM_PPC_ALLOCATE_HTAB ioctl so that it now allocates a new HPT
and new revmap array if there was a previously-allocated HPT of a
different size from the size being requested.  In this case, we need
to reset the rmap arrays of the memslots, because the rmap arrays
will contain references to HPTEs which are no longer valid.  Worse,
these references are also references to slots in the new revmap
array (which parallels the HPT), and the new revmap array contains
random contents, since it doesn't get zeroed on allocation.

The effect of having these stale references to slots in the revmap
array that contain random contents is that subsequent calls to
functions such as kvmppc_add_revmap_chain will crash because they
will interpret the non-zero contents of the revmap array as HPTE
indexes and thus index outside of the revmap array.  This leads to
host crashes such as the following.

[ 7072.862122] Unable to handle kernel paging request for data at address 0xd000000c250c00f8
[ 7072.862218] Faulting instruction address: 0xc0000000000e1c78
[ 7072.862233] Oops: Kernel access of bad area, sig: 11 [#1]
[ 7072.862286] SMP NR_CPUS=1024
[ 7072.862286] NUMA
[ 7072.862325] PowerNV
[ 7072.862378] Modules linked in: kvm_hv vhost_net vhost tap xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm iw_cxgb3 mlx5_ib ib_core ses enclosure scsi_transport_sas ipmi_powernv ipmi_devintf ipmi_msghandler powernv_op_panel i2c_opal nfsd auth_rpcgss oid_registry
[ 7072.863085]  nfs_acl lockd grace sunrpc kvm_pr kvm xfs libcrc32c scsi_dh_alua dm_service_time radeon lpfc nvme_fc nvme_fabrics nvme_core scsi_transport_fc i2c_algo_bit tg3 drm_kms_helper ptp pps_core syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm dm_multipath i2c_core cxgb3 mlx5_core mdio [last unloaded: kvm_hv]
[ 7072.863381] CPU: 72 PID: 56929 Comm: qemu-system-ppc Not tainted 4.12.0-kvm+ #59
[ 7072.863457] task: c000000fe29e7600 task.stack: c000001e3ffec000
[ 7072.863520] NIP: c0000000000e1c78 LR: c0000000000e2e3c CTR: c0000000000e25f0
[ 7072.863596] REGS: c000001e3ffef560 TRAP: 0300   Not tainted  (4.12.0-kvm+)
[ 7072.863658] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE,TM[E]>
[ 7072.863667]   CR: 44082882  XER: 20000000
[ 7072.863767] CFAR: c0000000000e2e38 DAR: d000000c250c00f8 DSISR: 42000000 SOFTE: 1
GPR00: c0000000000e2e3c c000001e3ffef7e0 c000000001407d00 d000000c250c00f0
GPR04: d00000006509fb70 d00000000b3d2048 0000000003ffdfb7 0000000000000000
GPR08: 00000001007fdfb7 00000000c000000f d0000000250c0000 000000000070f7bf
GPR12: 0000000000000008 c00000000fdad000 0000000010879478 00000000105a0d78
GPR16: 00007ffaf4080000 0000000000001190 0000000000000000 0000000000010000
GPR20: 4001ffffff000415 d00000006509fb70 0000000004091190 0000000ee1881190
GPR24: 0000000003ffdfb7 0000000003ffdfb7 00000000007fdfb7 c000000f5c958000
GPR28: d00000002d09fb70 0000000003ffdfb7 d00000006509fb70 d00000000b3d2048
[ 7072.864439] NIP [c0000000000e1c78] kvmppc_add_revmap_chain+0x88/0x130
[ 7072.864503] LR [c0000000000e2e3c] kvmppc_do_h_enter+0x84c/0x9e0
[ 7072.864566] Call Trace:
[ 7072.864594] [c000001e3ffef7e0] [c000001e3ffef830] 0xc000001e3ffef830 (unreliable)
[ 7072.864671] [c000001e3ffef830] [c0000000000e2e3c] kvmppc_do_h_enter+0x84c/0x9e0
[ 7072.864751] [c000001e3ffef920] [d00000000b38d878] kvmppc_map_vrma+0x168/0x200 [kvm_hv]
[ 7072.864831] [c000001e3ffef9e0] [d00000000b38a684] kvmppc_vcpu_run_hv+0x1284/0x1300 [kvm_hv]
[ 7072.864914] [c000001e3ffefb30] [d00000000f465664] kvmppc_vcpu_run+0x44/0x60 [kvm]
[ 7072.865008] [c000001e3ffefb60] [d00000000f461864] kvm_arch_vcpu_ioctl_run+0x114/0x290 [kvm]
[ 7072.865152] [c000001e3ffefbe0] [d00000000f453c98] kvm_vcpu_ioctl+0x598/0x7a0 [kvm]
[ 7072.865292] [c000001e3ffefd40] [c000000000389328] do_vfs_ioctl+0xd8/0x8c0
[ 7072.865410] [c000001e3ffefde0] [c000000000389be4] SyS_ioctl+0xd4/0x130
[ 7072.865526] [c000001e3ffefe30] [c00000000000b760] system_call+0x58/0x6c
[ 7072.865644] Instruction dump:
[ 7072.865715] e95b2110 793a0020 7b4926e4 7f8a4a14 409e0098 807c000c 786326e4 7c6a1a14
[ 7072.865857] 935e0008 7bbd0020 813c000c 913e000c <93a30008> 93bc000c 48000038 60000000
[ 7072.866001] ---[ end trace 627b6e4bf8080edc ]---

Note that to trigger this, it is necessary to use a recent upstream
QEMU (or other userspace that resizes the HPT at CAS time), specify
a maximum memory size substantially larger than the current memory
size, and boot a guest kernel that does not support HPT resizing.

This fixes the problem by resetting the rmap arrays when the old HPT
is freed.

Fixes: f98a8bf9ee ("KVM: PPC: Book3S HV: Allow KVM_PPC_ALLOCATE_HTAB ioctl() to change HPT size")
Cc: stable@vger.kernel.org # v4.11+
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2017-07-24 16:10:44 +10:00
..
book3s_32_mmu_host.c powerpc/mm: Move hash related mmu-*.h headers to book3s/ 2016-03-03 21:19:21 +11:00
book3s_32_mmu.c KVM: PPC: Book3S PR: Ratelimit copy data failure error messages 2017-02-17 14:03:35 +11:00
book3s_32_sr.S
book3s_64_mmu_host.c * ARM: HYP mode stub supports kexec/kdump on 32-bit; improved PMU 2017-05-08 12:37:56 -07:00
book3s_64_mmu_hv.c KVM: PPC: Book3S HV: Fix host crash on changing HPT size 2017-07-24 16:10:44 +10:00
book3s_64_mmu_radix.c KVM: PPC: Book3S HV: Fix software walk of guest process page tables 2017-03-01 11:53:45 +11:00
book3s_64_mmu.c KVM: PPC: Book3S PR: Preserve storage control bits 2017-04-20 11:38:14 +10:00
book3s_64_slb.S
book3s_64_vio_hv.c KVM: PPC: Book3S HV: Add radix checks in real-mode hypercall handlers 2017-05-12 15:08:09 +10:00
book3s_64_vio.c KVM: PPC: VFIO: Add in-kernel acceleration for VFIO 2017-04-20 11:39:26 +10:00
book3s_emulate.c KVM: PPC: Book3S PR: Do not fail emulation with mtspr/mfspr for unknown SPRs 2017-04-20 11:39:32 +10:00
book3s_exports.c
book3s_hv_builtin.c KVM: PPC: Book3S HV: Simplify dynamic micro-threading code 2017-07-01 18:59:01 +10:00
book3s_hv_hmi.c powerpc: move hmi.c to arch/powerpc/kvm/ 2016-09-09 16:18:07 +10:00
book3s_hv_interrupts.S KVM: PPC: Book3S HV: Close race with testing for signals on guest entry 2017-07-01 18:59:38 +10:00
book3s_hv_ras.c KVM: PPC: Book3S HV: Exit guest upon MCE when FWNMI capability is enabled 2017-06-22 11:24:57 +10:00
book3s_hv_rm_mmu.c powerpc/mm: Trace tlbie(l) instructions 2017-06-23 21:14:49 +10:00
book3s_hv_rm_xics.c Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD 2017-05-09 11:50:01 +02:00
book3s_hv_rm_xive.c KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller 2017-04-27 21:37:29 +10:00
book3s_hv_rmhandlers.S powerpc updates for 4.13 2017-07-07 13:55:45 -07:00
book3s_hv.c KVM: PPC: Book3S HV: Enable TM before accessing TM registers 2017-07-24 16:10:44 +10:00
book3s_interrupts.S powerpc: Define and use PPC64_ELF_ABI_v2/v1 2016-06-14 13:58:27 +10:00
book3s_mmu_hpte.c sched/headers: Prepare to use <linux/rcuupdate.h> instead of <linux/rculist.h> in <linux/sched.h> 2017-03-02 08:42:38 +01:00
book3s_paired_singles.c powerpc: Create disable_kernel_{fp,altivec,vsx,spe}() 2015-12-01 13:52:25 +11:00
book3s_pr_papr.c KVM: PPC: Book3S PR: Don't include SPAPR TCE code on non-pseries platforms 2017-05-12 15:32:30 +10:00
book3s_pr.c KVM: add kvm_{test,clear}_request to replace {test,clear}_bit 2017-04-27 14:12:22 +02:00
book3s_rmhandlers.S powerpc: Define and use PPC64_ELF_ABI_v2/v1 2016-06-14 13:58:27 +10:00
book3s_rtas.c KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller 2017-04-27 21:37:29 +10:00
book3s_segment.S KVM: PPC: Book3S: 64-bit CONFIG_RELOCATABLE support for interrupts 2017-01-31 19:07:39 +11:00
book3s_xics.c Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD 2017-05-09 11:50:01 +02:00
book3s_xics.h KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller 2017-04-27 21:37:29 +10:00
book3s_xive_template.c powerpc/xive: Fix offset for store EOI MMIOs 2017-06-15 23:29:39 +10:00
book3s_xive.c KVM: PPC: Book3S: Fix typo in XICS-on-XIVE state saving code 2017-07-03 10:40:22 +02:00
book3s_xive.h KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller 2017-04-27 21:37:29 +10:00
book3s.c * ARM: HYP mode stub supports kexec/kdump on 32-bit; improved PMU 2017-05-08 12:37:56 -07:00
book3s.h
booke_emulate.c
booke_interrupts.S
booke.c KVM: add kvm_request_pending 2017-06-04 16:53:00 +02:00
booke.h
bookehv_interrupts.S
e500_emulate.c KVM: PPC: e500: Emulate TMCFG0 TMRN register 2015-10-15 15:58:16 +11:00
e500_mmu_host.c KVM: PPC: e500: Use kcalloc() in e500_mmu_host_init() 2017-04-20 11:37:55 +10:00
e500_mmu_host.h
e500_mmu.c KVM: PPC: e500: Rename jump labels in kvmppc_e500_tlb_init() 2016-09-13 14:32:47 +10:00
e500.c KVM: PPC: e500: fix handling local_sid_lookup result 2015-10-15 15:58:16 +11:00
e500.h kvm: rename pfn_t to kvm_pfn_t 2016-01-15 17:56:32 -08:00
e500mc.c powerpc: Fix misspellings in comments. 2016-03-01 19:27:20 +11:00
emulate_loadstore.c KVM: PPC: Add MMIO emulation for remaining floating-point instructions 2017-04-20 11:37:44 +10:00
emulate.c KVM: PPC: Book3S HV: Enable guests to use large decrementer mode on POWER9 2017-06-19 14:02:04 +10:00
fpu.S
irq.h KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller 2017-04-27 21:37:29 +10:00
Kconfig KVM: PPC: Book3S PR: Don't include SPAPR TCE code on non-pseries platforms 2017-05-12 15:32:30 +10:00
Makefile KVM: PPC: Book3S PR: Don't include SPAPR TCE code on non-pseries platforms 2017-05-12 15:32:30 +10:00
mpic.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
powerpc.c Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD 2017-07-03 10:41:59 +02:00
timing.c
timing.h
trace_book3s.h
trace_booke.h
trace_hv.h KVM: PPC: Book3S HV: Comment style and print format fixups 2016-11-28 11:48:47 +11:00
trace_pr.h kvm: rename pfn_t to kvm_pfn_t 2016-01-15 17:56:32 -08:00
trace.h