linux/arch/powerpc/mm
Benjamin Herrenschmidt a25bd72bad powerpc/mm/radix: Workaround prefetch issue with KVM
There's a somewhat architectural issue with Radix MMU and KVM.

When coming out of a guest with AIL (Alternate Interrupt Location, ie,
MMU enabled), we start executing hypervisor code with the PID register
still containing whatever the guest has been using.

The problem is that the CPU can (and will) then start prefetching or
speculatively load from whatever host context has that same PID (if
any), thus bringing translations for that context into the TLB, which
Linux doesn't know about.

This can cause stale translations and subsequent crashes.

Fixing this in a way that is neither racy nor a huge performance
impact is difficult. We could just make the host invalidations always
use broadcast forms but that would hurt single threaded programs for
example.

We chose to fix it instead by partitioning the PID space between guest
and host. This is possible because today Linux only use 19 out of the
20 bits of PID space, so existing guests will work if we make the host
use the top half of the 20 bits space.

We additionally add support for a property to indicate to Linux the
size of the PID register which will be useful if we eventually have
processors with a larger PID space available.

There is still an issue with malicious guests purposefully setting the
PID register to a value in the hosts PID range. Hopefully future HW
can prevent that, but in the meantime, we handle it with a pair of
kludges:

 - On the way out of a guest, before we clear the current VCPU in the
   PACA, we check the PID and if it's outside of the permitted range
   we flush the TLB for that PID.

 - When context switching, if the mm is "new" on that CPU (the
   corresponding bit was set for the first time in the mm cpumask), we
   check if any sibling thread is in KVM (has a non-NULL VCPU pointer
   in the PACA). If that is the case, we also flush the PID for that
   CPU (core).

This second part is needed to handle the case where a process is
migrated (or starts a new pthread) on a sibling thread of the CPU
coming out of KVM, as there's a window where stale translations can
exist before we detect it and flush them out.

A future optimization could be added by keeping track of whether the
PID has ever been used and avoid doing that for completely fresh PIDs.
We could similarily mark PIDs that have been the subject of a global
invalidation as "fresh". But for now this will do.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Rework the asm to build with CONFIG_PPC_RADIX_MMU=n, drop
      unneeded include of kvm_book3s_asm.h]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-26 16:41:52 +10:00
..
8xx_mmu.c powerpc/mm: Rename map_page() to map_kernel_page() on 32-bit 2017-06-05 19:59:03 +10:00
40x_mmu.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
44x_mmu.c
copro_fault.c powerpc/mm: Update PROTFAULT handling in the page fault path 2017-02-15 20:02:39 +11:00
dma-noncoherent.c powerpc/mm: Rename map_page() to map_kernel_page() on 32-bit 2017-06-05 19:59:03 +10:00
dump_hashpagetable.c powerpc/mm/ptdump: Dump the first entry of the linear mapping as well 2017-06-02 19:09:52 +10:00
dump_linuxpagetables.c powerpc/mm: Fix crash in page table dump with huge pages 2017-05-17 11:56:33 +10:00
fault.c powerpc/mm: The 8xx doesn't call do_page_fault() for breakpoints 2017-06-02 19:20:12 +10:00
fsl_booke_mmu.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
hash64_4k.c powerpc/mm: Fix lazy icache flush on pre-POWER5 2016-11-29 23:59:40 +11:00
hash64_64k.c powerpc/mm: Fix lazy icache flush on pre-POWER5 2016-11-29 23:59:40 +11:00
hash_low_32.S powerpc/32: Remove Mac-on-Linux/rtlinux hooks 2017-03-21 22:09:26 +11:00
hash_native_64.c powerpc/mm: Wire up hpte_removebolted for powernv 2017-07-02 20:40:28 +10:00
hash_utils_64.c powerpc/mm: Trace tlbie(l) instructions 2017-06-23 21:14:49 +10:00
highmem.c
hugepage-hash64.c powerpc/mm: Move hash table ops to a separate structure 2016-07-21 18:59:09 +10:00
hugetlbpage-book3e.c powerpc/mm/nohash: MM_SLICE is only used by book3s 64 2017-03-31 23:09:47 +11:00
hugetlbpage-hash64.c powerpc/mm: Remove the debug hugepd_ok check 2017-01-23 19:19:28 +11:00
hugetlbpage-radix.c mm: larger stack guard gap, between vmas 2017-06-19 21:50:20 +08:00
hugetlbpage.c powerpc updates for 4.13 2017-07-07 13:55:45 -07:00
icswx_pid.c
icswx.c scripts/spelling.txt: add regsiter -> register spelling mistake 2017-05-08 17:15:13 -07:00
icswx.h
init_32.c powerpc/32: Add missing \n and switch to pr_warn() 2016-09-13 17:37:11 +10:00
init_64.c powerpc/vmemmap: Add altmap support 2017-07-02 20:40:27 +10:00
init-common.c Merge branch 'topic/ppc-kvm' into next 2017-02-14 17:18:29 +11:00
Makefile Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next 2016-12-16 15:05:38 +11:00
mem.c powerpc/mm: Mark __init memory no-execute when STRICT_KERNEL_RWX=y 2017-07-18 19:54:24 +10:00
mmap.c mm: larger stack guard gap, between vmas 2017-06-19 21:50:20 +08:00
mmu_context_book3s64.c powerpc/mm/radix: Workaround prefetch issue with KVM 2017-07-26 16:41:52 +10:00
mmu_context_hash32.c
mmu_context_iommu.c Merge branch 'topic/ppc-kvm' into next 2017-04-28 20:19:37 +10:00
mmu_context_nohash.c powerpc/mm/nohash: MM_SLICE is only used by book3s 64 2017-03-31 23:09:47 +11:00
mmu_decl.h powerpc/mm: Rename map_page() to map_kernel_page() on 32-bit 2017-06-05 19:59:03 +10:00
numa.c powerpc: Only obtain cpu_hotplug_lock if called by rtasd 2017-06-23 09:32:11 +02:00
pgtable_32.c powerpc/mm: Rename map_page() to map_kernel_page() on 32-bit 2017-06-05 19:59:03 +10:00
pgtable_64.c powerpc/mm: Mark __init memory no-execute when STRICT_KERNEL_RWX=y 2017-07-18 19:54:24 +10:00
pgtable-book3e.c powerpc/mm: Make page table size a variable 2016-05-01 18:32:48 +10:00
pgtable-book3s64.c powerpc/mm: Add devmap support for ppc64 2017-07-02 20:40:28 +10:00
pgtable-hash64.c powerpc/mm: Mark __init memory no-execute when STRICT_KERNEL_RWX=y 2017-07-18 19:54:24 +10:00
pgtable-radix.c powerpc/mm/radix: Workaround prefetch issue with KVM 2017-07-26 16:41:52 +10:00
pgtable.c powerpc/mm: Fix typo in set_pte_at() 2017-02-17 22:16:25 +11:00
ppc_mmu_32.c powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam() together 2016-03-11 17:18:02 -06:00
slb_low.S powerpc/64s: Rename slb_allocate_realmode() to slb_allocate() 2017-06-21 16:18:33 +10:00
slb.c powerpc/64s: Rename slb_allocate_realmode() to slb_allocate() 2017-06-21 16:18:33 +10:00
slice.c mm: larger stack guard gap, between vmas 2017-06-19 21:50:20 +08:00
subpage-prot.c powerpc/mm/radix: Use mm->task_size for boundary checking instead of addr_limit 2017-04-19 20:00:18 +10:00
tlb_hash32.c powerpc/mm: remove flush_tlb_page_nohash 2016-08-01 11:15:13 +10:00
tlb_hash64.c powerpc/mm/hash: Do a local flush if possible when no batch is active 2017-06-05 19:02:55 +10:00
tlb_low_64e.S powerpc: Fix misspellings in comments. 2016-03-01 19:27:20 +11:00
tlb_nohash_low.S powerpc: Fix misspellings in comments. 2016-03-01 19:27:20 +11:00
tlb_nohash.c powerpc/nohash: Fix use of mmu_has_feature() in setup_initial_memory_limit() 2017-04-11 07:46:04 +10:00
tlb-radix.c powerpc/mm/radix: Workaround prefetch issue with KVM 2017-07-26 16:41:52 +10:00
vphn.c
vphn.h