linux/arch/x86/mm
Stefano Stabellini 279b706bf8 x86,xen: introduce x86_init.mapping.pagetable_reserve
Introduce a new x86_init hook called pagetable_reserve that at the end
of init_memory_mapping is used to reserve a range of memory addresses for
the kernel pagetable pages we used and free the other ones.

On native it just calls memblock_x86_reserve_range while on xen it also
takes care of setting the spare memory previously allocated
for kernel pagetable pages from RO to RW, so that it can be used for
other purposes.

A detailed explanation of the reason why this hook is needed follows.

As a consequence of the commit:

commit 4b239f458c
Author: Yinghai Lu <yinghai@kernel.org>
Date:   Fri Dec 17 16:58:28 2010 -0800

    x86-64, mm: Put early page table high

at some point init_memory_mapping is going to reach the pagetable pages
area and map those pages too (mapping them as normal memory that falls
in the range of addresses passed to init_memory_mapping as argument).
Some of those pages are already pagetable pages (they are in the range
pgt_buf_start-pgt_buf_end) therefore they are going to be mapped RO and
everything is fine.
Some of these pages are not pagetable pages yet (they fall in the range
pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end) so they
are going to be mapped RW.  When these pages become pagetable pages and
are hooked into the pagetable, xen will find that the guest has already
a RW mapping of them somewhere and fail the operation.
The reason Xen requires pagetables to be RO is that the hypervisor needs
to verify that the pagetables are valid before using them. The validation
operations are called "pinning" (more details in arch/x86/xen/mmu.c).

In order to fix the issue we mark all the pages in the entire range
pgt_buf_start-pgt_buf_top as RO, however when the pagetable allocation
is completed only the range pgt_buf_start-pgt_buf_end is reserved by
init_memory_mapping. Hence the kernel is going to crash as soon as one
of the pages in the range pgt_buf_end-pgt_buf_top is reused (b/c those
ranges are RO).

For this reason we need a hook to reserve the kernel pagetable pages we
used and free the other ones so that they can be reused for other
purposes.
On native it just means calling memblock_x86_reserve_range, on Xen it
also means marking RW the pagetable pages that we allocated before but
that haven't been used before.

Another way to fix this is without using the hook is by adding a 'if
(xen_pv_domain)' in the 'init_memory_mapping' code and calling the Xen
counterpart, but that is just nasty.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-05-12 13:05:04 -04:00
..
kmemcheck x86: Eliminate bp argument from the stack tracing routines 2010-11-18 14:37:34 +01:00
amdtopology_64.c x86-64, NUMA: Unify emulated distance mapping 2011-02-16 17:11:10 +01:00
dump_pagetables.c x86, mm: Create symbolic index into address_markers array 2010-07-20 16:56:19 -07:00
extable.c x86, 64-bit: Move K8 B step iret fixup to fault entry asm 2009-10-12 18:29:46 +02:00
fault.c x86/mm: Fix pgd_lock deadlock 2011-03-10 09:41:57 +01:00
gup.c thp: mmu_notifier_test_young 2011-01-13 17:32:46 -08:00
highmem_32.c mm: fix race in kunmap_atomic() 2010-10-27 18:03:05 -07:00
hugetlbpage.c x86: Fix common misspellings 2011-03-18 10:39:30 +01:00
init_32.c x86: Fix common misspellings 2011-03-18 10:39:30 +01:00
init_64.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2011-03-23 20:51:42 -07:00
init.c x86,xen: introduce x86_init.mapping.pagetable_reserve 2011-05-12 13:05:04 -04:00
iomap_32.c mm: fix race in kunmap_atomic() 2010-10-27 18:03:05 -07:00
ioremap.c xen: Cope with unmapped pages when initializing kernel pagetable 2010-10-13 16:07:13 -07:00
kmmio.c x86, kmmio/mmiotrace: Fix double free of kmmio_fault_pages 2010-06-18 11:30:09 +02:00
Makefile x86-64, NUMA: Move NUMA emulation into numa_emulation.c 2011-02-22 11:10:08 +01:00
memblock.c x86, memblock: Remove __memblock_x86_find_in_range_size() 2010-10-05 21:45:43 -07:00
memtest.c x86, memblock: Replace e820_/_early string with memblock_ 2010-08-27 11:13:47 -07:00
mmap.c x86: Use helpers for rlimits 2010-01-27 15:17:31 -08:00
mmio-mod.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
numa_32.c x86, NUMA: Move *_numa_init() invocations into initmem_init() 2011-02-16 12:13:06 +01:00
numa_64.c x86, NUMA: Fix empty memblk detection in numa_cleanup_meminfo() 2011-05-01 19:15:11 +02:00
numa_emulation.c x86, numa: Fix cpu nodemasks for NUMA emulation and CONFIG_DEBUG_PER_CPU_MAPS 2011-04-21 11:31:00 +02:00
numa_internal.h x86-64, NUMA: Move NUMA emulation into numa_emulation.c 2011-02-22 11:10:08 +01:00
numa.c x86, numa: Fix cpu nodemasks for NUMA emulation and CONFIG_DEBUG_PER_CPU_MAPS 2011-04-21 11:31:00 +02:00
pageattr-test.c x86: make sure the CPA test code's use of _PAGE_UNUSED1 is obvious 2008-09-05 17:09:57 +02:00
pageattr.c x86: Fix common misspellings 2011-03-18 10:39:30 +01:00
pat_internal.h x86, pat: Fix memory leak in free_memtype 2010-05-26 11:26:04 -07:00
pat_rbtree.c rbtree: Undo augmented trees performance damage and regression 2010-07-05 14:43:50 +02:00
pat.c Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2010-08-06 10:17:52 -07:00
pf_in.c x86,mmiotrace: Add support for tracing STOS instruction 2010-08-02 01:32:01 +02:00
pf_in.h x86 mmiotrace: move files into arch/x86/mm/. 2008-05-24 11:25:37 +02:00
pgtable_32.c x86: remove last traces of quicklist usage 2010-05-24 13:33:31 -07:00
pgtable.c x86: Flush TLB if PGD entry is changed in i386 PAE mode 2011-03-18 11:44:01 +01:00
physaddr.c x86: split __phys_addr out into separate file 2009-09-10 11:48:55 -07:00
physaddr.h x86: split __phys_addr out into separate file 2009-09-10 11:48:55 -07:00
setup_nx.c x86, cpu: Only CPU features determine NX capabilities 2010-11-10 15:43:15 -08:00
srat_32.c x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change 2011-04-04 16:56:33 +02:00
srat_64.c x86-64, NUMA: Unify emulated distance mapping 2011-02-16 17:11:10 +01:00
testmmiotrace.c x86, kmmio/mmiotrace: Fix double free of kmmio_fault_pages 2010-06-18 11:30:09 +02:00
tlb.c x86, tlb, UV: Do small micro-optimization for native_flush_tlb_others() 2011-03-15 08:30:34 +01:00