linux/arch/x86/mm
David Rientjes adc1938994 x86: Interleave emulated nodes over physical nodes
Add interleaved NUMA emulation support

This patch interleaves emulated nodes over the system's physical
nodes. This is required for interleave optimizations since
mempolicies, for example, operate by iterating over a nodemask and
act without knowledge of node distances.  It can also be used for
testing memory latencies and NUMA bugs in the kernel.

There are two ways to do this:

 - divide the number of emulated nodes by the number of physical
   nodes and allocate the result on each physical node, or

 - allocate each successive emulated node on a different physical
   node until all memory is exhausted.

The disadvantage of the first option is that, depending on how
asymmetric the physical nodes' capacities are, emulated nodes on
one physical node may differ substantially in size from those on
another.

The disadvantage of the second option is that, again depending on
the asymmetry in physical node capacities, more emulated nodes may
be allocated on one physical node than on another.

This patch implements the second option; we accept that one
physical node may host slightly more emulated nodes than another
in exchange for avoiding node size asymmetry.
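
The second option can be illustrated with a minimal userspace sketch.
This is not the kernel code from the patch: struct phys_node, its
free_mb field, and interleave_emulated_nodes() are hypothetical names
chosen for illustration, and the sketch assumes every emulated node
has the same fixed size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: available (non-reserved) memory left on a physical node. */
struct phys_node {
	unsigned long free_mb;
};

/*
 * Place up to nr_emu emulated nodes of size_mb each, handing each
 * successive emulated node to the next physical node in round-robin
 * order and skipping physical nodes that have no room left.
 * emu_to_phys[i] records which physical node backs emulated node i.
 * Returns the number of emulated nodes actually placed.
 */
int interleave_emulated_nodes(struct phys_node *phys, int nr_phys,
			      int nr_emu, unsigned long size_mb,
			      int *emu_to_phys)
{
	int placed = 0;
	int cur = 0;

	assert(nr_phys > 0 && size_mb > 0);

	while (placed < nr_emu) {
		int tried = 0;

		/* Advance to the next physical node that still has room. */
		while (tried < nr_phys && phys[cur].free_mb < size_mb) {
			cur = (cur + 1) % nr_phys;
			tried++;
		}
		if (tried == nr_phys)
			break;	/* all memory is exhausted */

		phys[cur].free_mb -= size_mb;
		emu_to_phys[placed++] = cur;
		cur = (cur + 1) % nr_phys;
	}
	return placed;
}
```

With two physical nodes holding 4096 MB and 2048 MB and a request for
eight 1024 MB emulated nodes, this sketch places six nodes: the first
four alternate between the physical nodes and the last two both land
on the larger one. Node sizes stay uniform, but one physical node
ends up with more emulated nodes, which is the tradeoff described
above.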

 [ Note that "node capacity" of a physical node is not only a
   function of its addressable range, but also is affected by
   subtracting out the amount of reserved memory over that range.
   NUMA emulation only deals with available, non-reserved memory
   quantities. ]
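
As a rough illustration of "node capacity", the sketch below subtracts
reserved memory from a node's addressable range. The names struct
mem_range, range_overlap(), and node_capacity() are hypothetical and
do not come from the patch; the sketch also assumes the reserved
ranges do not overlap one another.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical half-open address range [start, end). */
struct mem_range {
	unsigned long long start;
	unsigned long long end;
};

/* Number of bytes shared by ranges a and b (0 if they are disjoint). */
unsigned long long range_overlap(struct mem_range a, struct mem_range b)
{
	unsigned long long lo = a.start > b.start ? a.start : b.start;
	unsigned long long hi = a.end < b.end ? a.end : b.end;

	return hi > lo ? hi - lo : 0;
}

/*
 * Capacity of a node: the size of its addressable range minus the
 * parts of that range covered by reserved memory. Assumes the
 * reserved ranges are mutually disjoint.
 */
unsigned long long node_capacity(struct mem_range node,
				 const struct mem_range *reserved,
				 size_t nr_reserved)
{
	unsigned long long cap;
	size_t i;

	assert(node.end >= node.start);
	cap = node.end - node.start;

	for (i = 0; i < nr_reserved; i++)
		cap -= range_overlap(node, reserved[i]);
	return cap;
}
```

Only the portion of a reserved range that falls inside the node is
subtracted, so a reservation straddling the node boundary reduces the
capacity by just the overlapping part.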

We ensure there is at least a minimal amount of available memory
allocated to each node.  We also make sure that at least this
amount of available memory is available in ZONE_DMA32 for any node
that includes both ZONE_DMA32 and ZONE_NORMAL.

This patch also cleans the emulation code up by no longer passing
the statically allocated struct bootnode array among the various
functions. Because this init.data array may be very large, it is
not allocated on the stack, so it can simply be accessed at file
scope.

The WARN_ON() for nodes_cover_memory() when faking proximity
domains is removed since it relies on successive nodes always
having greater start addresses than previous nodes; with
interleaving this is no longer always true.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Ankita Garg <ankita@in.ibm.com>
Cc: Len Brown <len.brown@intel.com>
LKML-Reference: <alpine.DEB.1.00.0909251519150.14754@chino.kir.corp.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 22:56:46 +02:00
kmemcheck Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/vegard/kmemcheck 2009-09-22 08:07:54 -07:00
dump_pagetables.c x86: remove (null) in /sys kernel_page_tables 2009-04-14 11:50:22 +02:00
extable.c x86: uaccess: introduce try and catch framework 2009-01-23 17:17:36 -08:00
fault.c Merge branch 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 2009-09-24 07:53:22 -07:00
gup.c Merge branch 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-06-20 11:29:32 -07:00
highmem_32.c Merge branch 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm 2009-09-14 17:43:43 -07:00
hugetlbpage.c x86: ignore VM_LOCKED when determining if hugetlb-backed page tables can be shared or not 2009-05-29 08:40:03 -07:00
init_32.c x86: Export k8 physical topology 2009-10-12 22:56:45 +02:00
init_64.c x86: Export k8 physical topology 2009-10-12 22:56:45 +02:00
init.c x86: split NX setup into separate file to limit unstack-protected code 2009-09-21 13:56:58 -07:00
iomap_32.c x86, pat: Add PAT reserve free to io_mapping* APIs 2009-08-26 15:41:16 -07:00
ioremap.c Merge branch 'x86-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-09-15 09:19:38 -07:00
k8topology_64.c x86: Export k8 physical topology 2009-10-12 22:56:45 +02:00
kmmio.c Merge branch 'linus' into tracing/core 2009-05-07 11:17:34 +02:00
Makefile x86: split NX setup into separate file to limit unstack-protected code 2009-09-21 13:56:58 -07:00
memtest.c x86: memtest: use pointers of equal type for comparison 2009-06-11 16:26:35 +02:00
mmap.c x86: Increase MIN_GAP to include randomized stack 2009-09-10 17:00:12 -07:00
mmio-mod.c tracing: x86, mmiotrace: only register for die notifier when tracer active 2009-04-29 11:33:34 +02:00
numa_32.c x86: Export k8 physical topology 2009-10-12 22:56:45 +02:00
numa_64.c x86: Interleave emulated nodes over physical nodes 2009-10-12 22:56:46 +02:00
numa.c cpumask: convert node_to_cpumask_map[] to cpumask_var_t 2009-03-13 14:35:31 +01:00
pageattr-test.c x86: make sure the CPA test code's use of _PAGE_UNUSED1 is obvious 2008-09-05 17:09:57 +02:00
pageattr.c Merge branch 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel 2009-09-24 10:30:41 -07:00
pat.c x86: Reduce verbosity of "PAT enabled" kernel message 2009-09-24 11:35:19 +02:00
pf_in.c x86: fix mmiotrace 8-bit register decoding 2008-10-14 10:33:50 +02:00
pf_in.h x86 mmiotrace: move files into arch/x86/mm/. 2008-05-24 11:25:37 +02:00
pgtable_32.c x86/32: no need to use set_pte_present in set_pte_vaddr 2009-03-19 14:04:18 +01:00
pgtable.c x86, 32-bit: Fix double accounting in reserve_top_address() 2009-08-04 16:27:29 +02:00
physaddr.c x86: split __phys_addr out into separate file 2009-09-10 11:48:55 -07:00
physaddr.h x86: split __phys_addr out into separate file 2009-09-10 11:48:55 -07:00
setup_nx.c x86: split NX setup into separate file to limit unstack-protected code 2009-09-21 13:56:58 -07:00
srat_32.c x86: Decrease the level of some NUMA messages to KERN_DEBUG 2009-09-06 06:32:23 +02:00
srat_64.c x86: Interleave emulated nodes over physical nodes 2009-10-12 22:56:46 +02:00
testmmiotrace.c x86: add far read test to testmmiotrace 2009-03-02 10:20:35 +01:00
tlb.c cpumask: use mm_cpumask() wrapper: x86 2009-09-24 09:34:52 +09:30