linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-13 06:32:50 +00:00

Nishanth Aravamudan 2fabf084b6 powerpc: reorder per-cpu NUMA information's initialization	2014-08-13 15:14:05 +10:00

There is an issue currently where NUMA information is used on powerpc (and possibly ia64) before it has been read from the device-tree, which leads to large slab consumption with CONFIG_SLUB and memoryless nodes.

NUMA powerpc non-boot CPU's cpu_to_node/cpu_to_mem is only accurate after start_secondary(), similar to ia64, which is invoked via smp_init().

Commit `6ee0578b4d` ("workqueue: mark init_workqueues() as early_initcall()") made init_workqueues() be invoked via do_pre_smp_initcalls(), which is obviously before the secondary processors are online.

Additionally, the following commits changed init_workqueues() to use cpu_to_node to determine the node to use for kthread_create_on_node:

`bce903809a` ("workqueue: add wq_numa_tbl_len and wq_numa_possible_cpumask[]")
`f3f90ad469` ("workqueue: determine NUMA node of workers accourding to the allowed cpumask")

Therefore, when init_workqueues() runs, it sees all CPUs as being on Node 0. On LPARs or KVM guests where Node 0 is memoryless, this leads to a high number of slab deactivations (http://www.spinics.net/lists/linux-mm/msg67489.html).

Fix this by initializing the powerpc-specific CPU<->node/local memory node mapping as early as possible, which on powerpc is do_init_bootmem(). Currently that function initializes the mapping for the boot CPU, but we extend it to setup the mapping for all possible CPUs. Then, in smp_prepare_cpus(), we can correspondingly set the per-cpu values for all possible CPUs. That ensures that before the early_initcalls run (and really as early as possible), the per-cpu NUMA mapping is accurate.

While testing memoryless nodes on PowerKVM guests with a fix to the workqueue logic to use cpu_to_mem() instead of cpu_to_node(), with a guest topology of:

    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
    node 0 size: 0 MB
    node 0 free: 0 MB
    node 1 cpus: 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
    node 1 size: 16336 MB
    node 1 free: 15329 MB
    node distances:
    node   0   1
      0:  10  40
      1:  40  10

the slab consumption decreases from

    Slab:             932416 kB
    SUnreclaim:       902336 kB

to

    Slab:             395264 kB
    SUnreclaim:       359424 kB

And we see a corresponding increase in the slab efficiency from

    slab                                   mem     objs    slabs
                                          used   active   active
    ------------------------------------------------------------
    kmalloc-16384                       337 MB   11.28%  100.00%
    task_struct                         288 MB    9.93%  100.00%

to

    slab                                   mem     objs    slabs
                                          used   active   active
    ------------------------------------------------------------
    kmalloc-16384                        37 MB  100.00%  100.00%
    task_struct                          31 MB  100.00%  100.00%

Powerpc didn't support memoryless nodes until recently (`64bb80d87f` "powerpc/numa: Enable CONFIG_HAVE_MEMORYLESS_NODES" and `8c27226119` "powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"). Those commits also helped improve memory consumption with these kinds of environments.

Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
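
The ordering problem described in the commit message can be illustrated with a small userspace sketch. This is not the kernel code: the names cpu_node_map, firmware_node_of(), early_map_boot_cpu_only(), early_map_all_possible_cpus() and count_misplaced_cpus() are hypothetical, and the topology is simply the 100-CPU guest layout quoted above (CPUs 0-49 on node 0, CPUs 50-99 on node 1). The sketch only shows why an early consumer sees every secondary CPU on node 0 unless the mapping is filled in for all possible CPUs first.

```c
#include <string.h>

#define NR_CPUS  100   /* assumption: matches the guest topology above */
#define BOOT_CPU 0

/* Hypothetical stand-in for the CPU-to-node table. A zeroed table makes
 * every CPU look like it lives on node 0. */
static int cpu_node_map[NR_CPUS];

/* Hypothetical stand-in for the device-tree lookup: CPUs 0-49 belong to
 * node 0, CPUs 50-99 belong to node 1. */
static int firmware_node_of(int cpu)
{
    return cpu < 50 ? 0 : 1;
}

/* Put the table back into its "nothing read yet" state. */
static void reset_map(void)
{
    memset(cpu_node_map, 0, sizeof(cpu_node_map));
}

/* Old ordering: only the boot CPU is mapped before early initcalls. */
static void early_map_boot_cpu_only(void)
{
    cpu_node_map[BOOT_CPU] = firmware_node_of(BOOT_CPU);
}

/* New ordering: every possible CPU is mapped before early initcalls. */
static void early_map_all_possible_cpus(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        cpu_node_map[cpu] = firmware_node_of(cpu);
}

/* What an early consumer (such as workqueue setup) would observe:
 * the number of CPUs whose recorded node differs from the real one. */
static int count_misplaced_cpus(void)
{
    int wrong = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu_node_map[cpu] != firmware_node_of(cpu))
            wrong++;
    return wrong;
}
```

With this layout, calling reset_map() and early_map_boot_cpu_only() leaves count_misplaced_cpus() at 50, because CPUs 50-99 still appear to be on node 0. Calling early_map_all_possible_cpus() instead brings it to 0.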
..
boot	powerpc/boot: Use correct zlib types for comparison	2014-08-13 15:13:45 +10:00
configs	Here are the PPC and ARM changes for KVM, which I separated because	2014-08-07 11:35:30 -07:00
crypto	powerpc: Fix compile of sha1-powerpc-asm.S on 32-bit	2013-03-05 16:56:26 +11:00
include	powerpc: remove duplicate definition of TEXASR_FS	2014-08-13 15:13:47 +10:00
kernel	powerpc: reorder per-cpu NUMA information's initialization	2014-08-13 15:14:05 +10:00
kvm	Here are the PPC and ARM changes for KVM, which I separated because	2014-08-07 11:35:30 -07:00
lib	powerpc: Add smp_mb()s to arch_spin_unlock_wait()	2014-08-13 15:13:27 +10:00
math-emu	powerpc: Correct emulated mtfsf instruction	2014-04-07 10:33:11 +10:00
mm	powerpc: reorder per-cpu NUMA information's initialization	2014-08-13 15:14:05 +10:00
net	net: filter: split 'struct sk_filter' into socket and bpf parts	2014-08-02 15:03:58 -07:00
oprofile	powerpc: Remove oprofile RS64 support	2014-07-28 14:10:25 +10:00
perf	powerpc/perf/hv-24x7: Use kmem_cache_free	2014-08-13 15:14:04 +10:00
platforms	powerpc/pseries/hvcserver: Fix endian issue in hvcs_get_partner_info	2014-08-13 15:14:04 +10:00
sysdev	Merge remote-tracking branch 'scott/next' into next	2014-08-05 14:13:41 +10:00
xmon	powerpc: Hard disable interrupts in xmon	2014-08-13 15:13:48 +10:00
Kconfig	kexec: load and relocate purgatory at kernel load time	2014-08-08 15:57:32 -07:00
Kconfig.debug	Patch queue for ppc - 2014-08-01	2014-08-05 09:58:11 +02:00
Makefile	Merge branch 'merge' into next	2014-05-28 13:30:12 +10:00
relocs_check.pl	Fix warning typo "CONFIG_RELCOATABLE"	2013-05-29 15:11:30 +02:00