linux/mm
Christoph Lameter 8fce4d8e3b [PATCH] slab: Node rotor for freeing alien caches and remote per cpu pages.
The cache reaper currently tries to free all alien caches and all remote
per cpu pages in each pass of cache_reap.  For a machines with large number
of nodes (such as Altix) this may lead to sporadic delays of around ~10ms.
Interrupts are disabled while reclaiming creating unacceptable delays.

This patch changes that behavior by adding a per cpu reap_node variable.
Instead of attempting to free all caches, we free only one alien cache and
the per cpu pages from one remote node.  That reduces the time spend in
cache_reap.  However, doing so will lengthen the time it takes to
completely drain all remote per cpu pagesets and all alien caches.  The
time needed will grow with the number of nodes in the system.  All caches
are drained when they overflow their respective capacity.  So the drawback
here is only that a bit of memory may be wasted for awhile longer.

Details:

1. Rename drain_remote_pages to drain_node_pages to allow the specification
   of the node to drain of pcp pages.

2. Add additional functions init_reap_node, next_reap_node for NUMA
   that manage a per cpu reap_node counter.

3. Add a reap_alien function that reaps only from the current reap_node.

For us this seems to be a critical issue.  Holdoffs of an average of ~7ms
cause some HPC benchmarks to slow down significantly.  F.e.  NAS parallel
slows down dramatically.  NAS parallel has a 12-16 seconds runtime w/o rotor
compared to 5.8 secs with the rotor patches.  It gets down to 5.05 secs with
the additional interrupt holdoff reductions.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-09 19:47:38 -08:00
..
bootmem.c [PATCH] FRV: Clean up bootmem allocator's page freeing algorithm 2006-01-06 08:33:26 -08:00
fadvise.c [PATCH] fadvise: return ESPIPE on FIFO/pipe 2006-01-08 20:14:00 -08:00
filemap_xip.c [PATCH] replace inode_update_time with file_update_time 2006-01-10 08:01:30 -08:00
filemap.c [PATCH] mm: migration page refcounting fix 2006-01-18 19:20:17 -08:00
filemap.h [PATCH] xip: reduce code duplication 2005-06-24 00:06:41 -07:00
fremap.c VM: add common helper function to create the page tables 2005-11-29 14:03:14 -08:00
highmem.c [PATCH] gfp_t: the rest 2005-10-28 08:16:51 -07:00
hugetlb.c [PATCH] compound page: use page[1].lru 2006-02-14 16:09:33 -08:00
internal.h [PATCH] FRV: Clean up bootmem allocator's page freeing algorithm 2006-01-06 08:33:26 -08:00
Kconfig [PATCH] Swap Migration V5: Add CONFIG_MIGRATION for page migration support 2006-01-08 20:12:41 -08:00
madvise.c [PATCH] madvise MADV_DONTFORK/MADV_DOFORK 2006-02-14 16:09:34 -08:00
Makefile [PATCH] slob: introduce the SLOB allocator 2006-01-08 20:13:41 -08:00
memory_hotplug.c [PATCH] memory hotadd: pgdat->node_present_pages fix 2006-03-09 19:47:38 -08:00
memory.c [PATCH] x86_64: Add boot option to disable randomized mappings and cleanup 2006-02-17 08:00:40 -08:00
mempolicy.c [PATCH] numa_maps-update fix 2006-03-08 14:14:00 -08:00
mempool.c [PATCH] gfp_t: mm/* (easy parts) 2005-10-28 08:16:47 -07:00
mincore.c
mlock.c [PATCH] move capable() to capability.h 2006-01-11 18:42:13 -08:00
mmap.c [PATCH] move capable() to capability.h 2006-01-11 18:42:13 -08:00
mprotect.c [PATCH] unpaged: private write VM_RESERVED 2005-11-22 09:13:42 -08:00
mremap.c [PATCH] move capable() to capability.h 2006-01-11 18:42:13 -08:00
msync.c [PATCH] mutex subsystem, semaphore to mutex: VFS, ->i_sem 2006-01-09 15:59:24 -08:00
nommu.c [PATCH] nommu: implement vmalloc_node() 2006-02-28 20:53:44 -08:00
oom_kill.c [PATCH] out_of_memory() locking fix 2006-03-02 08:33:07 -08:00
page_alloc.c [PATCH] slab: Node rotor for freeing alien caches and remote per cpu pages. 2006-03-09 19:47:38 -08:00
page_io.c [PATCH] mm: split page table lock 2005-10-29 21:40:42 -07:00
page-writeback.c [PATCH] mm: dirty_exceeded speedup 2006-01-18 19:20:17 -08:00
pdflush.c [PATCH] Swap Migration V5: PF_SWAPWRITE to allow writing to swap 2006-01-08 20:12:41 -08:00
prio_tree.c
readahead.c [PATCH] add AOP_TRUNCATED_PAGE, prepend AOP_ to WRITEPAGE_ACTIVATE 2006-01-03 11:45:42 -08:00
rmap.c [PATCH] page_add_file_rmap(): remove BUG_ON()s 2006-03-09 19:47:36 -08:00
shmem.c [PATCH] tmpfs: fix mount mpol nodelist parsing 2006-02-21 17:10:15 -08:00
slab.c [PATCH] slab: Node rotor for freeing alien caches and remote per cpu pages. 2006-03-09 19:47:38 -08:00
slob.c [PATCH] SLOB=y && SMP=y fix 2006-02-08 07:52:58 -08:00
sparse.c [PATCH] Change maxaligned_in_smp alignemnt macros to internodealigned_in_smp macros 2006-01-08 20:13:38 -08:00
swap_state.c [PATCH] Direct Migration V9: Avoid writeback / page_migrate() method 2006-02-01 08:53:17 -08:00
swap.c [PATCH] percpu_counter_sum() 2006-03-08 14:14:01 -08:00
swapfile.c [PATCH] Direct Migration V9: remove_from_swap() to remove swap ptes 2006-02-01 08:53:16 -08:00
thrash.c [PATCH] temporarily disable swap token on memory pressure 2005-11-28 14:42:25 -08:00
tiny-shmem.c [PATCH] do_truncate() call fix in tiny-shmem.c 2006-01-12 09:08:49 -08:00
truncate.c [PATCH] mutex subsystem, semaphore to mutex: VFS, ->i_sem 2006-01-09 15:59:24 -08:00
util.c [PATCH] slob: introduce mm/util.c for shared functions 2006-01-08 20:13:41 -08:00
vmalloc.c [PATCH] kernel-doc: fix warnings in vmalloc.c 2005-11-07 07:53:56 -08:00
vmscan.c [PATCH] vmscan: no zone_reclaim if PF_MALLOC is set 2006-03-09 19:47:37 -08:00