linux/mm
Vlastimil Babka 3a1086fba9 mm: always steal split buddies in fallback allocations
When allocation falls back to another migratetype, it will steal a page
with highest available order, and (depending on this order and desired
migratetype), it might also steal the rest of free pages from the same
pageblock.

Given the preference of highest available order, it is likely that it will
be higher than the desired order, and result in the stolen buddy page
being split.  The remaining pages after split are currently stolen only
when the rest of the free pages are stolen.  This can however lead to
situations where for MOVABLE allocations we split e.g.  order-4 fallback
UNMOVABLE page, but steal only order-0 page.  Then on the next MOVABLE
allocation (which may be batched to fill the pcplists) we split another
order-3 or higher page, etc.  By stealing all pages that we have split, we
can avoid further stealing.

This patch therefore adjusts the page stealing so that buddy pages created
by split are always stolen.  This has effect only on MOVABLE allocations,
as RECLAIMABLE and UNMOVABLE allocations already always do that in
addition to stealing the rest of free pages from the pageblock.  The
change also allows to simplify try_to_steal_freepages() and factor out CMA
handling.

According to Mel, it has been intended since the beginning that buddy
pages after split would be stolen always, but it doesn't seem like it was
ever the case until commit 47118af076 ("mm: mmzone: MIGRATE_CMA
migration type added").  The commit has unintentionally introduced this
behavior, but was reverted by commit 0cbef29a78 ("mm:
__rmqueue_fallback() should respect pageblock type").  Neither included
evaluation.

My evaluation with stress-highalloc from mmtests shows about 2.5x
reduction of page stealing events for MOVABLE allocations, without
affecting the page stealing events for other allocation migratetypes.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11 17:06:06 -08:00
..
backing-dev.c Merge branch 'for-3.18/core' of git://git.kernel.dk/linux-block 2014-10-18 11:53:51 -07:00
balloon_compaction.c mm/balloon_compaction: fix deflation when compaction is disabled 2014-10-29 16:33:15 -07:00
bootmem.c mem-hotplug: reset node managed pages when hot-adding a new pgdat 2014-11-13 16:17:06 -08:00
cleancache.c mm: fix cleancache debugfs directory path 2015-01-20 14:08:31 +01:00
cma.c mm: cma: fix totalcma_pages to include DT defined CMA regions 2015-02-11 17:06:03 -08:00
compaction.c mm/compaction: add tracepoint to observe behaviour of compaction defer 2015-02-11 17:06:04 -08:00
debug-pagealloc.c mm/debug-pagealloc: make debug-pagealloc boottime configurable 2014-12-13 12:42:48 -08:00
debug.c mm: account pmd page tables to the process 2015-02-11 17:06:04 -08:00
dmapool.c mm/dmapool.c: fixed a brace coding style issue 2014-10-09 22:26:00 -04:00
early_ioremap.c
fadvise.c mm: fadvise: document the fadvise(FADV_DONTNEED) behaviour for partial pages 2014-12-13 12:42:49 -08:00
failslab.c
filemap_xip.c mm: drop vm_ops->remap_pages and generic_file_remap_pages() stub 2015-02-10 14:30:30 -08:00
filemap.c mm: drop vm_ops->remap_pages and generic_file_remap_pages() stub 2015-02-10 14:30:30 -08:00
frontswap.c mm/frontswap.c: fix the condition in BUG_ON 2014-12-10 17:41:08 -08:00
gup.c mm: gup: use get_user_pages_unlocked within get_user_pages_fast 2015-02-11 17:06:05 -08:00
highmem.c mm/highmem: make kmap cache coloring aware 2014-08-06 18:01:22 -07:00
huge_memory.c mincore: apply page table walker on do_mincore() 2015-02-11 17:06:06 -08:00
hugetlb_cgroup.c mm: page_counter: pull "-1" handling out of page_counter_memparse() 2015-02-11 17:06:02 -08:00
hugetlb.c mm: account pmd page tables to the process 2015-02-11 17:06:04 -08:00
hwpoison-inject.c mm/hwpoison-inject.c: remove unnecessary null test before debugfs_remove_recursive 2014-08-06 18:01:19 -07:00
init-mm.c
internal.h mm: reduce try_to_compact_pages parameters 2015-02-11 17:06:02 -08:00
interval_tree.c mm: replace vma->sharead.linear with vma->shared 2015-02-10 14:30:31 -08:00
iov_iter.c copy_from_iter_nocache() 2014-12-08 20:25:23 -05:00
Kconfig rcu: Make SRCU optional by using CONFIG_SRCU 2015-01-06 11:04:29 -08:00
Kconfig.debug mm/debug_pagealloc: remove obsolete Kconfig options 2015-01-08 15:10:52 -08:00
kmemcheck.c mm/slab_common: move kmem_cache definition to internal header 2014-10-09 22:25:50 -04:00
kmemleak-test.c
kmemleak.c
ksm.c mm: remove rest usage of VM_NONLINEAR and pte_file() 2015-02-10 14:30:31 -08:00
list_lru.c
maccess.c
madvise.c mm: remove rest usage of VM_NONLINEAR and pte_file() 2015-02-10 14:30:31 -08:00
Makefile mm: replace remap_file_pages() syscall with emulation 2015-02-10 14:30:30 -08:00
memblock.c mm/memblock.c: refactor functions to set/clear MEMBLOCK_HOTPLUG 2014-12-13 12:42:46 -08:00
memcontrol.c memcg: cleanup preparation for page table walk 2015-02-11 17:06:06 -08:00
memory_hotplug.c mm, memory_hotplug/failure: drain single zone pcplists 2014-12-10 17:41:05 -08:00
memory-failure.c mm: vmscan: invoke slab shrinkers from shrink_zone() 2014-12-13 12:42:48 -08:00
memory.c mm: account pmd page tables to the process 2015-02-11 17:06:04 -08:00
mempolicy.c mm: pagewalk: fix misbehavior of walk_page_range for vma(VM_PFNMAP) 2015-02-11 17:06:06 -08:00
mempool.c
migrate.c mm/hugetlb: take page table lock in follow_huge_pmd() 2015-02-11 17:06:01 -08:00
mincore.c mincore: apply page table walker on do_mincore() 2015-02-11 17:06:06 -08:00
mlock.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-10-13 15:44:12 +02:00
mm_init.c
mmap.c mm: fix false-positive warning on exit due mm_nr_pmds(mm) 2015-02-11 17:06:04 -08:00
mmu_context.c
mmu_notifier.c mmu_notifier: add the callback for mmu_notifier_invalidate_range() 2014-11-13 13:46:09 +11:00
mmzone.c mm: microoptimize zonelist operations 2015-02-11 17:06:02 -08:00
mprotect.c mm: remove rest usage of VM_NONLINEAR and pte_file() 2015-02-10 14:30:31 -08:00
mremap.c mm: remove rest usage of VM_NONLINEAR and pte_file() 2015-02-10 14:30:31 -08:00
msync.c mm: remove rest usage of VM_NONLINEAR and pte_file() 2015-02-10 14:30:31 -08:00
nobootmem.c mem-hotplug: reset node managed pages when hot-adding a new pgdat 2014-11-13 16:17:06 -08:00
nommu.c mm: gup: add __get_user_pages_unlocked to customize gup_flags 2015-02-11 17:06:05 -08:00
oom_kill.c mm: account pmd page tables to the process 2015-02-11 17:06:04 -08:00
page_alloc.c mm: always steal split buddies in fallback allocations 2015-02-11 17:06:06 -08:00
page_counter.c mm: page_counter: pull "-1" handling out of page_counter_memparse() 2015-02-11 17:06:02 -08:00
page_ext.c mm/page_owner: keep track of page owners 2014-12-13 12:42:48 -08:00
page_io.c
page_isolation.c mm, page_isolation: drain single zone pcplists 2014-12-10 17:41:05 -08:00
page_owner.c mm/page_owner: correct owner information for early allocated pages 2014-12-13 12:42:48 -08:00
page-writeback.c page_writeback: put account_page_redirty() after set_page_dirty() 2015-02-11 17:06:04 -08:00
pagewalk.c mm: pagewalk: fix misbehavior of walk_page_range for vma(VM_PFNMAP) 2015-02-11 17:06:06 -08:00
percpu-km.c percpu: implmeent pcpu_nr_empty_pop_pages and chunk->nr_populated 2014-09-02 14:46:05 -04:00
percpu-vm.c percpu: move region iterations out of pcpu_[de]populate_chunk() 2014-09-02 14:46:02 -04:00
percpu.c percpu: off by one in BUG_ON() 2014-10-29 10:34:34 -04:00
pgtable-generic.c mm: actually clear pmd_numa before invalidating 2014-08-29 16:28:15 -07:00
process_vm_access.c mm: gup: use get_user_pages_unlocked 2015-02-11 17:06:05 -08:00
quicklist.c
readahead.c mm/readahead.c: remove unused file_ra_state from count_history_pages 2014-08-06 18:01:15 -07:00
rmap.c mm: memcontrol: track move_lock state internally 2015-02-11 17:06:00 -08:00
shmem.c swap: remove unused mem_cgroup_uncharge_swapcache declaration 2015-02-11 17:06:00 -08:00
slab_common.c memcg: zap memcg_slab_caches and memcg_slab_mutex 2015-02-10 14:30:34 -08:00
slab.c slab: fix cpuset check in fallback_alloc 2014-12-13 12:42:53 -08:00
slab.h memcg: zap __memcg_{charge,uncharge}_slab 2015-02-10 14:30:34 -08:00
slob.c mm/sl[ao]b: always track caller in kmalloc_(node_)track_caller() 2014-10-09 22:25:50 -04:00
slub.c mm/slub.c: fix typo in comment 2015-02-10 14:30:30 -08:00
sparse-vmemmap.c
sparse.c
swap_cgroup.c mm: page_cgroup: rename file to mm/swap_cgroup.c 2014-12-10 17:41:09 -08:00
swap_state.c mm: page_cgroup: rename file to mm/swap_cgroup.c 2014-12-10 17:41:09 -08:00
swap.c rmap: drop support of non-linear mappings 2015-02-10 14:30:31 -08:00
swapfile.c mm: page_cgroup: rename file to mm/swap_cgroup.c 2014-12-10 17:41:09 -08:00
truncate.c mm: Fix comment before truncate_setsize() 2014-11-07 08:29:25 +11:00
util.c mm: gup: use get_user_pages_unlocked within get_user_pages_fast 2015-02-11 17:06:05 -08:00
vmacache.c mm,vmacache: count number of system-wide flushes 2014-12-13 12:42:48 -08:00
vmalloc.c mm/vmalloc.c: fix memory ordering bug 2014-12-13 12:42:49 -08:00
vmpressure.c mm/vmpressure.c: fix race in vmpressure_work_fn() 2014-12-02 17:32:07 -08:00
vmscan.c mm: memcontrol: default hierarchy interface for memory 2015-02-11 17:06:02 -08:00
vmstat.c mm/vmstat.c: fix/cleanup ifdefs 2015-02-10 14:30:30 -08:00
workingset.c
zbud.c mm/zbud: init user ops only when it is needed 2014-12-13 12:42:51 -08:00
zpool.c mm/zpool: use prefixed module loading 2014-08-29 16:28:16 -07:00
zsmalloc.c mm/zsmalloc: adjust order of functions 2014-12-18 19:08:11 -08:00
zswap.c mm/zswap: delete unnecessary check before calling free_percpu() 2014-12-13 12:42:50 -08:00