mainlining shenanigans
Go to file
David Rientjes bc3106b26c mm, compaction: drain pcps for zone when kcompactd fails
It's possible for free pages to become stranded on per-cpu pagesets
(pcps) that, if drained, could be merged with buddy pages on the zone's
free area to form large order pages, including up to MAX_ORDER.

Consider a verbose example using the tools/vm/page-types tool at the
beginning of a ZONE_NORMAL ('B' indicates a buddy page and 'S' indicates
a slab page).  Pages on pcps do not have any page flags set.

  109954  1       _______S________________________________________________________
  109955  2       __________B_____________________________________________________
  109957  1       ________________________________________________________________
  109958  1       __________B_____________________________________________________
  109959  7       ________________________________________________________________
  109960  1       __________B_____________________________________________________
  109961  9       ________________________________________________________________
  10996a  1       __________B_____________________________________________________
  10996b  3       ________________________________________________________________
  10996e  1       __________B_____________________________________________________
  10996f  1       ________________________________________________________________
  ...
  109f8c  1       __________B_____________________________________________________
  109f8d  2       ________________________________________________________________
  109f8f  2       __________B_____________________________________________________
  109f91  f       ________________________________________________________________
  109fa0  1       __________B_____________________________________________________
  109fa1  7       ________________________________________________________________
  109fa8  1       __________B_____________________________________________________
  109fa9  1       ________________________________________________________________
  109faa  1       __________B_____________________________________________________
  109fab  1       _______S________________________________________________________

The compaction migration scanner is attempting to defragment this memory
since it is at the beginning of the zone.  It has done so quite well,
all movable pages have been migrated.  From pfn [0x109955, 0x109fab),
there are only buddy pages and pages without flags set.

These pages may be stranded on pcps that could otherwise allow this
memory to be coalesced if freed back to the zone free area.  It is
possible that some of these pages may not be on pcps and that something
has called alloc_pages() and used the memory directly, but we rely on
the absence of __GFP_MOVABLE in these cases to allocate from
MIGATE_UNMOVABLE pageblocks to try to keep these MIGRATE_MOVABLE
pageblocks as free as possible.

These buddy and pcp pages, spanning 1,621 pages, could be coalesced and
allow for three transparent hugepages to be dynamically allocated.
Running the numbers for all such spans on the system, it was found that
there were over 400 such spans of only buddy pages and pages without
flags set at the time this /proc/kpageflags sample was collected.
Without this support, there were _no_ order-9 or order-10 pages free.

When kcompactd fails to defragment memory such that a cc.order page can
be allocated, drain all pcps for the zone back to the buddy allocator so
this stranding cannot occur.  Compaction for that order will
subsequently be deferred, which acts as a ratelimit on this drain.

Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1803010340100.88270@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-05 21:36:26 -07:00
arch x86/mm/memory_hotplug: determine block size based on the end of boot memory 2018-04-05 21:36:25 -07:00
block Char/Misc patches for 4.17-rc1 2018-04-04 20:07:20 -07:00
certs certs/blacklist_nohashes.c: fix const confusion in certs blacklist 2018-02-21 15:35:43 -08:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2018-04-04 17:11:08 -07:00
Documentation mm, page_alloc: extend kernelcore and movablecore for percent 2018-04-05 21:36:25 -07:00
drivers mm/memory_hotplug: optimize memory hotplug 2018-04-05 21:36:25 -07:00
firmware kbuild: remove all dummy assignments to obj- 2017-11-18 11:46:06 +09:00
fs fs: don't flush pagecache when expanding block device 2018-04-05 21:36:23 -07:00
include mm: make should_failslab always available for fault injection 2018-04-05 21:36:26 -07:00
init Merge branch 'syscalls-next' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux 2018-04-02 21:22:12 -07:00
ipc Merge branch 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2018-04-03 19:15:32 -07:00
kernel Driver core patches for 4.17-rc1 2018-04-04 19:41:45 -07:00
lib lib: fix stall in __bitmap_parselist() 2018-04-05 21:36:21 -07:00
LICENSES LICENSES: Add MPL-1.1 license 2018-01-06 10:59:44 -07:00
mm mm, compaction: drain pcps for zone when kcompactd fails 2018-04-05 21:36:26 -07:00
net net/9p/client.c: fix potential refcnt problem of trans module 2018-04-05 21:36:23 -07:00
samples Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-04-03 14:04:18 -07:00
scripts scripts/faddr2line: show the code context 2018-04-05 21:36:21 -07:00
security Merge branch 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2018-04-03 19:15:32 -07:00
sound Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-04-03 14:04:18 -07:00
tools Char/Misc patches for 4.17-rc1 2018-04-04 20:07:20 -07:00
usr kbuild: rename built-in.o to built-in.a 2018-03-26 02:01:19 +09:00
virt kvm/arm fixes for 4.16, take 2 2018-03-15 21:45:37 +01:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore kbuild: move include/config/ksym/* to include/ksym/* 2018-03-26 02:01:23 +09:00
.mailmap Merge remote-tracking branch 'spi/topic/samsung' into spi-next 2018-04-02 15:56:32 +01:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS MAINTAINERS/CREDITS: Drop METAG ARCHITECTURE 2018-03-05 16:34:24 +00:00
Kbuild Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
Kconfig License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
MAINTAINERS Char/Misc patches for 4.17-rc1 2018-04-04 20:07:20 -07:00
Makefile Kconfig updates for v4.17 2018-04-03 16:28:01 -07:00
README Docs: Added a pointer to the formatted docs to README 2018-03-21 09:02:53 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.