linux/include
Vladimir Davydov 4949148ad4 mm: charge/uncharge kmemcg from generic page allocator paths
Currently, to charge a non-slab allocation to kmemcg one has to use the
alloc_kmem_pages helper with the __GFP_ACCOUNT flag.  A page allocated with
this helper must eventually be freed using free_kmem_pages, otherwise it
won't be uncharged.

This API suits its current users fine, but it turns out to be impossible
to use along with page reference counting, i.e. when an allocation is
supposed to be freed with put_page, as is the case with pipe or unix
socket buffers.

To overcome this limitation, this patch moves charging/uncharging to
generic page allocator paths, i.e. to __alloc_pages_nodemask and
free_pages_prepare, and zaps alloc/free_kmem_pages helpers.  This way,
one can use any of the available page allocation functions to get the
allocated page charged to kmemcg - it's enough to pass __GFP_ACCOUNT,
just like in case of kmalloc and friends.  A charged page will be
automatically uncharged on free.
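
The effect described above can be illustrated with a small userspace toy model. This is only a sketch, not kernel code: every toy_* name, the flag values, and the usage counter are hypothetical stand-ins for the real allocator paths, and the boolean marker replaces whatever the kernel actually uses to remember that a page was charged. It shows the charge happening in the generic allocation path when the accounting flag is passed, and the uncharge happening in the generic free path, so that dropping the last reference is enough.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for gfp flags; not the kernel's values. */
#define TOY_GFP_KERNEL  0x1u
#define TOY_GFP_ACCOUNT 0x2u

struct toy_page {
	bool charged;	/* toy marker: was this page charged on alloc? */
	int refcount;	/* toy reference count */
};

/* Toy counter of pages currently charged to the (single) toy cgroup. */
static long toy_kmemcg_usage;

/* Generic alloc path: charge only when the accounting flag is passed. */
static struct toy_page *toy_alloc_page(unsigned int gfp)
{
	struct toy_page *page = calloc(1, sizeof(*page));

	if (!page)
		return NULL;
	page->refcount = 1;
	if (gfp & TOY_GFP_ACCOUNT) {
		toy_kmemcg_usage++;
		page->charged = true;
	}
	return page;
}

/* Generic free path: uncharge automatically if the page was charged. */
static void toy_free_page(struct toy_page *page)
{
	if (page->charged)
		toy_kmemcg_usage--;
	free(page);
}

/* Reference-counted release: frees (and uncharges) on the last put. */
static void toy_put_page(struct toy_page *page)
{
	assert(page->refcount > 0);
	if (--page->refcount == 0)
		toy_free_page(page);
}
```

In this model a caller never needs a dedicated "free charged page" helper: whichever path drops the last reference ends up in toy_free_page, which does the uncharge.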

To make it possible, we need to mark pages charged to kmemcg somehow.
To avoid introducing a new page flag, we make use of page->_mapcount for
marking such pages.  Since pages charged to kmemcg are not supposed to
be mapped to userspace, it should work just fine.  There are other
(ab)users of page->_mapcount - buddy and balloon pages - but we don't
conflict with them.
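
The marking scheme can be sketched as a set of distinct sentinel values stored in a mapcount-like field. The snippet below is a userspace illustration under stated assumptions: the struct, the toy_* helpers, and the specific sentinel values are hypothetical and chosen only so that the three users do not collide; the constants actually used by the patch are not quoted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative sentinel values (assumptions, not taken from the patch). */
#define TOY_MAPCOUNT_UNMAPPED	(-1)	/* page not mapped, not marked */
#define TOY_MAPCOUNT_BUDDY	(-128)	/* toy marker for buddy pages */
#define TOY_MAPCOUNT_BALLOON	(-256)	/* toy marker for balloon pages */
#define TOY_MAPCOUNT_KMEMCG	(-512)	/* toy marker for kmemcg pages */

struct toy_mpage {
	int mapcount;	/* stands in for page->_mapcount */
};

static bool toy_page_buddy(const struct toy_mpage *p)
{
	return p->mapcount == TOY_MAPCOUNT_BUDDY;
}

static bool toy_page_balloon(const struct toy_mpage *p)
{
	return p->mapcount == TOY_MAPCOUNT_BALLOON;
}

static bool toy_page_kmemcg(const struct toy_mpage *p)
{
	return p->mapcount == TOY_MAPCOUNT_KMEMCG;
}

/* Mark a page as charged; only valid for an unmapped, unmarked page. */
static void toy_set_page_kmemcg(struct toy_mpage *p)
{
	assert(p->mapcount == TOY_MAPCOUNT_UNMAPPED);
	p->mapcount = TOY_MAPCOUNT_KMEMCG;
}

/* Clear the mark on free, restoring the unmapped state. */
static void toy_clear_page_kmemcg(struct toy_mpage *p)
{
	assert(toy_page_kmemcg(p));
	p->mapcount = TOY_MAPCOUNT_UNMAPPED;
}
```

Because each user owns a different sentinel, a single integer comparison on free is enough to tell whether a page needs uncharging, and the scheme only works as long as marked pages are never mapped to userspace.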

In case kmemcg is compiled out or not used at runtime, this patch
introduces no overhead to generic page allocator paths.  If kmemcg is
used, the cost is one extra gfp flags check on alloc and one extra
page->_mapcount check on free, which shouldn't hurt performance, because
the data accessed are hot.

Link: http://lkml.kernel.org/r/a9736d856f895bcb465d9f257b54efe32eda6f99.1464079538.git.vdavydov@virtuozzo.com
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
acpi Revert "ACPI 2.0 / AML: Improve module level execution by moving the If/Else/While execution to per-table basis" 2016-07-11 16:21:08 +02:00
asm-generic mm/mmu_gather: track page size with mmu gather and force flush if page size change 2016-07-26 16:19:19 -07:00
clocksource clocksource/drivers/sp804: Convert init function to return error 2016-06-28 10:19:30 +02:00
crypto
drm Merge branch 'drm-vmwgfx-fixes' of git://people.freedesktop.org/~syeh/repos_linux into drm-fixes 2016-07-15 13:51:55 +10:00
dt-bindings Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux 2016-05-26 09:23:43 -07:00
keys
kvm arm64: KVM: fix build with CONFIG_ARM_PMU disabled 2016-06-27 12:55:51 +02:00
linux mm: charge/uncharge kmemcg from generic page allocator paths 2016-07-26 16:19:19 -07:00
math-emu
media Update my main e-mails at the Kernel tree 2016-06-15 15:35:37 -10:00
memory
misc
net net: switchdev: change ageing_time type to clock_t 2016-07-19 16:49:20 -07:00
pcmcia
ras
rdma IB/rdmavt: Correct qp_priv_alloc() return value test 2016-06-23 10:16:15 -04:00
rxrpc
scsi
soc Revert "usb: ohci-at91: Forcibly suspend ports while USB suspend" 2016-06-20 07:42:07 -07:00
sound
target Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2016-05-28 12:04:17 -07:00
trace fs/fs-writeback.c: inode writeback list tracking tracepoints 2016-07-26 16:19:19 -07:00
uapi zsmalloc: page migration support 2016-07-26 16:19:19 -07:00
video imx-drm probing fix 2016-05-25 12:36:20 +10:00
xen
Kbuild