linux/include
KAMEZAWA Hiroyuki 8c7c6e34a1 memcg: mem+swap controller core
This patch implements a per-cgroup limit on memory+swap usage.  Although
SwapCache exists, double counting of a swap-cache page and its swap entry
is avoided.

The mem+swap controller works as follows.
  - memory usage is limited by memory.limit_in_bytes.
  - memory + swap usage is limited by memory.memsw_limit_in_bytes.
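As an illustration only, a limit could be set from user space by writing a
byte count to the corresponding control file. The helper name write_limit()
is hypothetical, and the directory holding the control files depends on where
the cgroup filesystem is mounted, so the path is left to the caller.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of this patch): write a byte limit to a
 * cgroup control file such as memory.limit_in_bytes.
 * The caller supplies the full path; the mount point is an assumption.
 * Returns 0 on success, -1 on failure.
 */
static int write_limit(const char *path, unsigned long long bytes)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;
    /* The control file takes the limit as a decimal byte count. */
    ok = fprintf(f, "%llu\n", bytes) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

A caller would pass the path of the memory limit file or of the mem+swap
limit file inside the group's directory, together with the limit in bytes.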

This has the following benefits.
  - A user can limit total resource usage of mem+swap.

    Without this, because the memory resource controller does not account
    for swap usage, a process can exhaust all the swap (e.g. through a
    memory leak).  With this limit, we can avoid that case.

    Swap is also a shared resource, but it cannot be reclaimed (brought back
    into memory) until it is used. This characteristic can cause trouble when
    memory is divided into parts by cpuset or memcg.
    Assume two groups, A and B.
    After some applications run, the system can end up like this:

    Group A -- very large free memory space but occupies 99% of swap.
    Group B -- under memory shortage but cannot use swap because it is
               nearly full.

    The ability to set an appropriate swap limit for each group is required.

Someone may wonder, "why not swap but mem+swap?"

  - The global LRU (kswapd) can swap out arbitrary pages. Swap-out moves
    the account from memory to swap, so there is no change in the usage of
    mem+swap.

    In other words, when we want to limit the usage of swap without affecting
    the global LRU, a mem+swap limit is better than just limiting swap.
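The point above can be shown with a small user-space sketch. The struct and
function names here are hypothetical and only model the counters; they are not
kernel code. A global swap-out moves pages from the memory count to the swap
count, so the sum stays the same while the swap count alone grows.

```c
#include <assert.h>

/* Hypothetical per-group counters, in pages (a model, not kernel code). */
struct usage_model {
    unsigned long mem;   /* pages resident in memory */
    unsigned long swap;  /* pages held in swap */
};

/* The quantity a mem+swap limit is checked against. */
static unsigned long usage_memsw(const struct usage_model *u)
{
    return u->mem + u->swap;
}

/* Global reclaim swaps out 'pages' pages belonging to this group. */
static void global_swap_out(struct usage_model *u, unsigned long pages)
{
    assert(u->mem >= pages);
    u->mem -= pages;   /* account leaves memory ... */
    u->swap += pages;  /* ... and moves to swap; mem + swap is unchanged */
}
```

With a swap-only limit, the global LRU could push a group over its limit
through no action of the group itself. With a mem+swap limit, the checked
quantity does not change on swap-out.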

Accounting target information is stored in swap_cgroup, which is a
per-swap-entry record.

Charging is done as follows.
  map
    - charge page and memsw.

  unmap
    - uncharge page/memsw if not SwapCache.

  swap-out (__delete_from_swap_cache)
    - uncharge page
    - record mem_cgroup information to swap_cgroup.

  swap-in (do_swap_page)
    - charge page and memsw.
      The record in swap_cgroup is cleared.
      memsw accounting is decremented.

  swap-free (swap_free())
    - if swap entry is freed, memsw is uncharged by PAGE_SIZE.
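The event list above can be modelled in user space roughly as follows. This is
a sketch under stated assumptions, not the code in this patch: the struct, the
function names and the 4096-byte page size are all hypothetical, and a single
pointer stands in for one swap_cgroup record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL  /* assumed page size for the model */

/* Hypothetical model of one group's counters, in bytes. */
struct memcg_model {
    unsigned long mem_usage;
    unsigned long mem_limit;
    unsigned long memsw_usage;
    unsigned long memsw_limit;
};

/* map: charge page and memsw; fail if either limit would be exceeded. */
static bool model_charge(struct memcg_model *cg)
{
    if (cg->mem_usage + MODEL_PAGE_SIZE > cg->mem_limit)
        return false;
    if (cg->memsw_usage + MODEL_PAGE_SIZE > cg->memsw_limit)
        return false;
    cg->mem_usage += MODEL_PAGE_SIZE;
    cg->memsw_usage += MODEL_PAGE_SIZE;
    return true;
}

/* unmap of a page that is not SwapCache: uncharge page and memsw. */
static void model_unmap(struct memcg_model *cg)
{
    assert(cg->mem_usage >= MODEL_PAGE_SIZE);
    assert(cg->memsw_usage >= MODEL_PAGE_SIZE);
    cg->mem_usage -= MODEL_PAGE_SIZE;
    cg->memsw_usage -= MODEL_PAGE_SIZE;
}

/* swap-out: uncharge page only and record the owner for the swap entry. */
static void model_swap_out(struct memcg_model *cg, struct memcg_model **record)
{
    assert(cg->mem_usage >= MODEL_PAGE_SIZE);
    cg->mem_usage -= MODEL_PAGE_SIZE;
    *record = cg;
}

/* swap-in: charge page and memsw, clear the record, drop its memsw charge. */
static bool model_swap_in(struct memcg_model *cg, struct memcg_model **record)
{
    if (!model_charge(cg))
        return false;
    if (*record) {
        (*record)->memsw_usage -= MODEL_PAGE_SIZE;
        *record = NULL;
    }
    return true;
}

/* swap-free: if the entry still has an owner, uncharge memsw by one page. */
static void model_swap_free(struct memcg_model **record)
{
    if (*record) {
        (*record)->memsw_usage -= MODEL_PAGE_SIZE;
        *record = NULL;
    }
}
```

In this model a page that goes through map, swap-out and swap-in ends with
exactly one page charged to both counters, which is the double counting the
record in swap_cgroup is meant to prevent.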

Some people work in never-swap environments and consider swap something
bad. For such people, this mem+swap controller extension is just
overhead.  This overhead can be avoided by a config or boot option.
(See Kconfig. Details are not in this patch.)

TODO:
 - maybe more optimization can be done in the swap-in path (but not very
   safely).  For now we just do simple accounting at this stage.

[nishimura@mxp.nes.nec.co.jp: make resize limit hold mutex]
[hugh@veritas.com: memswap controller core swapcache fixes]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-08 08:31:05 -08:00
..
acpi trivial: fix an -> a typos in documentation and comments 2009-01-06 11:28:07 +01:00
asm-arm
asm-frv frv: introduce asm/swab.h 2009-01-06 18:10:28 -08:00
asm-generic remove linux/hardirq.h from asm-generic/local.h 2009-01-06 15:59:13 -08:00
asm-h8300
asm-m32r m32r: introduce asm/swab.h 2009-01-06 18:10:28 -08:00
asm-m68k m68k: introduce asm/swab.h 2009-01-06 18:10:27 -08:00
asm-mn10300 mn10300: introduce asm/swab.h 2009-01-06 18:10:29 -08:00
crypto crypto: aes - Precompute tables 2008-12-25 11:05:13 +11:00
drm drm: Add a debug node for vblank state. 2008-12-29 17:47:27 +10:00
keys KEYS: Disperse linux/key_ui.h 2008-11-14 10:39:13 +11:00
linux memcg: mem+swap controller core 2009-01-08 08:31:05 -08:00
math-emu
media V4L/DVB (10141): v4l2: debugging API changed to match against driver name instead of ID. 2009-01-02 17:11:52 -02:00
mtd trivial: fix then -> than typos in comments and documentation 2009-01-06 11:28:06 +01:00
net wimax: headers for kernel API and user space interaction 2009-01-07 10:00:16 -08:00
pcmcia
rdma
rxrpc
scsi [SCSI] fcoe: Fibre Channel over Ethernet 2008-12-29 11:24:33 -06:00
sound Merge branch 'topic/asoc' into for-linus 2009-01-06 09:48:51 +01:00
trace sched, trace: update trace_sched_wakeup() 2008-12-25 13:10:21 +01:00
video video: sh_mobile_lcdcfb deferred io support 2008-12-22 18:44:48 +09:00
xen xen: add xenfs to allow usermode <-> Xen interaction 2009-01-08 08:30:59 -08:00
Kbuild