linux/fs
Johannes Weiner 6b4f7799c6 mm: vmscan: invoke slab shrinkers from shrink_zone()
The slab shrinkers are currently invoked from the zonelist walkers in
kswapd, direct reclaim, and zone reclaim, all of which roughly gauge the
eligible LRU pages and assemble a nodemask to pass to NUMA-aware
shrinkers, which then again have to walk over the nodemask.  This is
redundant code, extra runtime work, and fairly inaccurate when it comes to
the estimation of actually scannable LRU pages.  The code duplication will
only get worse when making the shrinkers cgroup-aware and requiring them
to have out-of-band cgroup hierarchy walks as well.

Instead, invoke the shrinkers from shrink_zone(), which is where all
reclaimers end up, to avoid this duplication.

Take the count for eligible LRU pages out of get_scan_count(), which
considers many more factors than just the availability of swap space, like
zone_reclaimable_pages() currently does.  Accumulate the number over all
visited lruvecs to get the per-zone value.

Some nodes have multiple zones due to memory addressing restrictions.  To
avoid putting too much pressure on the shrinkers, only invoke them once
for each such node, using the class zone of the allocation as the pivot
zone.

For now, this integrates the slab shrinking better into the reclaim logic
and gets rid of duplicative invocations from kswapd, direct reclaim, and
zone reclaim.  It also prepares for cgroup-awareness, allowing
memcg-capable shrinkers to be added at the lruvec level without much
duplication of both code and runtime work.

This changes kswapd behavior, which used to invoke the shrinkers for each
zone, but with scan ratios gathered from the entire node, resulting in
meaningless pressure quantities on multi-zone nodes.

Zone reclaim behavior also changes.  It used to shrink slabs until the
same amount of pages were shrunk as were reclaimed from the LRUs.  Now it
merely invokes the shrinkers once with the zone's scan ratio, which makes
the shrinkers go easier on caches that implement aging and would prefer
feeding back pressure from recently used slab objects to unused LRU pages.

[vdavydov@parallels.com: assure class zone is populated]
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13 12:42:48 -08:00
..
9p assorted conversions to %p[dD] 2014-11-19 13:01:20 -05:00
adfs adfs: add __printf verification, fix format/argument mismatches 2014-08-08 15:57:24 -07:00
affs assorted conversions to %p[dD] 2014-11-19 13:01:20 -05:00
afs Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-12-11 14:27:06 -08:00
autofs4 assorted conversions to %p[dD] 2014-11-19 13:01:20 -05:00
befs assorted conversions to %p[dD] 2014-11-19 13:01:20 -05:00
bfs fs/bfs: use bfs prefix for dump_imap 2014-08-08 15:57:24 -07:00
btrfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2014-12-12 11:15:23 -08:00
cachefiles assorted conversions to %p[dD] 2014-11-19 13:01:20 -05:00
ceph Merge branch 'iov_iter' into for-next 2014-12-08 20:39:29 -05:00
cifs Merge branch 'akpm' (patchbomb from Andrew) 2014-12-10 18:34:42 -08:00
coda move d_rcu from overlapping d_child to overlapping d_alias 2014-11-03 15:20:29 -05:00
configfs assorted conversions to %p[dD] 2014-11-19 13:01:20 -05:00
cramfs fs/cramfs/inode.c: use linux/uaccess.h 2014-08-08 15:57:25 -07:00
debugfs Merge tag 'trace-seq-file-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into for-next 2014-11-19 13:02:53 -05:00
devpts
dlm Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-12-10 16:10:49 -08:00
ecryptfs kill f_dentry uses 2014-11-19 13:01:25 -05:00
efivarfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-12-10 16:10:49 -08:00
efs fs/efs/namei.c: return is not a function 2014-08-08 15:57:18 -07:00
exofs Boaz Harrosh - Fix broken email address 2014-10-19 20:22:32 +03:00
exportfs move d_rcu from overlapping d_child to overlapping d_alias 2014-11-03 15:20:29 -05:00
ext2 ext2: Convert to private i_dquot field 2014-11-10 10:06:10 +01:00
ext3 ext3: Convert to private i_dquot field 2014-11-10 10:06:10 +01:00
ext4 Lots of bugs fixes, including Zheng and Jan's extent status shrinker 2014-12-12 09:28:03 -08:00
f2fs f2fs: avoid to ra unneeded blocks in recover flow 2014-12-08 14:19:09 -08:00
fat Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-12-10 16:10:49 -08:00
freevxfs
fscache fs/fscache/object-list.c: use __seq_open_private() 2014-10-13 17:52:21 +01:00
fuse assorted conversions to %p[dD] 2014-11-19 13:01:20 -05:00
gfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-12-10 16:10:49 -08:00
hfs fs/hfs/catalog.c: fix comparison bug in hfs_cat_keycmp 2014-12-10 17:41:16 -08:00
hfsplus
hostfs hostfs: support rename flags 2014-08-07 14:40:09 -04:00
hpfs fs/hpfs/dnode.c: fix suspect code indent 2014-08-08 15:57:22 -07:00
hppfs vfs: make first argument of dir_context.actor typed 2014-10-31 17:48:54 -04:00
hugetlbfs mm: convert i_mmap_mutex to rwsem 2014-12-13 12:42:45 -08:00
isofs isofs: avoid unused function warning 2014-11-19 13:09:37 -05:00
jbd jbd: Deletion of an unnecessary check before the function call "iput" 2014-11-18 10:15:29 +01:00
jbd2 Lots of bugs fixes, including Zheng and Jan's extent status shrinker 2014-12-12 09:28:03 -08:00
jffs2 [jffs2] kill wbuf_queued/wbuf_dwork_lock 2014-10-09 02:39:01 -04:00
jfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-12-10 16:10:49 -08:00
kernfs switch d_materialise_unique() users to d_splice_alias() 2014-11-19 13:01:20 -05:00
lockd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-12-10 16:10:49 -08:00
logfs fs/logfs/readwrite.c: kernel-doc warning fixes 2014-08-06 18:01:12 -07:00
minix minix zmap block counts calculation fix 2014-08-08 15:57:20 -07:00
ncpfs Merge branch 'akpm' (patchbomb from Andrew) 2014-12-10 18:34:42 -08:00
nfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-12-10 16:10:49 -08:00
nfs_common lockd: move lockd's grace period handling into its own module 2014-09-17 16:33:11 -04:00
nfsd Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-12-11 14:27:06 -08:00
nilfs2 nilfs2: fix the nilfs_iget() vs. nilfs_new_inode() races 2014-12-10 17:41:16 -08:00
nls
notify Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-12-10 16:10:49 -08:00
ntfs assorted conversions to %p[dD] 2014-11-19 13:01:20 -05:00
ocfs2 Merge branch 'akpm' (patchbomb from Andrew) 2014-12-10 18:34:42 -08:00
omfs FS/OMFS: block number sanity check during fill_super operation 2014-10-14 02:18:22 +02:00
openpromfs
overlayfs Merge branch 'iov_iter' into for-next 2014-12-08 20:39:29 -05:00
proc Merge branch 'akpm' (patchbomb from Andrew) 2014-12-10 18:34:42 -08:00
pstore pstore-ram: Allow optional mapping with pgprot_noncached 2014-12-11 13:38:31 -08:00
qnx4
qnx6 fs/qnx6: update debugging to current functions 2014-08-08 15:57:26 -07:00
quota vfs: Remove i_dquot field from inode 2014-11-10 10:06:18 +01:00
ramfs fs/ramfs/file-nommu.c: replace count*size kzalloc by kcalloc 2014-08-08 15:57:18 -07:00
reiserfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2014-12-12 10:08:06 -08:00
romfs fs/romfs/super.c: add blank line after declarations 2014-08-08 15:57:25 -07:00
squashfs fs/squashfs/super.c: logging cleanup 2014-08-06 18:01:13 -07:00
sysfs
sysv
ubifs UBIFS: fix a couple bugs in UBIFS xattr length calculation 2014-11-07 12:32:22 +02:00
udf udf: One function call less in udf_fill_super() after error detection 2014-11-19 21:56:06 +01:00
ufs fs/ufs/balloc.c: remove unused variable 2014-10-14 02:18:20 +02:00
xfs xfs: update for 3.19-rc1 2014-12-12 09:48:17 -08:00
aio.c Merge git://git.kvack.org/~bcrl/aio-fixes 2014-11-25 18:55:44 -08:00
anon_inodes.c
attr.c
bad_inode.c bad_inode: add ->rename2() 2014-08-07 14:40:09 -04:00
binfmt_aout.c assorted conversions to %p[dD] 2014-11-19 13:01:20 -05:00
binfmt_elf_fdpic.c handle suicide on late failure exits in execve() in search_binary_handler() 2014-10-09 02:39:00 -04:00
binfmt_elf.c Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2014-12-11 17:56:37 -08:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c fs/binfmt_misc.c: use GFP_KERNEL instead of GFP_USER 2014-12-10 17:41:12 -08:00
binfmt_script.c
binfmt_som.c
block_dev.c fs: add freeze_super/thaw_super fs hooks 2014-11-17 10:35:17 +00:00
buffer.c fs: clarify rate limit suppressed buffer I/O errors 2014-10-21 13:55:11 -06:00
char_dev.c fs/char_dev.c: remove pointless assignment from __register_chrdev_region() 2014-12-10 17:41:04 -08:00
compat_binfmt_elf.c
compat_ioctl.c
compat.c vfs: make first argument of dir_context.actor typed 2014-10-31 17:48:54 -04:00
coredump.c coredump: add %i/%I in core_pattern to report the tid of the crashed thread 2014-10-14 02:18:21 +02:00
dcache.c Merge branch 'iov_iter' into for-next 2014-12-08 20:39:29 -05:00
dcookies.c
direct-io.c fuse: honour max_read and max_write in direct_io mode 2014-09-26 21:16:51 -04:00
drop_caches.c mm: vmscan: invoke slab shrinkers from shrink_zone() 2014-12-13 12:42:48 -08:00
eventfd.c fs: Convert show_fdinfo functions to void 2014-11-05 14:13:23 -05:00
eventpoll.c fs: Convert show_fdinfo functions to void 2014-11-05 14:13:23 -05:00
exec.c fs: Do not include mpx.h in exec.c 2014-11-18 02:01:40 +01:00
fcntl.c security: make security_file_set_fowner, f_setown and __f_setown void return 2014-09-09 16:01:36 -04:00
fhandle.c
file_table.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-10-13 11:28:42 +02:00
file.c fs/file.c: replace get_unused_fd() with get_unused_fd_flags(0) 2014-12-10 17:41:10 -08:00
filesystems.c
fs_pin.c make fs/{namespace,super}.c forget about acct.h 2014-08-07 14:40:09 -04:00
fs_struct.c
fs-writeback.c
inode.c mm: convert i_mmap_mutex to rwsem 2014-12-13 12:42:45 -08:00
internal.h vfs: export __inode_permission() to modules 2014-10-24 00:14:35 +02:00
ioctl.c fs: add freeze_super/thaw_super fs hooks 2014-11-17 10:35:17 +00:00
Kconfig overlay filesystem 2014-10-24 00:14:38 +02:00
Kconfig.binfmt binfmt_elf: allow arch code to examine PT_LOPROC ... PT_HIPROC headers 2014-11-24 07:45:02 +01:00
libfs.c move d_rcu from overlapping d_child to overlapping d_alias 2014-11-03 15:20:29 -05:00
locks.c locks: flock_make_lock should return a struct file_lock (or PTR_ERR) 2014-10-07 14:06:13 -04:00
Makefile ovl: rename filesystem type to "overlay" 2014-11-20 16:39:59 +01:00
mbcache.c
mount.h vfs: Add a function to lazily unmount all mounts from any dentry. 2014-10-09 02:38:55 -04:00
mpage.c vfs: guard end of device for mpage interface 2014-10-09 22:25:53 -04:00
namei.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-11-02 10:28:43 -08:00
namespace.c vfs: introduce clone_private_mount() 2014-10-24 00:14:36 +02:00
no-block.c
open.c new helper: audit_file() 2014-11-19 13:01:26 -05:00
pipe.c
pnode.c get rid of propagate_umount() mistakenly treating slaves as busy. 2014-08-30 18:31:41 -04:00
pnode.h
posix_acl.c
proc_namespace.c
read_write.c cachefiles_write_page(): switch to __kernel_write() 2014-10-09 02:39:05 -04:00
readdir.c vfs: make first argument of dir_context.actor typed 2014-10-31 17:48:54 -04:00
select.c
seq_file.c seq_file: Rename seq_overflow() to seq_has_overflowed() and make public 2014-10-29 20:26:06 -04:00
signalfd.c fs: Convert show_fdinfo functions to void 2014-11-05 14:13:23 -05:00
splice.c vfs: export do_splice_direct() to modules 2014-10-24 00:14:35 +02:00
stack.c fs: fix comment for 'CONFIG_LBADF' 2014-08-26 09:35:56 +02:00
stat.c
statfs.c
super.c vfs: Remove i_dquot field from inode 2014-11-10 10:06:18 +01:00
sync.c kill f_dentry uses 2014-11-19 13:01:25 -05:00
timerfd.c fs: Convert show_fdinfo functions to void 2014-11-05 14:13:23 -05:00
utimes.c
xattr.c new helper: audit_file() 2014-11-19 13:01:26 -05:00