linux/mm
Nhat Pham 0bdf0efa18 zswap: do not shrink if cgroup may not zswap
Before storing a page, zswap first checks if the number of stored pages
exceeds the limit specified by memory.zswap.max, for each cgroup in the
hierarchy.  If this limit is reached or exceeded, then zswap shrinking is
triggered and short-circuits the store attempt.

However, since the zswap's LRU is not memcg-aware, this can create the
following pathological behavior: the cgroup whose zswap limit is 0 will
evict pages from other cgroups continually, without lowering its own zswap
usage.  This means the shrinking will continue until the need for swap
ceases or the pool becomes empty.

As a result of this, we observe a disproportionate amount of zswap
writeback and a perpetually small zswap pool in our experiments, even
though the pool limit is never hit.

More generally, a cgroup might unnecessarily evict pages from other
cgroups before we drive the memcg back below its limit.

This patch fixes the issue by rejecting zswap store attempt without
shrinking the pool when obj_cgroup_may_zswap() returns false.

[akpm@linux-foundation.org: fix return of unintialized value]
[akpm@linux-foundation.org: s/ENOSPC/ENOMEM/]
Link: https://lkml.kernel.org/r/20230530222440.2777700-1-nphamcs@gmail.com
Link: https://lkml.kernel.org/r/20230530232435.3097106-1-nphamcs@gmail.com
Fixes: f4840ccfca ("zswap: memcg accounting")
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-12 11:31:52 -07:00
..
damon mm/damon/core: fix divide error in damon_nr_accesses_to_accesses_bp() 2023-06-12 11:31:52 -07:00
kasan kasan: hw_tags: avoid invalid virt_to_page() 2023-05-02 17:23:27 -07:00
kfence mm: kfence: fix false positives on big endian 2023-05-17 15:24:33 -07:00
kmsan printk: export console trace point for kcsan/kasan/kfence/kmsan 2023-04-18 16:30:11 -07:00
backing-dev.c - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
balloon_compaction.c mm: Convert all PageMovable users to movable_operations 2022-08-02 12:34:03 -04:00
bootmem_info.c bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem 2022-08-28 14:02:45 -07:00
cma_debug.c mm/cma_debug: show complete cma name in debugfs directories 2022-09-11 20:25:50 -07:00
cma_sysfs.c mm: cma: make kobj_type structure constant 2023-03-28 16:20:06 -07:00
cma.c mm: move most of core MM initialization to mm/mm_init.c 2023-04-05 19:42:52 -07:00
cma.h
compaction.c - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm, treewide: redefine MAX_ORDER sanely 2023-04-05 19:42:46 -07:00
debug.c mm/debug: use %pGt to display page_type in dump_page() 2023-03-28 16:20:09 -07:00
dmapool_test.c dmapool: add alloc/free performance test 2023-04-05 19:42:38 -07:00
dmapool.c dmapool: link blocks across pages 2023-05-06 10:33:38 -07:00
early_ioremap.c
fadvise.c mm: support POSIX_FADV_NOREUSE 2023-01-18 17:12:57 -08:00
failslab.c mm: fix unexpected changes to {failslab|fail_page_alloc}.attr 2022-11-22 18:50:44 -08:00
filemap.c page cache: fix page_cache_next/prev_miss off by one 2023-06-12 11:31:52 -07:00
folio-compat.c - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
frontswap.c frontswap: don't call ->init if no ops are registered 2022-09-26 12:14:34 -07:00
gup_test.c mm/gup_test: fix ioctl fail for compat task 2023-06-12 11:31:51 -07:00
gup_test.h mm/gup_test: start/stop/read functionality for PIN LONGTERM test 2022-11-08 17:37:15 -08:00
gup.c x86-64: make access_ok() independent of LAM 2023-05-03 10:37:22 -07:00
highmem.c highmem: fix kmap_to_page() for kmap_local_page() addresses 2022-10-12 18:51:51 -07:00
hmm.c mm/hugetlb: make walk_hugetlb_range() safe to pmd unshare 2023-01-18 17:12:39 -08:00
huge_memory.c mm: don't check VMA write permissions if the PTE/PMD indicates write permissions 2023-04-21 14:52:03 -07:00
hugetlb_cgroup.c mm/hugetlb: increase use of folios in alloc_huge_page() 2023-02-13 15:54:27 -08:00
hugetlb_vmemmap.c mm, page_alloc: use check_pages_enabled static key to check tail pages 2023-04-18 16:29:54 -07:00
hugetlb_vmemmap.h mm: hugetlb_vmemmap: improve hugetlb_vmemmap code readability 2022-08-08 18:06:43 -07:00
hugetlb.c - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
hwpoison-inject.c mm/hwpoison: add __init/__exit annotations to module init/exit funcs 2022-10-03 14:03:05 -07:00
init-mm.c IOMMU Updates for Linux 6.4 2023-04-30 13:00:38 -07:00
internal.h mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list() 2023-04-21 14:52:02 -07:00
interval_tree.c
io-mapping.c
ioremap.c mm: ioremap: Add ioremap/iounmap_allowed() 2022-06-27 12:22:31 +01:00
Kconfig - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
Kconfig.debug mm: change per-VMA lock statistics to be disabled by default 2023-05-02 17:23:28 -07:00
khugepaged.c mm/khugepaged: fix conflicting mods to collapse_file() 2023-04-27 13:42:16 -07:00
kmemleak.c lib/stackdepot, mm: rename stack_depot_want_early_init 2023-02-16 20:43:49 -08:00
ksm.c mm/ksm: move disabling KSM from s390/gmap code to KSM code 2023-05-02 17:21:50 -07:00
list_lru.c mm: kmem: make mem_cgroup_from_obj() vmalloc()-safe 2022-06-16 19:48:31 -07:00
maccess.c mm: Fix copy_from_user_nofault(). 2023-04-12 17:36:23 -07:00
madvise.c Add support for new Linear Address Masking CPU feature. This is similar 2023-04-28 09:43:49 -07:00
Makefile - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
mapping_dirty_helpers.c mm/mmu_notifier: remove unused mmu_notifier_range_update_to_read_only export 2023-02-02 22:32:54 -08:00
memblock.c mm: avoid passing 0 to __ffs() 2023-04-18 16:29:42 -07:00
memcontrol.c memcg: page_cgroup_ino() get memcg from the page's folio 2023-04-18 16:30:09 -07:00
memfd.c memfd: pass argument of memfd_fcntl as int 2023-04-18 16:30:11 -07:00
memory_hotplug.c mm: avoid passing 0 to __ffs() 2023-04-18 16:29:42 -07:00
memory-failure.c - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
memory-tiers.c memory tier: release the new_memtier in find_create_memory_tier() 2023-02-09 16:51:40 -08:00
memory.c mm: do not increment pgfault stats when page fault handler retries 2023-04-21 14:52:04 -07:00
mempolicy.c mm/mempolicy: correctly update prev when policy is equal on mbind 2023-05-02 17:23:27 -07:00
mempool.c mempool: do not use ksize() for poisoning 2022-11-30 15:58:41 -08:00
memremap.c mm/memremap.c: fix outdated comment in devm_memremap_pages 2023-02-09 16:51:46 -08:00
memtest.c mm/memtest: add results of early memtest to /proc/meminfo 2023-04-05 19:42:55 -07:00
migrate_device.c mm: change to return bool for isolate_lru_page() 2023-02-20 12:46:17 -08:00
migrate.c Add support for new Linear Address Masking CPU feature. This is similar 2023-04-28 09:43:49 -07:00
mincore.c mm: return an ERR_PTR from __filemap_get_folio 2023-04-05 19:42:42 -07:00
mlock.c mm: mlock: use folios_put() in mlock_folio_batch() 2023-04-18 16:29:53 -07:00
mm_init.c mm/vmemmap/devdax: fix kernel crash when probing devdax devices 2023-04-18 16:30:09 -07:00
mm_slot.h mm: introduce common struct mm_slot 2022-10-03 14:02:43 -07:00
mmap_lock.c
mmap.c mm/mmap/vma_merge: always check invariants 2023-05-06 10:10:07 -07:00
mmu_gather.c mm: prefer xxx_page() alloc/free functions for order-0 pages 2023-03-28 16:20:16 -07:00
mmu_notifier.c mm/mmu_notifier: remove unused mmu_notifier_range_update_to_read_only export 2023-02-02 22:32:54 -08:00
mmzone.c mm: multi-gen LRU: groundwork 2022-09-26 19:46:09 -07:00
mprotect.c mm/userfaultfd: don't consider uffd-wp bit of writable migration entries 2023-04-18 16:29:53 -07:00
mremap.c mm/mremap: write-lock VMA while remapping it to a new address range 2023-04-05 20:02:58 -07:00
msync.c mm/msync: use vma_find() instead of vma linked list 2022-09-26 19:46:25 -07:00
nommu.c mm: vmalloc: convert vread() to vread_iter() 2023-04-05 19:42:57 -07:00
oom_kill.c mm/mmu_notifier: remove unused mmu_notifier_range_update_to_read_only export 2023-02-02 22:32:54 -08:00
page_alloc.c - Some DAMON cleanups from Kefeng Wang 2023-05-04 13:09:43 -07:00
page_counter.c mm: page_counter: remove unneeded atomic ops for low/min 2022-09-11 20:26:01 -07:00
page_ext.c mm/page_ext: init page_ext early if there are no deferred struct pages 2023-02-02 22:33:22 -08:00
page_idle.c mm: page_idle: convert page idle to use a folio 2023-01-18 17:12:52 -08:00
page_io.c - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
page_isolation.c mm, treewide: redefine MAX_ORDER sanely 2023-04-05 19:42:46 -07:00
page_owner.c mm, treewide: redefine MAX_ORDER sanely 2023-04-05 19:42:46 -07:00
page_poison.c
page_reporting.c mm, treewide: redefine MAX_ORDER sanely 2023-04-05 19:42:46 -07:00
page_reporting.h
page_table_check.c mm/page_ext: do not allocate space for page_ext->flags if not needed 2023-02-02 22:33:11 -08:00
page_vma_mapped.c mm/hugetlb: introduce hugetlb_walk() 2023-01-18 17:12:39 -08:00
page-writeback.c mm,jfs: move write_one_page/folio_write_one to jfs 2023-03-28 16:20:14 -07:00
pagewalk.c mm/hugetlb: introduce hugetlb_walk() 2023-01-18 17:12:39 -08:00
percpu-internal.h mm: percpu: fix incorrect size in pcpu_obj_full_size() 2023-02-16 20:43:55 -08:00
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c mm: memcontrol: rename memcg_kmem_enabled() 2023-02-16 20:43:56 -08:00
pgalloc-track.h
pgtable-generic.c mm: add PTE pointer parameter to flush_tlb_fix_spurious_fault() 2023-03-28 16:20:12 -07:00
process_vm_access.c use less confusing names for iov_iter direction initializers 2022-11-25 13:01:55 -05:00
ptdump.c mm: pagewalk: Fix race between unmap and page walker 2022-09-03 10:13:13 -07:00
readahead.c readahead: convert readahead_expand() to use a folio 2023-02-02 22:33:21 -08:00
rmap.c mm,unmap: avoid flushing TLB in batch if PTE is inaccessible 2023-04-27 13:42:16 -07:00
rodata_test.c mm/rodata_test: use PAGE_ALIGNED() helper 2022-10-03 14:03:05 -07:00
secretmem.c - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
shmem.c - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
shrinker_debug.c mm: shrinkers: fix race condition on debugfs cleanup 2023-05-17 15:24:33 -07:00
shuffle.c mm/shuffle: convert module_param_call to module_param_cb 2022-10-03 14:03:07 -07:00
shuffle.h mm, treewide: redefine MAX_ORDER sanely 2023-04-05 19:42:46 -07:00
slab_common.c mm/slab: document kfree() as allowed for kmem_cache_alloc() objects 2023-03-29 10:35:41 +02:00
slab.c mm: vmscan: refactor updating current->reclaim_state 2023-04-18 16:30:10 -07:00
slab.h - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
slub.c - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
sparse-vmemmap.c mm/vmemmap/devdax: fix kernel crash when probing devdax devices 2023-04-18 16:30:09 -07:00
sparse.c sparse: remove unnecessary 0 values from rc 2023-04-21 14:52:05 -07:00
swap_cgroup.c mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled 2022-10-03 14:03:36 -07:00
swap_slots.c mm/swap: convert put_swap_page() to put_swap_folio() 2022-10-03 14:02:46 -07:00
swap_state.c mm: return an ERR_PTR from __filemap_get_folio 2023-04-05 19:42:42 -07:00
swap.c mm: swap: fix performance regression on sparsetruncate-tiny 2023-04-16 10:41:24 -07:00
swap.h mm: remove the __swap_writepage return value 2023-02-02 22:33:33 -08:00
swapfile.c sync mm-stable with mm-hotfixes-stable to pick up depended-upon upstream changes 2023-04-16 12:31:58 -07:00
truncate.c mm: return an ERR_PTR from __filemap_get_folio 2023-04-05 19:42:42 -07:00
usercopy.c mm: Fix copy_from_user_nofault(). 2023-04-12 17:36:23 -07:00
userfaultfd.c userfaultfd: use helper function range_in_vma() 2023-04-21 14:52:02 -07:00
util.c mm: uninline kstrdup() 2023-04-08 13:45:37 -07:00
vmalloc.c mm: vmalloc: rename addr_to_vb_xarray() function 2023-04-18 16:29:48 -07:00
vmpressure.c
vmscan.c mm: shrinkers: fix race condition on debugfs cleanup 2023-05-17 15:24:33 -07:00
vmstat.c mm: introduce per-VMA lock statistics 2023-04-05 20:03:01 -07:00
workingset.c mm: workingset: update description of the source file 2023-04-18 16:30:11 -07:00
z3fold.c mm: remove PageMovable export 2023-01-18 17:12:57 -08:00
zbud.c zpool: clean out dead code 2022-12-11 18:12:10 -08:00
zpool.c zpool: remove MODULE_LICENSE in non-modules 2023-04-13 13:13:54 -07:00
zsmalloc.c zsmalloc: move LRU update from zs_map_object() to zs_malloc() 2023-05-17 15:24:33 -07:00
zswap.c zswap: do not shrink if cgroup may not zswap 2023-06-12 11:31:52 -07:00