linux/mm
Vlastimil Babka afe0c26d19 mm, slub: move slub_debug static key enabling outside slab_mutex
Paul E.  McKenney reported [1] that commit 1f0723a4c0 ("mm, slub: enable
slub_debug static key when creating cache with explicit debug flags")
results in the lockdep complaint:

 ======================================================
 WARNING: possible circular locking dependency detected
 5.12.0+ #15 Not tainted
 ------------------------------------------------------
 rcu_torture_sta/109 is trying to acquire lock:
 ffffffff96063cd0 (cpu_hotplug_lock){++++}-{0:0}, at: static_key_enable+0x9/0x20

 but task is already holding lock:
 ffffffff96173c28 (slab_mutex){+.+.}-{3:3}, at: kmem_cache_create_usercopy+0x2d/0x250

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (slab_mutex){+.+.}-{3:3}:
        lock_acquire+0xb9/0x3a0
        __mutex_lock+0x8d/0x920
        slub_cpu_dead+0x15/0xf0
        cpuhp_invoke_callback+0x17a/0x7c0
        cpuhp_invoke_callback_range+0x3b/0x80
        _cpu_down+0xdf/0x2a0
        cpu_down+0x2c/0x50
        device_offline+0x82/0xb0
        remove_cpu+0x1a/0x30
        torture_offline+0x80/0x140
        torture_onoff+0x147/0x260
        kthread+0x10a/0x140
        ret_from_fork+0x22/0x30

 -> #0 (cpu_hotplug_lock){++++}-{0:0}:
        check_prev_add+0x8f/0xbf0
        __lock_acquire+0x13f0/0x1d80
        lock_acquire+0xb9/0x3a0
        cpus_read_lock+0x21/0xa0
        static_key_enable+0x9/0x20
        __kmem_cache_create+0x38d/0x430
        kmem_cache_create_usercopy+0x146/0x250
        kmem_cache_create+0xd/0x10
        rcu_torture_stats+0x79/0x280
        kthread+0x10a/0x140
        ret_from_fork+0x22/0x30

 other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(slab_mutex);
                                lock(cpu_hotplug_lock);
                                lock(slab_mutex);
   lock(cpu_hotplug_lock);

  *** DEADLOCK ***

 1 lock held by rcu_torture_sta/109:
  #0: ffffffff96173c28 (slab_mutex){+.+.}-{3:3}, at: kmem_cache_create_usercopy+0x2d/0x250

 stack backtrace:
 CPU: 3 PID: 109 Comm: rcu_torture_sta Not tainted 5.12.0+ #15
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
 Call Trace:
  dump_stack+0x6d/0x89
  check_noncircular+0xfe/0x110
  ? lock_is_held_type+0x98/0x110
  check_prev_add+0x8f/0xbf0
  __lock_acquire+0x13f0/0x1d80
  lock_acquire+0xb9/0x3a0
  ? static_key_enable+0x9/0x20
  ? mark_held_locks+0x49/0x70
  cpus_read_lock+0x21/0xa0
  ? static_key_enable+0x9/0x20
  static_key_enable+0x9/0x20
  __kmem_cache_create+0x38d/0x430
  kmem_cache_create_usercopy+0x146/0x250
  ? rcu_torture_stats_print+0xd0/0xd0
  kmem_cache_create+0xd/0x10
  rcu_torture_stats+0x79/0x280
  ? rcu_torture_stats_print+0xd0/0xd0
  kthread+0x10a/0x140
  ? kthread_park+0x80/0x80
  ret_from_fork+0x22/0x30

This is because there's one order of locking from the hotplug callbacks:

lock(cpu_hotplug_lock); // from hotplug machinery itself
lock(slab_mutex); // in e.g. slab_mem_going_offline_callback()

And commit 1f0723a4c0 made the reverse sequence possible:
lock(slab_mutex); // in kmem_cache_create_usercopy()
lock(cpu_hotplug_lock); // kmem_cache_open() -> static_key_enable()

The simplest fix is to move static_key_enable() to a place before slab_mutex is
taken. That means kmem_cache_create_usercopy() in mm/slab_common.c which is not
ideal for SLUB-specific code, but the #ifdef CONFIG_SLUB_DEBUG makes it
at least self-contained and obvious.

[1] https://lore.kernel.org/lkml/20210502171827.GA3670492@paulmck-ThinkPad-P17-Gen-1/

Link: https://lkml.kernel.org/r/20210504120019.26791-1-vbabka@suse.cz
Fixes: 1f0723a4c0 ("mm, slub: enable slub_debug static key when creating cache with explicit debug flags")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-14 19:41:32 -07:00
..
kasan mm: fix typos in comments 2021-05-07 00:26:35 -07:00
kfence mm: fix typos in comments 2021-05-07 00:26:35 -07:00
backing-dev.c mm/backing-dev.c: use might_alloc() 2021-02-26 09:41:01 -08:00
balloon_compaction.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
cleancache.c
cma_debug.c mm/cma: change cma mutex to irq safe spinlock 2021-05-05 11:27:21 -07:00
cma_sysfs.c mm: cma: support sysfs 2021-05-05 11:27:24 -07:00
cma.c mm: use proper type for cma_[alloc|release] 2021-05-05 11:27:24 -07:00
cma.h mm: cma: support sysfs 2021-05-05 11:27:24 -07:00
compaction.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm: HUGE_VMAP arch support cleanup 2021-04-30 11:20:40 -07:00
debug.c mm/debug: improve memcg debugging 2021-02-24 13:38:27 -08:00
dmapool.c mm/dmapool: switch from strlcpy to strscpy 2021-04-30 11:20:39 -07:00
early_ioremap.c mm/early_ioremap.c: use __func__ instead of function name 2021-02-26 09:41:02 -08:00
fadvise.c mm, fadvise: improve the expensive remote LRU cache draining after FADV_DONTNEED 2020-10-13 18:38:29 -07:00
failslab.c
filemap.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
frontswap.c mm/mempool: minor coding style tweaks 2021-05-05 11:27:27 -07:00
gup_test.c selftests/vm: gup_test: test faulting in kernel, and verify pinnable pages 2021-05-05 11:27:26 -07:00
gup_test.h selftests/vm: gup_test: fix test flag 2021-05-05 11:27:26 -07:00
gup.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
highmem.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
hmm.c mm: do page fault accounting in handle_mm_fault 2020-08-12 10:58:02 -07:00
huge_memory.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
hugetlb_cgroup.c hugetlb: make free_huge_page irq safe 2021-05-05 11:27:22 -07:00
hugetlb.c mm/hugetlb: fix cow where page writtable in child 2021-05-14 19:41:32 -07:00
hwpoison-inject.c mm,hwpoison-inject: don't pin for hwpoison_filter 2020-10-16 11:11:16 -07:00
init-mm.c mm/gup: prevent gup_fast from racing with COW during fork 2020-12-15 12:13:39 -08:00
internal.h mm: fix typos in comments 2021-05-07 00:26:35 -07:00
interval_tree.c mm/interval_tree: add comments to improve code readability 2021-04-30 11:20:38 -07:00
io-mapping.c mm: add a io_mapping_map_user helper 2021-04-30 11:20:39 -07:00
ioremap.c mm: move vmap_range from mm/ioremap.c to mm/vmalloc.c 2021-04-30 11:20:40 -07:00
Kconfig mm,memory_hotplug: allocate memmap from the added memory range 2021-05-05 11:27:26 -07:00
Kconfig.debug mm, page_poison: remove CONFIG_PAGE_POISONING_ZERO 2020-12-15 12:13:46 -08:00
khugepaged.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
kmemleak.c mm/kmemleak.c: fix a typo 2021-04-30 11:20:36 -07:00
ksm.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
list_lru.c mm: vmscan: consolidate shrinker_maps handling code 2021-05-05 11:27:23 -07:00
maccess.c uaccess: add force_uaccess_{begin,end} helpers 2020-08-12 10:57:59 -07:00
madvise.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
Makefile mm,memory_hotplug: add kernel boot option to enable memmap_on_memory 2021-05-05 11:27:27 -07:00
mapping_dirty_helpers.c mm/mapping_dirty_helpers: guard hugepage pud's usage 2021-04-16 16:10:37 -07:00
memblock.c memblock: remove return value of memblock_free_all() 2021-02-22 13:01:23 -08:00
memcontrol.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
memfd.c
memory_hotplug.c mm/mempool: minor coding style tweaks 2021-05-05 11:27:27 -07:00
memory-failure.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
memory.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
mempolicy.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
mempool.c mm/mempool: minor coding style tweaks 2021-05-05 11:27:27 -07:00
memremap.c mm/memremap.c: fix improper SPDX comment style 2021-04-30 11:20:37 -07:00
memtest.c
migrate.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
mincore.c inode: make init and permission helpers idmapped mount aware 2021-01-24 14:27:16 +01:00
mlock.c mm/mempool: minor coding style tweaks 2021-05-05 11:27:27 -07:00
mm_init.c include/linux/page-flags-layout.h: cleanups 2021-04-30 11:20:42 -07:00
mmap_lock.c mm: mmap_lock: add tracepoints around lock acquisition 2020-12-15 12:13:41 -08:00
mmap.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
mmu_gather.c mm: eliminate "expecting prototype" kernel-doc warnings 2021-04-16 16:10:36 -07:00
mmu_notifier.c mm/mmu_notifiers: ensure range_end() is paired with range_start() 2021-03-25 09:22:55 -07:00
mmzone.c mm/lru: replace pgdat lru_lock with lruvec lock 2020-12-15 14:48:04 -08:00
mprotect.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
mremap.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
msync.c mm/msync: exit early when the flags is an MS_ASYNC and start < vm_start 2021-04-30 11:20:37 -07:00
nommu.c mm/vmalloc: remove vwrite() 2021-05-07 00:26:34 -07:00
oom_kill.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
page_alloc.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
page_counter.c mm: page_counter: mitigate consequences of a page_counter underflow 2021-04-30 11:20:38 -07:00
page_ext.c mm: fix some spelling mistakes in comments 2020-12-15 22:46:19 -08:00
page_idle.c mm: page_idle_get_page() does not need lru_lock 2020-12-15 14:48:03 -08:00
page_io.c swap: fix swapfile read/write offset 2021-03-02 17:25:46 -07:00
page_isolation.c mm/page_isolation: do not isolate the max order page 2020-12-15 12:13:45 -08:00
page_owner.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
page_poison.c mm: page_poison: print page info when corruption is caught 2021-04-30 11:20:36 -07:00
page_reporting.c mm/page_reporting: use list_entry_is_head() in page_reporting_cycle() 2021-02-24 13:38:30 -08:00
page_reporting.h mm: introduce include/linux/pgtable.h 2020-06-09 09:39:13 -07:00
page_vma_mapped.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
page-writeback.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
pagewalk.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
percpu-internal.h mm: fix typos in comments 2021-05-07 00:26:35 -07:00
percpu-km.c mm: memcg/percpu: account percpu memory to memory cgroups 2020-08-12 10:57:55 -07:00
percpu-stats.c percpu: make pcpu_nr_empty_pop_pages per chunk type 2021-04-09 13:58:38 +00:00
percpu-vm.c mm/vmalloc: remove unmap_kernel_range 2021-04-30 11:20:40 -07:00
percpu.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
pgalloc-track.h mm: fix typos in comments 2021-05-07 00:26:35 -07:00
pgtable-generic.c mm/pgtable-generic.c: optimize the VM_BUG_ON condition in pmdp_huge_clear_flush() 2021-02-24 13:38:30 -08:00
process_vm_access.c mm/process_vm_access.c: remove duplicate include 2021-05-05 11:27:27 -07:00
ptdump.c mm: ptdump: fix build failure 2021-04-16 16:10:37 -07:00
readahead.c mm: Implement readahead_control pageset expansion 2021-04-23 10:14:29 +01:00
rmap.c mm: fix some typos and code style problems 2021-05-07 00:26:33 -07:00
rodata_test.c mm/rodata_test.c: fix missing function declaration 2020-08-21 09:52:53 -07:00
shmem.c mm/hugetlb: fix F_SEAL_FUTURE_WRITE 2021-05-14 19:41:32 -07:00
shuffle.c mm: eliminate "expecting prototype" kernel-doc warnings 2021-04-16 16:10:36 -07:00
shuffle.h mm/shuffle: remove dynamic reconfiguration 2020-08-07 11:33:29 -07:00
slab_common.c mm, slub: move slub_debug static key enabling outside slab_mutex 2021-05-14 19:41:32 -07:00
slab.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
slab.h kasan, mm: integrate slab init_on_alloc with HW_TAGS 2021-04-30 11:20:41 -07:00
slob.c mm: Don't build mm_dump_obj() on CONFIG_PRINTK=n kernels 2021-03-08 14:18:46 -08:00
slub.c mm, slub: move slub_debug static key enabling outside slab_mutex 2021-05-14 19:41:32 -07:00
sparse-vmemmap.c mm/sparse: only sub-section aligned range would be populated 2020-08-07 11:33:27 -07:00
sparse.c mm/mempool: minor coding style tweaks 2021-05-05 11:27:27 -07:00
swap_cgroup.c mm: memcontrol: make swap tracking an integral part of memory control 2020-06-03 20:09:48 -07:00
swap_slots.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
swap_state.c mm: fix some typos and code style problems 2021-05-07 00:26:33 -07:00
swap.c mm: fix some typos and code style problems 2021-05-07 00:26:33 -07:00
swapfile.c mm/mempool: minor coding style tweaks 2021-05-05 11:27:27 -07:00
truncate.c mm: stop accounting shadow entries 2021-05-05 11:27:19 -07:00
usercopy.c mm/usercopy.c: delete duplicated word 2020-08-12 10:57:58 -07:00
userfaultfd.c userfaultfd: add UFFDIO_CONTINUE ioctl 2021-05-05 11:27:22 -07:00
util.c mm/util.c: fix typo 2021-05-05 11:27:25 -07:00
vmacache.c kernel: better document the use_mm/unuse_mm API contract 2020-06-10 19:14:18 -07:00
vmalloc.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
vmpressure.c mm: vmpressure: use mem_cgroup_is_root API 2020-04-02 09:35:31 -07:00
vmscan.c mm/mempool: minor coding style tweaks 2021-05-05 11:27:27 -07:00
vmstat.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
workingset.c mm: stop accounting shadow entries 2021-05-05 11:27:19 -07:00
z3fold.c mm: fix some typos and code style problems 2021-05-07 00:26:33 -07:00
zbud.c mm: set the sleep_mapped to true for zbud and z3fold 2021-02-26 09:41:01 -08:00
zpool.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
zsmalloc.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
zswap.c mm/zswap.c: switch from strlcpy to strscpy 2021-05-05 11:27:27 -07:00