linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-24 05:02:12 +00:00

Jann Horn eeaa345e12 mm/slub: add missing TID updates on slab deactivation

The fastpath in slab_alloc_node() assumes that c->slab is stable as long as
the TID stays the same. However, two places in __slab_alloc() currently
don't update the TID when deactivating the CPU slab.

If multiple operations race the right way, this could lead to an object
getting lost; or, in an even more unlikely situation, it could even lead to
an object being freed onto the wrong slab's freelist, messing up the
`inuse` counter and eventually causing a page to be freed to the page
allocator while it still contains slab objects.

(I haven't actually tested these cases though, this is just based on
looking at the code. Writing testcases for this stuff seems like it'd be
a pain...)

The race leading to state inconsistency is (all operations on the same CPU
and kmem_cache):

 - task A: begin do_slab_free():
    - read TID
    - read pcpu freelist (==NULL)
    - check `slab == c->slab` (true)
 - [PREEMPT A->B]
 - task B: begin slab_alloc_node():
    - fastpath fails (`c->freelist` is NULL)
    - enter __slab_alloc()
    - slub_get_cpu_ptr() (disables preemption)
    - enter ___slab_alloc()
    - take local_lock_irqsave()
    - read c->freelist as NULL
    - get_freelist() returns NULL
    - write `c->slab = NULL`
    - drop local_unlock_irqrestore()
    - goto new_slab
    - slub_percpu_partial() is NULL
    - get_partial() returns NULL
    - slub_put_cpu_ptr() (enables preemption)
 - [PREEMPT B->A]
 - task A: finish do_slab_free():
    - this_cpu_cmpxchg_double() succeeds()
    - [CORRUPT STATE: c->slab==NULL, c->freelist!=NULL]

From there, the object on c->freelist will get lost if task B is allowed to
continue from here: It will proceed to the retry_load_slab label,
set c->slab, then jump to load_freelist, which clobbers c->freelist.

But if we instead continue as follows, we get worse corruption:

 - task A: run __slab_free() on object from other struct slab:
    - CPU_PARTIAL_FREE case (slab was on no list, is now on pcpu partial)
 - task A: run slab_alloc_node() with NUMA node constraint:
    - fastpath fails (c->slab is NULL)
    - call __slab_alloc()
    - slub_get_cpu_ptr() (disables preemption)
    - enter ___slab_alloc()
    - c->slab is NULL: goto new_slab
    - slub_percpu_partial() is non-NULL
    - set c->slab to slub_percpu_partial(c)
    - [CORRUPT STATE: c->slab points to slab-1, c->freelist has objects
      from slab-2]
    - goto redo
    - node_match() fails
    - goto deactivate_slab
    - existing c->freelist is passed into deactivate_slab()
    - inuse count of slab-1 is decremented to account for object from
      slab-2

At this point, the inuse count of slab-1 is 1 lower than it should be.
This means that if we free all allocated objects in slab-1 except for one,
SLUB will think that slab-1 is completely unused, and may free its page,
leading to use-after-free.

Fixes: `c17dda40a6` ("slub: Separate out kmem_cache_cpu processing from deactivate_slab")
Fixes: `03e404af26` ("slub: fast release on full slab")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20220608182205.2945720-1-jannh@google.com

2022-06-13 17:41:36 +02:00
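The first interleaving in the commit message can be replayed in a small userspace model. This is only a sketch under stated assumptions, not the kernel code: `toy_cpu`, `pack()`, `toy_deactivate()` and `toy_race_corrupts()` are hypothetical names, the freelist pointer and TID are packed into one 64-bit word so that a single C11 compare-exchange can stand in for this_cpu_cmpxchg_double(), and preemption is modelled by running the steps in the racy order on one thread. The `bump_tid` flag models the idea in the commit title (updating the TID when the CPU slab is deactivated).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model of the per-CPU state, not the kernel's struct. */
struct toy_cpu {
    _Atomic uint64_t fl_tid; /* low 32 bits: freelist (0 == NULL), high 32: TID */
    int slab;                /* 0 == no CPU slab; NOT covered by the cmpxchg */
};

static uint64_t pack(uint32_t freelist, uint32_t tid)
{
    return ((uint64_t)tid << 32) | freelist;
}

/* Task B's slow path: drop the CPU slab. bump_tid models the fix. */
static void toy_deactivate(struct toy_cpu *c, bool bump_tid)
{
    uint32_t tid = (uint32_t)(atomic_load(&c->fl_tid) >> 32);

    c->slab = 0;
    atomic_store(&c->fl_tid, pack(0, bump_tid ? tid + 1 : tid));
}

/*
 * Replays "task A begins free, task B deactivates, task A finishes".
 * Returns true if the run ends with slab == 0 but a non-empty freelist.
 */
static bool toy_race_corrupts(bool bump_tid)
{
    struct toy_cpu c = { .slab = 1 };
    atomic_init(&c.fl_tid, pack(0, 7));

    /* task A: read TID, read freelist (empty), check the slab matches */
    uint64_t snap = atomic_load(&c.fl_tid);
    uint32_t tid = (uint32_t)(snap >> 32);
    assert(c.slab == 1 && (uint32_t)snap == 0);

    /* [PREEMPT A->B] task B deactivates the CPU slab */
    toy_deactivate(&c, bump_tid);

    /* [PREEMPT B->A] task A pushes object 42 using its stale snapshot */
    bool ok = atomic_compare_exchange_strong(&c.fl_tid, &snap,
                                             pack(42, tid + 1));

    return ok && c.slab == 0 && (uint32_t)atomic_load(&c.fl_tid) != 0;
}
```

In this model, calling `toy_race_corrupts(false)` leaves the word unchanged across the deactivation, so task A's stale compare-exchange succeeds and the state matches the "c->slab==NULL, c->freelist!=NULL" case above. Calling `toy_race_corrupts(true)` changes the TID half of the word, so the compare-exchange fails and task A would have to retry with fresh state.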
..
damon	mm: damon: use HPAGE_PMD_SIZE	2022-05-19 14:08:55 -07:00
kasan	mm: kasan: fix input of vmalloc_to_page()	2022-05-27 09:33:46 -07:00
kfence	Yang Shi has improved the behaviour of khugepaged collapsing of readonly	2022-05-26 12:32:41 -07:00
backing-dev.c	blk-cgroup: remove unneeded includes from <linux/blk-cgroup.h>	2022-05-02 14:06:20 -06:00
balloon_compaction.c	mm/balloon_compaction: make balloon page compaction callbacks static	2022-03-28 16:52:57 -04:00
bootmem_info.c	bootmem: Use page->index instead of page->freelist	2022-01-06 12:27:03 +01:00
cma_debug.c
cma_sysfs.c
cma.c	Revert "mm/cma.c: remove redundant cma_mutex lock"	2022-05-13 15:11:26 -07:00
cma.h	mm/cma: provide option to opt out from exposing pages on activation failure	2022-03-22 15:57:09 -07:00
compaction.c	mm, compaction: fast_find_migrateblock() should return pfn in the target zone	2022-05-13 16:48:57 -07:00
debug_page_ref.c
debug_vm_pgtable.c	mm/debug_vm_pgtable: add tests for __HAVE_ARCH_PTE_SWP_EXCLUSIVE	2022-05-09 18:20:45 -07:00
debug.c	mm: unexport page_init_poison	2022-03-24 19:06:45 -07:00
dmapool.c	mm/dmapool.c: revert "make dma pool to use kmalloc_node"	2022-01-15 16:30:28 +02:00
early_ioremap.c	mm/early_ioremap: declare early_memremap_pgprot_adjust()	2022-03-22 15:57:11 -07:00
fadvise.c	riscv: compat: syscall: Add compat_sys_call_table implementation	2022-04-26 13:36:25 -07:00
failslab.c	mm: fix missing handler for __GFP_NOWARN	2022-05-19 14:08:55 -07:00
filemap.c	filemap: Cache the value of vm_flags	2022-06-09 16:24:25 -04:00
folio-compat.c	fs: Remove aop flags parameter from grab_cache_page_write_begin()	2022-05-08 14:28:19 -04:00
frontswap.c	frontswap: remove support for multiple ops	2022-01-22 08:33:38 +02:00
gup_test.c
gup_test.h
gup.c	Yang Shi has improved the behaviour of khugepaged collapsing of readonly	2022-05-26 12:32:41 -07:00
highmem.c	highmem: fix checks in __kmap_local_sched_{in,out}	2022-04-08 14:20:36 -10:00
hmm.c	mm: teach core mm about pte markers	2022-05-13 07:20:09 -07:00
huge_memory.c	mm/huge_memory: Fix xarray node memory leak	2022-06-09 16:24:25 -04:00
hugetlb_cgroup.c	hugetlb: add hugetlb.*.numa_stat file	2022-01-15 16:30:29 +02:00
hugetlb_vmemmap.c	mm: hugetlb_vmemmap: fix CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON	2022-06-01 15:57:16 -07:00
hugetlb_vmemmap.h	mm: hugetlb_vmemmap: cleanup CONFIG_HUGETLB_PAGE_FREE_VMEMMAP*	2022-04-28 23:16:15 -07:00
hugetlb.c	delayacct: track delays from write-protect copy	2022-06-01 15:55:25 -07:00
hwpoison-inject.c	mm/hwpoison: disable hwpoison filter during removing	2022-05-13 07:20:19 -07:00
init-mm.c	kernel/fork: Initialize mm's PASID	2022-02-14 19:51:47 +01:00
internal.h	mm: split free page with properly free memory accounting and without race	2022-05-27 09:33:43 -07:00
interval_tree.c
io-mapping.c
ioremap.c	mm: move ioremap_page_range to vmalloc.c	2021-09-08 11:50:24 -07:00
Kconfig	mm: Kconfig: reorganize misplaced mm options	2022-05-27 09:33:47 -07:00
Kconfig.debug	Two followon fixes for the post-5.19 series "Use pageblock_order for cma	2022-05-27 11:40:49 -07:00
khugepaged.c	mm: khugepaged: introduce khugepaged_enter_vma() helper	2022-05-19 14:08:50 -07:00
kmemleak.c	mm: kmemleak: take a full lowmem check in kmemleak_*_phys()	2022-04-15 14:49:56 -07:00
ksm.c	ksm: fix typo in comment	2022-05-25 10:47:48 -07:00
list_lru.c	mm/list_lru.c: revert "mm/list_lru: optimize memcg_reparent_list_lru_node()"	2022-04-08 14:20:36 -10:00
maccess.c	asm-generic updates for 5.18	2022-03-23 18:03:08 -07:00
madvise.c	mm: filter out swapin error entry in shmem mapping	2022-05-27 09:33:46 -07:00
Makefile	mm: hugetlb_vmemmap: cleanup CONFIG_HUGETLB_PAGE_FREE_VMEMMAP*	2022-04-28 23:16:15 -07:00
mapping_dirty_helpers.c	mm: move tlb_flush_pending inline helpers to mm_inline.h	2022-01-15 16:30:27 +02:00
memblock.c	memblock: test suite and a small cleanup	2022-03-27 13:36:06 -07:00
memcontrol.c	zswap: memcg accounting	2022-05-19 14:08:53 -07:00
memfd.c	memfd: fix F_SEAL_WRITE after shmem huge page allocated	2022-03-05 11:08:32 -08:00
memory_hotplug.c	mm: hugetlb_vmemmap: add hugetlb_optimize_vmemmap sysctl	2022-05-13 16:48:56 -07:00
memory-failure.c	Yang Shi has improved the behaviour of khugepaged collapsing of readonly	2022-05-26 12:32:41 -07:00
memory.c	delayacct: track delays from write-protect copy	2022-06-01 15:55:25 -07:00
mempolicy.c	mm/mempolicy: fix uninit-value in mpol_rebind_policy()	2022-05-19 14:08:54 -07:00
mempool.c	mm: remove spurious blkdev.h includes	2021-10-18 06:17:01 -06:00
memremap.c	mm/memremap: fix missing call to untrack_pfn() in pagemap_range()	2022-06-01 15:57:16 -07:00
memtest.c
migrate_device.c	mm: remember exclusively mapped anonymous pages with PG_anon_exclusive	2022-05-09 18:20:44 -07:00
migrate.c	Yang Shi has improved the behaviour of khugepaged collapsing of readonly	2022-05-26 12:32:41 -07:00
mincore.c	mm: teach core mm about pte markers	2022-05-13 07:20:09 -07:00
mlock.c	mm/munlock: protect the per-CPU pagevec by a local_lock_t	2022-04-01 11:46:09 -07:00
mm_init.c
mmap_lock.c	mm: mmap_lock: fix disabling preemption directly	2021-07-23 17:43:28 -07:00
mmap.c	powerpc updates for 5.19	2022-05-28 11:27:17 -07:00
mmu_gather.c	mm/mmu_gather: limit free batch count and add schedule point in tlb_batch_pages_flush	2022-04-28 23:16:12 -07:00
mmu_notifier.c	mm/mmu_notifier.c: fix race in mmu_interval_notifier_remove()	2022-04-21 20:01:10 -07:00
mmzone.c	Folio changes for 5.18	2022-03-22 17:03:12 -07:00
mprotect.c	mm/hugetlb: handle UFFDIO_WRITEPROTECT	2022-05-13 07:20:11 -07:00
mremap.c	Yang Shi has improved the behaviour of khugepaged collapsing of readonly	2022-05-26 12:32:41 -07:00
msync.c
nommu.c	no-MMU: expose vmalloc_huge() for alloc_large_system_hash()	2022-04-25 10:11:49 -07:00
oom_kill.c	mm/oom_kill.c: fix vm_oom_kill_table[] ifdeffery	2022-06-01 15:57:16 -07:00
page_alloc.c	Two followon fixes for the post-5.19 series "Use pageblock_order for cma	2022-05-27 11:40:49 -07:00
page_counter.c	mm/page_counter: remove an incorrect call to propagate_protected_usage()	2022-01-15 16:30:27 +02:00
page_ext.c	mm: use for_each_online_node and node_online instead of open coding	2022-04-29 14:36:58 -07:00
page_idle.c	mm: don't be stuck to rmap lock on reclaim path	2022-05-19 14:08:54 -07:00
page_io.c	Yang Shi has improved the behaviour of khugepaged collapsing of readonly	2022-05-26 12:32:41 -07:00
page_isolation.c	mm: page_isolation: use compound_nr() correctly in isolate_single_pageblock()	2022-06-01 15:57:16 -07:00
page_owner.c	Yang Shi has improved the behaviour of khugepaged collapsing of readonly	2022-05-26 12:32:41 -07:00
page_poison.c
page_reporting.c
page_reporting.h
page_table_check.c	Six hotfixes. One from Miaohe Lin is considered a minor thing so it isn't	2022-05-27 11:29:35 -07:00
page_vma_mapped.c	mm: pvmw: add support for walking devmap pages	2022-04-28 23:16:10 -07:00
page-writeback.c	sysctl changes for v5.19-rc1	2022-05-26 16:57:20 -07:00
pagewalk.c
percpu-internal.h	percpu: improve percpu_alloc_percpu event trace	2022-05-13 07:20:18 -07:00
percpu-km.c
percpu-stats.c	mm: use vmalloc_array and vcalloc for array allocations	2022-03-08 09:30:46 -05:00
percpu-vm.c
percpu.c	percpu: improve percpu_alloc_percpu event trace	2022-05-13 07:20:18 -07:00
pgalloc-track.h
pgtable-generic.c	mm: avoid unnecessary flush on change_huge_pmd()	2022-05-13 07:20:05 -07:00
process_vm_access.c
ptdump.c	mm: sparsemem: use page table lock to protect kernel pmd operations	2022-03-22 15:57:08 -07:00
readahead.c	filemap: Don't release a locked folio	2022-06-09 16:24:25 -04:00
rmap.c	mm: don't be stuck to rmap lock on reclaim path	2022-05-19 14:08:54 -07:00
rodata_test.c
secretmem.c	secretmem: Convert to free_folio	2022-05-09 23:12:53 -04:00
shmem.c	Two followon fixes for the post-5.19 series "Use pageblock_order for cma	2022-05-27 11:40:49 -07:00
shuffle.c
shuffle.h
slab_common.c	Yang Shi has improved the behaviour of khugepaged collapsing of readonly	2022-05-26 12:32:41 -07:00
slab.c	Yang Shi has improved the behaviour of khugepaged collapsing of readonly	2022-05-26 12:32:41 -07:00
slab.h	slab changes for 5.19	2022-05-25 10:24:04 -07:00
slob.c	mm: make minimum slab alignment a runtime property	2022-05-13 07:20:07 -07:00
slub.c	mm/slub: add missing TID updates on slab deactivation	2022-06-13 17:41:36 +02:00
sparse-vmemmap.c	mm/sparse-vmemmap: improve memory savings for compound devmaps	2022-04-28 23:16:16 -07:00
sparse.c	mm/memory-failure.c: move clear_hwpoisoned_pages	2022-05-13 07:20:19 -07:00
swap_cgroup.c	mm: use vmalloc_array and vcalloc for array allocations	2022-03-08 09:30:46 -05:00
swap_slots.c	mm/swap: remove buggy cache->nr check in refill_swap_slots_cache	2022-05-19 14:08:51 -07:00
swap_state.c	mm: filter out swapin error entry in shmem mapping	2022-05-27 09:33:46 -07:00
swap.c	mm/swap: fix the comment of get_kernel_pages	2022-05-19 14:08:52 -07:00
swap.h	swap: convert add_to_swap() to take a folio	2022-05-13 07:20:15 -07:00
swapfile.c	Two followon fixes for the post-5.19 series "Use pageblock_order for cma	2022-05-27 11:40:49 -07:00
truncate.c	Filesystem folio changes for 5.18	2022-03-22 18:26:56 -07:00
usercopy.c	mm: usercopy: move the virt_addr_valid() below the is_vmalloc_addr()	2022-05-16 16:02:21 -07:00
userfaultfd.c	mm/uffd: enable write protection for shmem & hugetlbfs	2022-05-13 07:20:11 -07:00
util.c	powerpc updates for 5.19	2022-05-28 11:27:17 -07:00
vmacache.c
vmalloc.c	mm/vmalloc: use raw_cpu_ptr() for vmap_block_queue access	2022-05-13 07:20:18 -07:00
vmpressure.c	mm/vmpressure: fix data-race with memcg->socket_pressure	2021-11-06 13:30:40 -07:00
vmscan.c	Yang Shi has improved the behaviour of khugepaged collapsing of readonly	2022-05-26 12:32:41 -07:00
vmstat.c	Bitmap patches for 5.19-rc1	2022-06-04 14:04:27 -07:00
workingset.c	memcg: sync flush only if periodic flush is delayed	2022-04-21 20:01:09 -07:00
z3fold.c	mm/z3fold: fix z3fold_page_migrate races with z3fold_map	2022-05-27 09:33:44 -07:00
zbud.c
zpool.c	zpool: remove the list of pools_head	2022-01-15 16:30:31 +02:00
zsmalloc.c	zsmalloc: fix races between asynchronous zspage free and page migration	2022-05-13 15:11:26 -07:00
zswap.c	zswap: memcg accounting	2022-05-19 14:08:53 -07:00