linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-24 13:11:40 +00:00

Nadav Amit 4a18419f71 mm/mprotect: use mmu_gather

Patch series "mm/mprotect: avoid unnecessary TLB flushes", v6.

This patchset is intended to remove unnecessary TLB flushes during mprotect() syscalls. Once this patch-set makes it through, similar and further optimizations for MADV_COLD and userfaultfd would be possible.

Basically, there are 3 optimizations in this patch-set:

1. Use TLB batching infrastructure to batch flushes across VMAs and do better/fewer flushes. This would also be handy for later userfaultfd enhancements.
2. Avoid unnecessary TLB flushes. This optimization is the one that provides most of the performance benefits. Unlike previous versions, we now only avoid flushes that would not result in spurious page-faults.
3. Avoid TLB flushes on change_huge_pmd() that are only needed to prevent the A/D bits from changing.

Andrew asked for some benchmark numbers. I do not have an easy determinate macrobenchmark in which it is easy to show benefit. I therefore ran a microbenchmark: a loop that does the following on anonymous memory, just as a sanity check to see that time is saved by avoiding TLB flushes. The loop goes:

	mprotect(p, PAGE_SIZE, PROT_READ)
	mprotect(p, PAGE_SIZE, PROT_READ|PROT_WRITE)
	*p = 0; // make the page writable

The test was run in KVM guest with 1 or 2 threads (the second thread was busy-looping). I measured the time (cycles) of each operation:

			1 thread		2 threads
			mmots	+patch		mmots	+patch
	PROT_READ	3494	2725 (-22%)	8630	7788 (-10%)
	PROT_READ|WRITE	3952	2724 (-31%)	9075	2865 (-68%)

	[ mmots = v5.17-rc6-mmots-2022-03-06-20-38 ]

The exact numbers are really meaningless, but the benefit is clear. There are 2 interesting results though.

(1) PROT_READ is cheaper, while one can expect it not to be affected. This is presumably due to TLB miss that is saved.

(2) Without memory access (*p = 0), the speedup of the patch is even greater. In that scenario mprotect(PROT_READ) also avoids the TLB flush. As a result both operations on the patched kernel take roughly ~1500 cycles (with either 1 or 2 threads), whereas on mmotm their cost is as high as presented in the table.

This patch (of 3):

change_pXX_range() currently does not use mmu_gather, but instead implements its own deferred TLB flushes scheme. This both complicates the code, as developers need to be aware of different invalidation schemes, and prevents opportunities to avoid TLB flushes or perform them in finer granularity.

The use of mmu_gather for modified PTEs has benefits in various scenarios even if pages are not released. For instance, if only a single page needs to be flushed out of a range of many pages, only that page would be flushed. If a THP page is flushed, on x86 a single TLB invlpg instruction can be used instead of 512 instructions (or a full TLB flush, which Linux would actually use by default). mprotect() over multiple VMAs requires a single flush.

Use mmu_gather in change_pXX_range(). As the pages are not released, only record the flushed range using tlb_flush_pXX_range().

Handle THP similarly and get rid of flush_cache_range() which becomes redundant since tlb_start_vma() calls it when needed.

Link: https://lkml.kernel.org/r/20220401180821.1986781-1-namit@vmware.com
Link: https://lkml.kernel.org/r/20220401180821.1986781-2-namit@vmware.com
Signed-off-by: Nadav Amit <namit@vmware.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Nick Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

2022-05-13 07:20:05 -07:00
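The microbenchmark loop described in the commit message can be approximated from user space. The following is a minimal sketch under stated assumptions, not the author's actual test program: the helper names `now_ns()` and `mprotect_loop()` are hypothetical, timing uses `clock_gettime(CLOCK_MONOTONIC)` in nanoseconds rather than the cycle counts reported above, and the busy-looping second thread is not included.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* Monotonic time in nanoseconds (assumption: the original measured cycles). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: repeat the loop from the commit message `iters` times
 * on one anonymous page. If `touch` is non-zero, the page is written after
 * being made writable again. The accumulated time of each mprotect() call is
 * stored in *ro_ns and *rw_ns. Returns 0 on success, -1 on failure.
 */
static int mprotect_loop(unsigned long iters, int touch,
			 uint64_t *ro_ns, uint64_t *rw_ns)
{
	long page = sysconf(_SC_PAGESIZE);
	uint64_t t0, t1, t2;
	unsigned long i;
	char *p;

	assert(ro_ns && rw_ns);
	if (page <= 0)
		return -1;
	*ro_ns = *rw_ns = 0;

	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	for (i = 0; i < iters; i++) {
		t0 = now_ns();
		if (mprotect(p, (size_t)page, PROT_READ))
			goto fail;
		t1 = now_ns();
		if (mprotect(p, (size_t)page, PROT_READ | PROT_WRITE))
			goto fail;
		t2 = now_ns();
		if (touch)
			*(volatile char *)p = 0;	/* the memory access step */
		*ro_ns += t1 - t0;
		*rw_ns += t2 - t1;
	}
	munmap(p, (size_t)page);
	return 0;
fail:
	munmap(p, (size_t)page);
	return -1;
}
```

Calling `mprotect_loop(n, 1, &ro, &rw)` and then `mprotect_loop(n, 0, &ro, &rw)` gives the two variants the message compares (with and without the memory access). Absolute numbers from such a sketch depend heavily on the machine and on whether it runs in a guest, so only the relative difference between kernels is likely to be meaningful.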
damon	mm/damon/reclaim: fix the timer always stays active	2022-04-29 14:37:00 -07:00
kasan	kasan: fix sleeping function called from invalid context on RT kernel	2022-04-29 14:36:58 -07:00
kfence	mm, kfence: support kmem_dump_obj() for KFENCE objects	2022-04-15 14:49:55 -07:00
backing-dev.c	remove congestion tracking framework	2022-03-22 15:57:01 -07:00
balloon_compaction.c	mm/balloon_compaction: make balloon page compaction callbacks static	2022-03-28 16:52:57 -04:00
bootmem_info.c	bootmem: Use page->index instead of page->freelist	2022-01-06 12:27:03 +01:00
cma_debug.c	mm/cma: change cma mutex to irq safe spinlock	2021-05-05 11:27:21 -07:00
cma_sysfs.c	mm: cma: support sysfs	2021-05-05 11:27:24 -07:00
cma.c	mm/cma: provide option to opt out from exposing pages on activation failure	2022-03-22 15:57:09 -07:00
cma.h	mm/cma: provide option to opt out from exposing pages on activation failure	2022-03-22 15:57:09 -07:00
compaction.c	mm: compaction: make sure highest is above the min_pfn	2022-04-28 23:16:19 -07:00
debug_page_ref.c
debug_vm_pgtable.c	mm/debug_vm_pgtable: add tests for __HAVE_ARCH_PTE_SWP_EXCLUSIVE	2022-05-09 18:20:45 -07:00
debug.c	mm: unexport page_init_poison	2022-03-24 19:06:45 -07:00
dmapool.c	mm/dmapool.c: revert "make dma pool to use kmalloc_node"	2022-01-15 16:30:28 +02:00
early_ioremap.c	mm/early_ioremap: declare early_memremap_pgprot_adjust()	2022-03-22 15:57:11 -07:00
fadvise.c	remove inode_congested()	2022-03-22 15:57:01 -07:00
failslab.c
filemap.c	tmpfs: fix regressions from wider use of ZERO_PAGE	2022-04-15 14:49:54 -07:00
folio-compat.c	mm/rmap: Convert rmap_walk() to take a folio	2022-03-21 13:01:35 -04:00
frontswap.c	frontswap: remove support for multiple ops	2022-01-22 08:33:38 +02:00
gup_test.c	selftests/vm: gup_test: test faulting in kernel, and verify pinnable pages	2021-05-05 11:27:26 -07:00
gup_test.h	selftests/vm: gup_test: fix test flag	2021-05-05 11:27:26 -07:00
gup.c	mm/gup: fix comments to pin_user_pages_*()	2022-05-09 18:20:47 -07:00
highmem.c	highmem: fix checks in __kmap_local_sched_{in,out}	2022-04-08 14:20:36 -10:00
hmm.c	mm/hmm.c: remove unneeded local variable ret	2022-03-22 15:57:12 -07:00
huge_memory.c	mm/mprotect: use mmu_gather	2022-05-13 07:20:05 -07:00
hugetlb_cgroup.c	hugetlb: add hugetlb.*.numa_stat file	2022-01-15 16:30:29 +02:00
hugetlb_vmemmap.c	mm/hugetlb_vmemmap: move comment block to Documentation/vm	2022-04-28 23:16:15 -07:00
hugetlb_vmemmap.h	mm: hugetlb_vmemmap: cleanup CONFIG_HUGETLB_PAGE_FREE_VMEMMAP*	2022-04-28 23:16:15 -07:00
hugetlb.c	mm/gup: sanity-check with CONFIG_DEBUG_VM that anonymous pages are exclusive when (un)pinning	2022-05-09 18:20:45 -07:00
hwpoison-inject.c	mm/hwpoison: avoid the impact of hwpoison_filter() return value on mce handler	2022-03-22 15:57:07 -07:00
init-mm.c	kernel/fork: Initialize mm's PASID	2022-02-14 19:51:47 +01:00
internal.h	mm: compaction: clean up comment for sched contention	2022-04-28 23:16:17 -07:00
interval_tree.c
io-mapping.c
ioremap.c	mm: move ioremap_page_range to vmalloc.c	2021-09-08 11:50:24 -07:00
Kconfig	mm/mmap: drop arch_filter_pgprot()	2022-04-28 23:16:13 -07:00
Kconfig.debug	mm: page table check	2022-01-15 16:30:28 +02:00
khugepaged.c	mm/rmap: drop "compound" parameter from page_add_new_anon_rmap()	2022-05-09 18:20:43 -07:00
kmemleak.c	mm: kmemleak: take a full lowmem check in kmemleak_*_phys()	2022-04-15 14:49:56 -07:00
ksm.c	mm: remember exclusively mapped anonymous pages with PG_anon_exclusive	2022-05-09 18:20:44 -07:00
list_lru.c	mm/list_lru.c: revert "mm/list_lru: optimize memcg_reparent_list_lru_node()"	2022-04-08 14:20:36 -10:00
maccess.c	asm-generic updates for 5.18	2022-03-23 18:03:08 -07:00
madvise.c	mm: submit multipage reads for SWP_FS_OPS swap-space	2022-05-09 18:20:49 -07:00
Makefile	mm: hugetlb_vmemmap: cleanup CONFIG_HUGETLB_PAGE_FREE_VMEMMAP*	2022-04-28 23:16:15 -07:00
mapping_dirty_helpers.c	mm: move tlb_flush_pending inline helpers to mm_inline.h	2022-01-15 16:30:27 +02:00
memblock.c	memblock: test suite and a small cleanup	2022-03-27 13:36:06 -07:00
memcontrol.c	mm: create new mm/swap.h header file	2022-05-09 18:20:47 -07:00
memfd.c	memfd: fix F_SEAL_WRITE after shmem huge page allocated	2022-03-05 11:08:32 -08:00
memory_hotplug.c	mm/sparse-vmemmap: add a pgmap argument to section activation	2022-04-28 23:16:15 -07:00
memory-failure.c	mm: create new mm/swap.h header file	2022-05-09 18:20:47 -07:00
memory.c	mm: submit multipage reads for SWP_FS_OPS swap-space	2022-05-09 18:20:49 -07:00
mempolicy.c	mm/mprotect: use mmu_gather	2022-05-13 07:20:05 -07:00
mempool.c	mm: remove spurious blkdev.h includes	2021-10-18 06:17:01 -06:00
memremap.c	mm/page-flags: reuse PG_mappedtodisk as PG_anon_exclusive for PageAnon() pages	2022-05-09 18:20:44 -07:00
memtest.c
migrate_device.c	mm: remember exclusively mapped anonymous pages with PG_anon_exclusive	2022-05-09 18:20:44 -07:00
migrate.c	mm: remember exclusively mapped anonymous pages with PG_anon_exclusive	2022-05-09 18:20:44 -07:00
mincore.c	mm: create new mm/swap.h header file	2022-05-09 18:20:47 -07:00
mlock.c	mm/munlock: protect the per-CPU pagevec by a local_lock_t	2022-04-01 11:46:09 -07:00
mm_init.c
mmap_lock.c	mm: mmap_lock: fix disabling preemption directly	2021-07-23 17:43:28 -07:00
mmap.c	mm/mmap: drop arch_vm_get_page_pgprot()	2022-04-28 23:16:14 -07:00
mmu_gather.c	mm/mmu_gather: limit free batch count and add schedule point in tlb_batch_pages_flush	2022-04-28 23:16:12 -07:00
mmu_notifier.c	mm/mmu_notifier.c: fix race in mmu_interval_notifier_remove()	2022-04-21 20:01:10 -07:00
mmzone.c	Folio changes for 5.18	2022-03-22 17:03:12 -07:00
mprotect.c	mm/mprotect: use mmu_gather	2022-05-13 07:20:05 -07:00
mremap.c	mm/mremap: avoid unneeded do_munmap call	2022-04-28 23:16:14 -07:00
msync.c
nommu.c	no-MMU: expose vmalloc_huge() for alloc_large_system_hash()	2022-04-25 10:11:49 -07:00
oom_kill.c	oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup	2022-04-21 20:01:10 -07:00
page_alloc.c	mm: create new mm/swap.h header file	2022-05-09 18:20:47 -07:00
page_counter.c	mm/page_counter: remove an incorrect call to propagate_protected_usage()	2022-01-15 16:30:27 +02:00
page_ext.c	mm: use for_each_online_node and node_online instead of open coding	2022-04-29 14:36:58 -07:00
page_idle.c	mm/rmap: Constify the rmap_walk_control argument	2022-03-21 13:01:35 -04:00
page_io.c	MM: handle THP in swap_*page_fs() - count_vm_events()	2022-05-09 18:20:49 -07:00
page_isolation.c	mm: wrap __find_buddy_pfn() with a necessary buddy page validation	2022-04-28 23:16:01 -07:00
page_owner.c	mm/page_owner.c: record tgid	2022-03-24 19:06:44 -07:00
page_poison.c
page_reporting.c	mm/page_reporting: allow driver to specify reporting order	2021-06-29 10:53:47 -07:00
page_reporting.h	mm/page_reporting: export reporting order as module parameter	2021-06-29 10:53:47 -07:00
page_table_check.c	mm/page_table_check.c: use strtobool for param parsing	2022-03-22 15:57:11 -07:00
page_vma_mapped.c	mm: pvmw: add support for walking devmap pages	2022-04-28 23:16:10 -07:00
page-writeback.c	mm: rework calculation of bdi_min_ratio in bdi_set_min_ratio	2022-04-28 23:15:57 -07:00
pagewalk.c	mm: pagewalk: fix walk for hugepage tables	2021-06-29 10:53:49 -07:00
percpu-internal.h	mm: memcg/percpu: account extra objcg space to memory cgroups	2022-01-15 16:30:31 +02:00
percpu-km.c	percpu: flush tlb in pcpu_reclaim_populated()	2021-07-04 18:30:17 +00:00
percpu-stats.c	mm: use vmalloc_array and vcalloc for array allocations	2022-03-08 09:30:46 -05:00
percpu-vm.c	percpu: flush tlb in pcpu_reclaim_populated()	2021-07-04 18:30:17 +00:00
percpu.c	bitmap patches for 5.17-rc1	2022-01-23 06:20:44 +02:00
pgalloc-track.h	mm: fix typos in comments	2021-05-07 00:26:35 -07:00
pgtable-generic.c	mm: move tlb_flush_pending inline helpers to mm_inline.h	2022-01-15 16:30:27 +02:00
process_vm_access.c	mm/process_vm_access.c: remove duplicate include	2021-05-05 11:27:27 -07:00
ptdump.c	mm: sparsemem: use page table lock to protect kernel pmd operations	2022-03-22 15:57:08 -07:00
readahead.c	readahead: Update comments	2022-04-01 14:40:42 -04:00
rmap.c	mm/swap: remember PG_anon_exclusive via a swp pte bit	2022-05-09 18:20:45 -07:00
rodata_test.c
secretmem.c	mm/secretmem: fix panic when growing a memfd_secret	2022-04-15 14:49:54 -07:00
shmem.c	mm: create new mm/swap.h header file	2022-05-09 18:20:47 -07:00
shuffle.c
shuffle.h	mm/shuffle: fix section mismatch warning	2021-05-22 15:09:07 -10:00
slab_common.c	mm, kfence: support kmem_dump_obj() for KFENCE objects	2022-04-15 14:49:55 -07:00
slab.c	mm, kfence: support kmem_dump_obj() for KFENCE objects	2022-04-15 14:49:55 -07:00
slab.h	mm, kfence: support kmem_dump_obj() for KFENCE objects	2022-04-15 14:49:55 -07:00
slob.c	mm, kfence: support kmem_dump_obj() for KFENCE objects	2022-04-15 14:49:55 -07:00
slub.c	mm, kfence: support kmem_dump_obj() for KFENCE objects	2022-04-15 14:49:55 -07:00
sparse-vmemmap.c	mm/sparse-vmemmap: improve memory savings for compound devmaps	2022-04-28 23:16:16 -07:00
sparse.c	mm/sparse-vmemmap: add a pgmap argument to section activation	2022-04-28 23:16:15 -07:00
swap_cgroup.c	mm: use vmalloc_array and vcalloc for array allocations	2022-03-08 09:30:46 -05:00
swap_slots.c	treewide: Add missing includes masked by cgroup -> bpf dependency	2021-12-03 10:58:13 -08:00
swap_state.c	mm: submit multipage reads for SWP_FS_OPS swap-space	2022-05-09 18:20:49 -07:00
swap.c	mm/munlock: protect the per-CPU pagevec by a local_lock_t	2022-04-01 11:46:09 -07:00
swap.h	mm: submit multipage write for SWP_FS_OPS swap-space	2022-05-09 18:20:49 -07:00
swapfile.c	mm: introduce ->swap_rw and use it for reads from SWP_FS_OPS swap-space	2022-05-09 18:20:48 -07:00
truncate.c	Filesystem folio changes for 5.18	2022-03-22 18:26:56 -07:00
usercopy.c	Merge branch 'akpm' (patches from Andrew)	2022-03-22 16:11:53 -07:00
userfaultfd.c	mm/mprotect: use mmu_gather	2022-05-13 07:20:05 -07:00
util.c	mm: create new mm/swap.h header file	2022-05-09 18:20:47 -07:00
vmacache.c
vmalloc.c	vmap(): don't allow invalid pages	2022-04-28 23:16:00 -07:00
vmpressure.c	mm/vmpressure: fix data-race with memcg->socket_pressure	2021-11-06 13:30:40 -07:00
vmscan.c	mm: submit multipage write for SWP_FS_OPS swap-space	2022-05-09 18:20:49 -07:00
vmstat.c	mm/vmstat: add events for ksm cow	2022-04-28 23:16:16 -07:00
workingset.c	memcg: sync flush only if periodic flush is delayed	2022-04-21 20:01:09 -07:00
z3fold.c	mm/z3fold: remove unneeded PAGE_HEADLESS check in free_handle()	2022-04-28 23:16:06 -07:00
zbud.c	mm/zbud: add kerneldoc fields for zbud_pool	2021-07-01 11:06:03 -07:00
zpool.c	zpool: remove the list of pools_head	2022-01-15 16:30:31 +02:00
zsmalloc.c	zsmalloc: replace get_cpu_var with local_lock	2022-01-22 08:33:37 +02:00
zswap.c	mm: create new mm/swap.h header file	2022-05-09 18:20:47 -07:00