linux/mm
Peter Xu 9c28a205c0 mm/shmem: handle uffd-wp special pte in page fault handler
File-backed memories are prone to unmap/swap so the ptes are always
unstable, because they can be easily faulted back later using the page
cache.  This could lead to uffd-wp getting lost when unmapping or swapping
out such memory.  One example is shmem.  PTE markers are needed to store
those information.

This patch prepares it by handling uffd-wp pte markers first it is applied
elsewhere, so that the page fault handler can recognize uffd-wp pte
markers.

The handling of uffd-wp pte markers is similar to missing fault, it's just
that we'll handle this "missing fault" when we see the pte markers,
meanwhile we need to make sure the marker information is kept during
processing the fault.

This is a slow path of uffd-wp handling, because zapping of wr-protected
shmem ptes should be rare.  So far it should only trigger in two
conditions:

  (1) When trying to punch holes in shmem_fallocate(), there is an
      optimization to zap the pgtables before evicting the page.

  (2) When swapping out shmem pages.

Because of this, the page fault handling is simplifed too by not sending
the wr-protect message in the 1st page fault, instead the page will be
installed read-only, so the uffd-wp message will be generated in the next
fault, which will trigger the do_wp_page() path of general uffd-wp
handling.

Disable fault-around for all uffd-wp registered ranges for extra safety
just like uffd-minor fault, and clean the code up.

Link: https://lkml.kernel.org/r/20220405014844.14239-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-13 07:20:10 -07:00
..
damon mm/damon/reclaim: support online inputs update 2022-05-13 07:20:09 -07:00
kasan kasan: fix sleeping function called from invalid context on RT kernel 2022-04-29 14:36:58 -07:00
kfence kfence: enable check kfence canary on panic via boot param 2022-05-13 07:20:06 -07:00
backing-dev.c remove congestion tracking framework 2022-03-22 15:57:01 -07:00
balloon_compaction.c mm/balloon_compaction: make balloon page compaction callbacks static 2022-03-28 16:52:57 -04:00
bootmem_info.c bootmem: Use page->index instead of page->freelist 2022-01-06 12:27:03 +01:00
cma_debug.c
cma_sysfs.c mm: cma: support sysfs 2021-05-05 11:27:24 -07:00
cma.c mm/cma: provide option to opt out from exposing pages on activation failure 2022-03-22 15:57:09 -07:00
cma.h mm/cma: provide option to opt out from exposing pages on activation failure 2022-03-22 15:57:09 -07:00
compaction.c mm: compaction: make sure highest is above the min_pfn 2022-04-28 23:16:19 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: add tests for __HAVE_ARCH_PTE_SWP_EXCLUSIVE 2022-05-09 18:20:45 -07:00
debug.c mm: unexport page_init_poison 2022-03-24 19:06:45 -07:00
dmapool.c mm/dmapool.c: revert "make dma pool to use kmalloc_node" 2022-01-15 16:30:28 +02:00
early_ioremap.c mm/early_ioremap: declare early_memremap_pgprot_adjust() 2022-03-22 15:57:11 -07:00
fadvise.c remove inode_congested() 2022-03-22 15:57:01 -07:00
failslab.c
filemap.c mm: teach core mm about pte markers 2022-05-13 07:20:09 -07:00
folio-compat.c mm/rmap: Convert rmap_walk() to take a folio 2022-03-21 13:01:35 -04:00
frontswap.c frontswap: remove support for multiple ops 2022-01-22 08:33:38 +02:00
gup_test.c selftests/vm: gup_test: test faulting in kernel, and verify pinnable pages 2021-05-05 11:27:26 -07:00
gup_test.h selftests/vm: gup_test: fix test flag 2021-05-05 11:27:26 -07:00
gup.c mm/gup: fix comments to pin_user_pages_*() 2022-05-09 18:20:47 -07:00
highmem.c highmem: fix checks in __kmap_local_sched_{in,out} 2022-04-08 14:20:36 -10:00
hmm.c mm: teach core mm about pte markers 2022-05-13 07:20:09 -07:00
huge_memory.c mm: avoid unnecessary flush on change_huge_pmd() 2022-05-13 07:20:05 -07:00
hugetlb_cgroup.c hugetlb: add hugetlb.*.numa_stat file 2022-01-15 16:30:29 +02:00
hugetlb_vmemmap.c mm/hugetlb_vmemmap: move comment block to Documentation/vm 2022-04-28 23:16:15 -07:00
hugetlb_vmemmap.h mm: hugetlb_vmemmap: cleanup CONFIG_HUGETLB_PAGE_FREE_VMEMMAP* 2022-04-28 23:16:15 -07:00
hugetlb.c mm: hugetlb: considering PMD sharing when flushing cache/TLBs 2022-05-13 07:20:07 -07:00
hwpoison-inject.c mm/hwpoison: avoid the impact of hwpoison_filter() return value on mce handler 2022-03-22 15:57:07 -07:00
init-mm.c kernel/fork: Initialize mm's PASID 2022-02-14 19:51:47 +01:00
internal.h mm: compaction: clean up comment for sched contention 2022-04-28 23:16:17 -07:00
interval_tree.c
io-mapping.c
ioremap.c mm: move ioremap_page_range to vmalloc.c 2021-09-08 11:50:24 -07:00
Kconfig mm/uffd: PTE_MARKER_UFFD_WP 2022-05-13 07:20:09 -07:00
Kconfig.debug mm: page table check 2022-01-15 16:30:28 +02:00
khugepaged.c mm/rmap: drop "compound" parameter from page_add_new_anon_rmap() 2022-05-09 18:20:43 -07:00
kmemleak.c mm: kmemleak: take a full lowmem check in kmemleak_*_phys() 2022-04-15 14:49:56 -07:00
ksm.c mm: remember exclusively mapped anonymous pages with PG_anon_exclusive 2022-05-09 18:20:44 -07:00
list_lru.c mm/list_lru.c: revert "mm/list_lru: optimize memcg_reparent_list_lru_node()" 2022-04-08 14:20:36 -10:00
maccess.c asm-generic updates for 5.18 2022-03-23 18:03:08 -07:00
madvise.c mm: submit multipage reads for SWP_FS_OPS swap-space 2022-05-09 18:20:49 -07:00
Makefile mm: hugetlb_vmemmap: cleanup CONFIG_HUGETLB_PAGE_FREE_VMEMMAP* 2022-04-28 23:16:15 -07:00
mapping_dirty_helpers.c mm: move tlb_flush_pending inline helpers to mm_inline.h 2022-01-15 16:30:27 +02:00
memblock.c memblock: test suite and a small cleanup 2022-03-27 13:36:06 -07:00
memcontrol.c mm: teach core mm about pte markers 2022-05-13 07:20:09 -07:00
memfd.c memfd: fix F_SEAL_WRITE after shmem huge page allocated 2022-03-05 11:08:32 -08:00
memory_hotplug.c mm/memory_hotplug: use pgprot_val to get value of pgprot 2022-05-13 07:20:07 -07:00
memory-failure.c mm: create new mm/swap.h header file 2022-05-09 18:20:47 -07:00
memory.c mm/shmem: handle uffd-wp special pte in page fault handler 2022-05-13 07:20:10 -07:00
mempolicy.c mm/mprotect: use mmu_gather 2022-05-13 07:20:05 -07:00
mempool.c mm: remove spurious blkdev.h includes 2021-10-18 06:17:01 -06:00
memremap.c mm/page-flags: reuse PG_mappedtodisk as PG_anon_exclusive for PageAnon() pages 2022-05-09 18:20:44 -07:00
memtest.c
migrate_device.c mm: remember exclusively mapped anonymous pages with PG_anon_exclusive 2022-05-09 18:20:44 -07:00
migrate.c mm: remember exclusively mapped anonymous pages with PG_anon_exclusive 2022-05-09 18:20:44 -07:00
mincore.c mm: teach core mm about pte markers 2022-05-13 07:20:09 -07:00
mlock.c mm/munlock: protect the per-CPU pagevec by a local_lock_t 2022-04-01 11:46:09 -07:00
mm_init.c
mmap_lock.c mm: mmap_lock: fix disabling preemption directly 2021-07-23 17:43:28 -07:00
mmap.c mmap locking API: fix missed mmap_sem references in comments 2022-05-13 07:20:07 -07:00
mmu_gather.c mm/mmu_gather: limit free batch count and add schedule point in tlb_batch_pages_flush 2022-04-28 23:16:12 -07:00
mmu_notifier.c mm/mmu_notifier.c: fix race in mmu_interval_notifier_remove() 2022-04-21 20:01:10 -07:00
mmzone.c Folio changes for 5.18 2022-03-22 17:03:12 -07:00
mprotect.c mm: teach core mm about pte markers 2022-05-13 07:20:09 -07:00
mremap.c mm: hugetlb: considering PMD sharing when flushing cache/TLBs 2022-05-13 07:20:07 -07:00
msync.c
nommu.c no-MMU: expose vmalloc_huge() for alloc_large_system_hash() 2022-04-25 10:11:49 -07:00
oom_kill.c oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup 2022-04-21 20:01:10 -07:00
page_alloc.c mm/page_alloc: cache the result of node_dirty_ok() 2022-05-13 07:20:09 -07:00
page_counter.c mm/page_counter: remove an incorrect call to propagate_protected_usage() 2022-01-15 16:30:27 +02:00
page_ext.c mm: use for_each_online_node and node_online instead of open coding 2022-04-29 14:36:58 -07:00
page_idle.c mm/rmap: Constify the rmap_walk_control argument 2022-03-21 13:01:35 -04:00
page_io.c MM: handle THP in swap_*page_fs() - count_vm_events() 2022-05-09 18:20:49 -07:00
page_isolation.c mm: wrap __find_buddy_pfn() with a necessary buddy page validation 2022-04-28 23:16:01 -07:00
page_owner.c mm/page_owner.c: record tgid 2022-03-24 19:06:44 -07:00
page_poison.c
page_reporting.c mm/page_reporting: allow driver to specify reporting order 2021-06-29 10:53:47 -07:00
page_reporting.h mm/page_reporting: export reporting order as module parameter 2021-06-29 10:53:47 -07:00
page_table_check.c mm/page_table_check.c: use strtobool for param parsing 2022-03-22 15:57:11 -07:00
page_vma_mapped.c mm: pvmw: add support for walking devmap pages 2022-04-28 23:16:10 -07:00
page-writeback.c mm: rework calculation of bdi_min_ratio in bdi_set_min_ratio 2022-04-28 23:15:57 -07:00
pagewalk.c mm: pagewalk: fix walk for hugepage tables 2021-06-29 10:53:49 -07:00
percpu-internal.h mm: memcg/percpu: account extra objcg space to memory cgroups 2022-01-15 16:30:31 +02:00
percpu-km.c percpu: flush tlb in pcpu_reclaim_populated() 2021-07-04 18:30:17 +00:00
percpu-stats.c mm: use vmalloc_array and vcalloc for array allocations 2022-03-08 09:30:46 -05:00
percpu-vm.c percpu: flush tlb in pcpu_reclaim_populated() 2021-07-04 18:30:17 +00:00
percpu.c bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
pgalloc-track.h mm: fix typos in comments 2021-05-07 00:26:35 -07:00
pgtable-generic.c mm: avoid unnecessary flush on change_huge_pmd() 2022-05-13 07:20:05 -07:00
process_vm_access.c mm/process_vm_access.c: remove duplicate include 2021-05-05 11:27:27 -07:00
ptdump.c mm: sparsemem: use page table lock to protect kernel pmd operations 2022-03-22 15:57:08 -07:00
readahead.c readahead: Update comments 2022-04-01 14:40:42 -04:00
rmap.c mm/rmap: Fix typos in comments 2022-05-13 07:20:07 -07:00
rodata_test.c
secretmem.c mm/secretmem: fix panic when growing a memfd_secret 2022-04-15 14:49:54 -07:00
shmem.c mm/shmem: take care of UFFDIO_COPY_MODE_WP 2022-05-13 07:20:10 -07:00
shuffle.c
shuffle.h mm/shuffle: fix section mismatch warning 2021-05-22 15:09:07 -10:00
slab_common.c mm: make minimum slab alignment a runtime property 2022-05-13 07:20:07 -07:00
slab.c mm: make minimum slab alignment a runtime property 2022-05-13 07:20:07 -07:00
slab.h mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-15 14:49:55 -07:00
slob.c mm: make minimum slab alignment a runtime property 2022-05-13 07:20:07 -07:00
slub.c mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-15 14:49:55 -07:00
sparse-vmemmap.c mm/sparse-vmemmap: improve memory savings for compound devmaps 2022-04-28 23:16:16 -07:00
sparse.c mm/sparse-vmemmap: add a pgmap argument to section activation 2022-04-28 23:16:15 -07:00
swap_cgroup.c mm: use vmalloc_array and vcalloc for array allocations 2022-03-08 09:30:46 -05:00
swap_slots.c treewide: Add missing includes masked by cgroup -> bpf dependency 2021-12-03 10:58:13 -08:00
swap_state.c mm: submit multipage reads for SWP_FS_OPS swap-space 2022-05-09 18:20:49 -07:00
swap.c mm/munlock: protect the per-CPU pagevec by a local_lock_t 2022-04-01 11:46:09 -07:00
swap.h mm: submit multipage write for SWP_FS_OPS swap-space 2022-05-09 18:20:49 -07:00
swapfile.c mm: introduce ->swap_rw and use it for reads from SWP_FS_OPS swap-space 2022-05-09 18:20:48 -07:00
truncate.c Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
usercopy.c Merge branch 'akpm' (patches from Andrew) 2022-03-22 16:11:53 -07:00
userfaultfd.c mm/shmem: take care of UFFDIO_COPY_MODE_WP 2022-05-13 07:20:10 -07:00
util.c mm: create new mm/swap.h header file 2022-05-09 18:20:47 -07:00
vmacache.c
vmalloc.c vmap(): don't allow invalid pages 2022-04-28 23:16:00 -07:00
vmpressure.c mm/vmpressure: fix data-race with memcg->socket_pressure 2021-11-06 13:30:40 -07:00
vmscan.c mm: submit multipage write for SWP_FS_OPS swap-space 2022-05-09 18:20:49 -07:00
vmstat.c mm/vmstat: add events for ksm cow 2022-04-28 23:16:16 -07:00
workingset.c memcg: sync flush only if periodic flush is delayed 2022-04-21 20:01:09 -07:00
z3fold.c mm/z3fold: remove unneeded PAGE_HEADLESS check in free_handle() 2022-04-28 23:16:06 -07:00
zbud.c mm/zbud: add kerneldoc fields for zbud_pool 2021-07-01 11:06:03 -07:00
zpool.c zpool: remove the list of pools_head 2022-01-15 16:30:31 +02:00
zsmalloc.c zsmalloc: replace get_cpu_var with local_lock 2022-01-22 08:33:37 +02:00
zswap.c mm: create new mm/swap.h header file 2022-05-09 18:20:47 -07:00