linux/mm
Peter Xu fe2567eb55 mm/shmem: allow uffd wr-protect none pte for file-backed mem
File-backed memory differs from anonymous memory in that even if the pte
is missing, the data could still resides either in the file or in
page/swap cache.  So when wr-protect a pte, we need to consider none ptes
too.

We do that by installing the uffd-wp pte markers when necessary.  So when
there's a future write to the pte, the fault handler will go the special
path to first fault-in the page as read-only, then report to userfaultfd
server with the wr-protect message.

On the other hand, when unprotecting a page, it's also possible that the
pte got unmapped but replaced by the special uffd-wp marker.  Then we'll
need to be able to recover from a uffd-wp pte marker into a none pte, so
that the next access to the page will fault in correctly as usual when
accessed the next time.

Special care needs to be taken throughout the change_protection_range()
process.  Since now we allow user to wr-protect a none pte, we need to be
able to pre-populate the page table entries if we see (!anonymous &&
MM_CP_UFFD_WP) requests, otherwise change_protection_range() will always
skip when the pgtable entry does not exist.

For example, the pgtable can be missing for a whole chunk of 2M pmd, but
the page cache can exist for the 2M range.  When we want to wr-protect one
4K page within the 2M pmd range, we need to pre-populate the pgtable and
install the pte marker showing that we want to get a message and block the
thread when the page cache of that 4K page is written.  Without
pre-populating the pmd, change_protection() will simply skip that whole
pmd.

Note that this patch only covers the small pages (pte level) but not
covering any of the transparent huge pages yet.  That will be done later,
and this patch will be a preparation for it too.

Link: https://lkml.kernel.org/r/20220405014850.14352-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-13 07:20:10 -07:00
..
damon mm/damon/reclaim: support online inputs update 2022-05-13 07:20:09 -07:00
kasan kasan: fix sleeping function called from invalid context on RT kernel 2022-04-29 14:36:58 -07:00
kfence kfence: enable check kfence canary on panic via boot param 2022-05-13 07:20:06 -07:00
backing-dev.c remove congestion tracking framework 2022-03-22 15:57:01 -07:00
balloon_compaction.c mm/balloon_compaction: make balloon page compaction callbacks static 2022-03-28 16:52:57 -04:00
bootmem_info.c bootmem: Use page->index instead of page->freelist 2022-01-06 12:27:03 +01:00
cma_debug.c mm/cma: change cma mutex to irq safe spinlock 2021-05-05 11:27:21 -07:00
cma_sysfs.c mm: cma: support sysfs 2021-05-05 11:27:24 -07:00
cma.c mm/cma: provide option to opt out from exposing pages on activation failure 2022-03-22 15:57:09 -07:00
cma.h mm/cma: provide option to opt out from exposing pages on activation failure 2022-03-22 15:57:09 -07:00
compaction.c mm: compaction: make sure highest is above the min_pfn 2022-04-28 23:16:19 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: add tests for __HAVE_ARCH_PTE_SWP_EXCLUSIVE 2022-05-09 18:20:45 -07:00
debug.c mm: unexport page_init_poison 2022-03-24 19:06:45 -07:00
dmapool.c mm/dmapool.c: revert "make dma pool to use kmalloc_node" 2022-01-15 16:30:28 +02:00
early_ioremap.c mm/early_ioremap: declare early_memremap_pgprot_adjust() 2022-03-22 15:57:11 -07:00
fadvise.c remove inode_congested() 2022-03-22 15:57:01 -07:00
failslab.c
filemap.c mm: teach core mm about pte markers 2022-05-13 07:20:09 -07:00
folio-compat.c mm/rmap: Convert rmap_walk() to take a folio 2022-03-21 13:01:35 -04:00
frontswap.c frontswap: remove support for multiple ops 2022-01-22 08:33:38 +02:00
gup_test.c selftests/vm: gup_test: test faulting in kernel, and verify pinnable pages 2021-05-05 11:27:26 -07:00
gup_test.h selftests/vm: gup_test: fix test flag 2021-05-05 11:27:26 -07:00
gup.c mm/gup: fix comments to pin_user_pages_*() 2022-05-09 18:20:47 -07:00
highmem.c highmem: fix checks in __kmap_local_sched_{in,out} 2022-04-08 14:20:36 -10:00
hmm.c mm: teach core mm about pte markers 2022-05-13 07:20:09 -07:00
huge_memory.c mm: avoid unnecessary flush on change_huge_pmd() 2022-05-13 07:20:05 -07:00
hugetlb_cgroup.c hugetlb: add hugetlb.*.numa_stat file 2022-01-15 16:30:29 +02:00
hugetlb_vmemmap.c mm/hugetlb_vmemmap: move comment block to Documentation/vm 2022-04-28 23:16:15 -07:00
hugetlb_vmemmap.h mm: hugetlb_vmemmap: cleanup CONFIG_HUGETLB_PAGE_FREE_VMEMMAP* 2022-04-28 23:16:15 -07:00
hugetlb.c mm: hugetlb: considering PMD sharing when flushing cache/TLBs 2022-05-13 07:20:07 -07:00
hwpoison-inject.c mm/hwpoison: avoid the impact of hwpoison_filter() return value on mce handler 2022-03-22 15:57:07 -07:00
init-mm.c kernel/fork: Initialize mm's PASID 2022-02-14 19:51:47 +01:00
internal.h mm: compaction: clean up comment for sched contention 2022-04-28 23:16:17 -07:00
interval_tree.c
io-mapping.c mm: add a io_mapping_map_user helper 2021-04-30 11:20:39 -07:00
ioremap.c mm: move ioremap_page_range to vmalloc.c 2021-09-08 11:50:24 -07:00
Kconfig mm/uffd: PTE_MARKER_UFFD_WP 2022-05-13 07:20:09 -07:00
Kconfig.debug mm: page table check 2022-01-15 16:30:28 +02:00
khugepaged.c mm/rmap: drop "compound" parameter from page_add_new_anon_rmap() 2022-05-09 18:20:43 -07:00
kmemleak.c mm: kmemleak: take a full lowmem check in kmemleak_*_phys() 2022-04-15 14:49:56 -07:00
ksm.c mm: remember exclusively mapped anonymous pages with PG_anon_exclusive 2022-05-09 18:20:44 -07:00
list_lru.c mm/list_lru.c: revert "mm/list_lru: optimize memcg_reparent_list_lru_node()" 2022-04-08 14:20:36 -10:00
maccess.c asm-generic updates for 5.18 2022-03-23 18:03:08 -07:00
madvise.c mm: submit multipage reads for SWP_FS_OPS swap-space 2022-05-09 18:20:49 -07:00
Makefile mm: hugetlb_vmemmap: cleanup CONFIG_HUGETLB_PAGE_FREE_VMEMMAP* 2022-04-28 23:16:15 -07:00
mapping_dirty_helpers.c mm: move tlb_flush_pending inline helpers to mm_inline.h 2022-01-15 16:30:27 +02:00
memblock.c memblock: test suite and a small cleanup 2022-03-27 13:36:06 -07:00
memcontrol.c mm: teach core mm about pte markers 2022-05-13 07:20:09 -07:00
memfd.c memfd: fix F_SEAL_WRITE after shmem huge page allocated 2022-03-05 11:08:32 -08:00
memory_hotplug.c mm/memory_hotplug: use pgprot_val to get value of pgprot 2022-05-13 07:20:07 -07:00
memory-failure.c mm: create new mm/swap.h header file 2022-05-09 18:20:47 -07:00
memory.c mm/shmem: persist uffd-wp bit across zapping for file-backed 2022-05-13 07:20:10 -07:00
mempolicy.c mm/mprotect: use mmu_gather 2022-05-13 07:20:05 -07:00
mempool.c mm: remove spurious blkdev.h includes 2021-10-18 06:17:01 -06:00
memremap.c mm/page-flags: reuse PG_mappedtodisk as PG_anon_exclusive for PageAnon() pages 2022-05-09 18:20:44 -07:00
memtest.c
migrate_device.c mm: remember exclusively mapped anonymous pages with PG_anon_exclusive 2022-05-09 18:20:44 -07:00
migrate.c mm: remember exclusively mapped anonymous pages with PG_anon_exclusive 2022-05-09 18:20:44 -07:00
mincore.c mm: teach core mm about pte markers 2022-05-13 07:20:09 -07:00
mlock.c mm/munlock: protect the per-CPU pagevec by a local_lock_t 2022-04-01 11:46:09 -07:00
mm_init.c include/linux/page-flags-layout.h: cleanups 2021-04-30 11:20:42 -07:00
mmap_lock.c mm: mmap_lock: fix disabling preemption directly 2021-07-23 17:43:28 -07:00
mmap.c mmap locking API: fix missed mmap_sem references in comments 2022-05-13 07:20:07 -07:00
mmu_gather.c mm/mmu_gather: limit free batch count and add schedule point in tlb_batch_pages_flush 2022-04-28 23:16:12 -07:00
mmu_notifier.c mm/mmu_notifier.c: fix race in mmu_interval_notifier_remove() 2022-04-21 20:01:10 -07:00
mmzone.c Folio changes for 5.18 2022-03-22 17:03:12 -07:00
mprotect.c mm/shmem: allow uffd wr-protect none pte for file-backed mem 2022-05-13 07:20:10 -07:00
mremap.c mm: hugetlb: considering PMD sharing when flushing cache/TLBs 2022-05-13 07:20:07 -07:00
msync.c
nommu.c no-MMU: expose vmalloc_huge() for alloc_large_system_hash() 2022-04-25 10:11:49 -07:00
oom_kill.c oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup 2022-04-21 20:01:10 -07:00
page_alloc.c mm/page_alloc: cache the result of node_dirty_ok() 2022-05-13 07:20:09 -07:00
page_counter.c mm/page_counter: remove an incorrect call to propagate_protected_usage() 2022-01-15 16:30:27 +02:00
page_ext.c mm: use for_each_online_node and node_online instead of open coding 2022-04-29 14:36:58 -07:00
page_idle.c mm/rmap: Constify the rmap_walk_control argument 2022-03-21 13:01:35 -04:00
page_io.c MM: handle THP in swap_*page_fs() - count_vm_events() 2022-05-09 18:20:49 -07:00
page_isolation.c mm: wrap __find_buddy_pfn() with a necessary buddy page validation 2022-04-28 23:16:01 -07:00
page_owner.c mm/page_owner.c: record tgid 2022-03-24 19:06:44 -07:00
page_poison.c
page_reporting.c mm/page_reporting: allow driver to specify reporting order 2021-06-29 10:53:47 -07:00
page_reporting.h mm/page_reporting: export reporting order as module parameter 2021-06-29 10:53:47 -07:00
page_table_check.c mm/page_table_check.c: use strtobool for param parsing 2022-03-22 15:57:11 -07:00
page_vma_mapped.c mm: pvmw: add support for walking devmap pages 2022-04-28 23:16:10 -07:00
page-writeback.c mm: rework calculation of bdi_min_ratio in bdi_set_min_ratio 2022-04-28 23:15:57 -07:00
pagewalk.c mm: pagewalk: fix walk for hugepage tables 2021-06-29 10:53:49 -07:00
percpu-internal.h mm: memcg/percpu: account extra objcg space to memory cgroups 2022-01-15 16:30:31 +02:00
percpu-km.c percpu: flush tlb in pcpu_reclaim_populated() 2021-07-04 18:30:17 +00:00
percpu-stats.c mm: use vmalloc_array and vcalloc for array allocations 2022-03-08 09:30:46 -05:00
percpu-vm.c percpu: flush tlb in pcpu_reclaim_populated() 2021-07-04 18:30:17 +00:00
percpu.c bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
pgalloc-track.h mm: fix typos in comments 2021-05-07 00:26:35 -07:00
pgtable-generic.c mm: avoid unnecessary flush on change_huge_pmd() 2022-05-13 07:20:05 -07:00
process_vm_access.c mm/process_vm_access.c: remove duplicate include 2021-05-05 11:27:27 -07:00
ptdump.c mm: sparsemem: use page table lock to protect kernel pmd operations 2022-03-22 15:57:08 -07:00
readahead.c readahead: Update comments 2022-04-01 14:40:42 -04:00
rmap.c mm/shmem: persist uffd-wp bit across zapping for file-backed 2022-05-13 07:20:10 -07:00
rodata_test.c
secretmem.c mm/secretmem: fix panic when growing a memfd_secret 2022-04-15 14:49:54 -07:00
shmem.c mm/shmem: take care of UFFDIO_COPY_MODE_WP 2022-05-13 07:20:10 -07:00
shuffle.c
shuffle.h mm/shuffle: fix section mismatch warning 2021-05-22 15:09:07 -10:00
slab_common.c mm: make minimum slab alignment a runtime property 2022-05-13 07:20:07 -07:00
slab.c mm: make minimum slab alignment a runtime property 2022-05-13 07:20:07 -07:00
slab.h mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-15 14:49:55 -07:00
slob.c mm: make minimum slab alignment a runtime property 2022-05-13 07:20:07 -07:00
slub.c mm, kfence: support kmem_dump_obj() for KFENCE objects 2022-04-15 14:49:55 -07:00
sparse-vmemmap.c mm/sparse-vmemmap: improve memory savings for compound devmaps 2022-04-28 23:16:16 -07:00
sparse.c mm/sparse-vmemmap: add a pgmap argument to section activation 2022-04-28 23:16:15 -07:00
swap_cgroup.c mm: use vmalloc_array and vcalloc for array allocations 2022-03-08 09:30:46 -05:00
swap_slots.c treewide: Add missing includes masked by cgroup -> bpf dependency 2021-12-03 10:58:13 -08:00
swap_state.c mm: submit multipage reads for SWP_FS_OPS swap-space 2022-05-09 18:20:49 -07:00
swap.c mm/munlock: protect the per-CPU pagevec by a local_lock_t 2022-04-01 11:46:09 -07:00
swap.h mm: submit multipage write for SWP_FS_OPS swap-space 2022-05-09 18:20:49 -07:00
swapfile.c mm: introduce ->swap_rw and use it for reads from SWP_FS_OPS swap-space 2022-05-09 18:20:48 -07:00
truncate.c Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
usercopy.c Merge branch 'akpm' (patches from Andrew) 2022-03-22 16:11:53 -07:00
userfaultfd.c mm/shmem: take care of UFFDIO_COPY_MODE_WP 2022-05-13 07:20:10 -07:00
util.c mm: create new mm/swap.h header file 2022-05-09 18:20:47 -07:00
vmacache.c
vmalloc.c vmap(): don't allow invalid pages 2022-04-28 23:16:00 -07:00
vmpressure.c mm/vmpressure: fix data-race with memcg->socket_pressure 2021-11-06 13:30:40 -07:00
vmscan.c mm: submit multipage write for SWP_FS_OPS swap-space 2022-05-09 18:20:49 -07:00
vmstat.c mm/vmstat: add events for ksm cow 2022-04-28 23:16:16 -07:00
workingset.c memcg: sync flush only if periodic flush is delayed 2022-04-21 20:01:09 -07:00
z3fold.c mm/z3fold: remove unneeded PAGE_HEADLESS check in free_handle() 2022-04-28 23:16:06 -07:00
zbud.c mm/zbud: add kerneldoc fields for zbud_pool 2021-07-01 11:06:03 -07:00
zpool.c zpool: remove the list of pools_head 2022-01-15 16:30:31 +02:00
zsmalloc.c zsmalloc: replace get_cpu_var with local_lock 2022-01-22 08:33:37 +02:00
zswap.c mm: create new mm/swap.h header file 2022-05-09 18:20:47 -07:00