linux/mm
Alistair Popple b756a3b5e7 mm: device exclusive memory access
Some devices require exclusive write access to shared virtual memory (SVM)
ranges to perform atomic operations on that memory.  This requires CPU
page tables to be updated to deny access whilst atomic operations are
occurring.

In order to do this introduce a new swap entry type
(SWP_DEVICE_EXCLUSIVE).  When a SVM range needs to be marked for exclusive
access by a device all page table mappings for the particular range are
replaced with device exclusive swap entries.  This causes any CPU access
to the page to result in a fault.

Faults are resovled by replacing the faulting entry with the original
mapping.  This results in MMU notifiers being called which a driver uses
to update access permissions such as revoking atomic access.  After
notifiers have been called the device will no longer have exclusive access
to the region.

Walking of the page tables to find the target pages is handled by
get_user_pages() rather than a direct page table walk.  A direct page
table walk similar to what migrate_vma_collect()/unmap() does could also
have been utilised.  However this resulted in more code similar in
functionality to what get_user_pages() provides as page faulting is
required to make the PTEs present and to break COW.

[dan.carpenter@oracle.com: fix signedness bug in make_device_exclusive_range()]
  Link: https://lkml.kernel.org/r/YNIz5NVnZ5GiZ3u1@mwanda

Link: https://lkml.kernel.org/r/20210616105937.23201-8-apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-01 11:06:03 -07:00
..
kasan kasan: add memory corruption identification support for hardware tag-based mode 2021-06-29 10:53:53 -07:00
kfence kfence: unconditionally use unbound work queue 2021-07-01 11:06:03 -07:00
backing-dev.c writeback, cgroup: release dying cgwbs by switching attached inodes 2021-06-29 10:53:48 -07:00
balloon_compaction.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
bootmem_info.c mm: memory_hotplug: factor out bootmem core functions to bootmem_info.c 2021-06-30 20:47:25 -07:00
cleancache.c
cma_debug.c mm/cma: change cma mutex to irq safe spinlock 2021-05-05 11:27:21 -07:00
cma_sysfs.c mm: cma: support sysfs 2021-05-05 11:27:24 -07:00
cma.c mm: use proper type for cma_[alloc|release] 2021-05-05 11:27:24 -07:00
cma.h mm: cma: support sysfs 2021-05-05 11:27:24 -07:00
compaction.c mm/compaction: fix 'limit' in fast_isolate_freepages 2021-06-30 20:47:29 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm/swapops: rework swap entry manipulation code 2021-07-01 11:06:03 -07:00
debug.c mm/debug: factor PagePoisoned out of __dump_page 2021-06-29 10:53:53 -07:00
dmapool.c mm/dmapool: use DEVICE_ATTR_RO macro 2021-06-29 10:53:52 -07:00
early_ioremap.c mm/early_ioremap.c: use __func__ instead of function name 2021-02-26 09:41:02 -08:00
fadvise.c mm, fadvise: improve the expensive remote LRU cache draining after FADV_DONTNEED 2020-10-13 18:38:29 -07:00
failslab.c
filemap.c mm: charge active memcg when no mm is set 2021-06-29 10:53:50 -07:00
frontswap.c mm/mempool: minor coding style tweaks 2021-05-05 11:27:27 -07:00
gup_test.c selftests/vm: gup_test: test faulting in kernel, and verify pinnable pages 2021-05-05 11:27:26 -07:00
gup_test.h selftests/vm: gup_test: fix test flag 2021-05-05 11:27:26 -07:00
gup.c mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables 2021-06-30 20:47:30 -07:00
highmem.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
hmm.c mm: device exclusive memory access 2021-07-01 11:06:03 -07:00
huge_memory.c mm/rmap: split migration into its own function 2021-07-01 11:06:03 -07:00
hugetlb_cgroup.c hugetlb: make free_huge_page irq safe 2021-05-05 11:27:22 -07:00
hugetlb_vmemmap.c mm: hugetlb: introduce CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON 2021-06-30 20:47:26 -07:00
hugetlb_vmemmap.h mm: hugetlb: introduce nr_free_vmemmap_pages in the struct hstate 2021-06-30 20:47:25 -07:00
hugetlb.c mm/swapops: rework swap entry manipulation code 2021-07-01 11:06:03 -07:00
hwpoison-inject.c mm,hwpoison-inject: don't pin for hwpoison_filter 2020-10-16 11:11:16 -07:00
init-mm.c mm/gup: prevent gup_fast from racing with COW during fork 2020-12-15 12:13:39 -08:00
internal.h mm/page_alloc: move prototype for find_suitable_fallback 2021-07-01 11:06:03 -07:00
interval_tree.c mm/interval_tree: add comments to improve code readability 2021-04-30 11:20:38 -07:00
io-mapping.c mm: add a io_mapping_map_user helper 2021-04-30 11:20:39 -07:00
ioremap.c mm/ioremap: fix iomap_max_page_shift 2021-05-14 19:41:32 -07:00
Kconfig mm: generalize ZONE_[DMA|DMA32] 2021-06-30 20:47:30 -07:00
Kconfig.debug mm, page_poison: remove CONFIG_PAGE_POISONING_ZERO 2020-12-15 12:13:46 -08:00
khugepaged.c mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs 2021-06-30 20:47:29 -07:00
kmemleak.c mm/kmemleak: fix possible wrong memory scanning period 2021-06-29 10:53:47 -07:00
ksm.c mm/ksm: use vma_lookup() in find_mergeable_vma() 2021-06-29 10:53:52 -07:00
list_lru.c mm: vmscan: consolidate shrinker_maps handling code 2021-05-05 11:27:23 -07:00
maccess.c uaccess: add force_uaccess_{begin,end} helpers 2020-08-12 10:57:59 -07:00
madvise.c mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables 2021-06-30 20:47:30 -07:00
Makefile mm: hugetlb: free the vmemmap pages associated with each HugeTLB page 2021-06-30 20:47:25 -07:00
mapping_dirty_helpers.c mm/mapping_dirty_helpers: remove double Note in kerneldoc 2021-07-01 11:06:02 -07:00
memblock.c memblock: update initialization of reserved pages 2021-06-30 20:47:29 -07:00
memcontrol.c mm: remove special swap entry functions 2021-07-01 11:06:03 -07:00
memfd.c
memory_hotplug.c mm/memory_hotplug: fix kerneldoc comment for __remove_memory 2021-07-01 11:06:02 -07:00
memory-failure.c mm: fix spelling mistakes 2021-07-01 11:06:02 -07:00
memory.c mm: device exclusive memory access 2021-07-01 11:06:03 -07:00
mempolicy.c mm/mempolicy: use unified 'nodes' for bind/interleave/prefer policies 2021-06-30 20:47:29 -07:00
mempool.c mm/mempool: minor coding style tweaks 2021-05-05 11:27:27 -07:00
memremap.c mm/memremap.c: fix improper SPDX comment style 2021-04-30 11:20:37 -07:00
memtest.c
migrate.c mm: rename migrate_pgmap_owner 2021-07-01 11:06:03 -07:00
mincore.c inode: make init and permission helpers idmapped mount aware 2021-01-24 14:27:16 +01:00
mlock.c mm/rmap: split try_to_munlock from try_to_unmap 2021-07-01 11:06:03 -07:00
mm_init.c include/linux/page-flags-layout.h: cleanups 2021-04-30 11:20:42 -07:00
mmap_lock.c mm/mmap_lock: remove dead code for !CONFIG_TRACING configurations 2021-07-01 11:06:03 -07:00
mmap.c mm/mmap: use find_vma_intersection() in do_mmap() for overlap 2021-06-29 10:53:51 -07:00
mmu_gather.c mm: eliminate "expecting prototype" kernel-doc warnings 2021-04-16 16:10:36 -07:00
mmu_notifier.c mm/mmu_notifiers: ensure range_end() is paired with range_start() 2021-03-25 09:22:55 -07:00
mmzone.c mm/lru: replace pgdat lru_lock with lruvec lock 2020-12-15 14:48:04 -08:00
mprotect.c mm: device exclusive memory access 2021-07-01 11:06:03 -07:00
mremap.c mm/mremap: use vma_lookup() in vma_to_resize() 2021-06-29 10:53:52 -07:00
msync.c mm/msync: exit early when the flags is an MS_ASYNC and start < vm_start 2021-04-30 11:20:37 -07:00
nommu.c mm/nommu: unexport do_munmap() 2021-06-30 20:47:30 -07:00
oom_kill.c mm/mempolicy: cleanup nodemask intersection check for oom 2021-06-30 20:47:29 -07:00
page_alloc.c mm/page_alloc: make should_fail_alloc_page() static 2021-07-01 11:06:02 -07:00
page_counter.c mm: page_counter: mitigate consequences of a page_counter underflow 2021-04-30 11:20:38 -07:00
page_ext.c mm: replace CONFIG_FLAT_NODE_MEM_MAP with CONFIG_FLATMEM 2021-06-29 10:53:55 -07:00
page_idle.c mm: page_idle_get_page() does not need lru_lock 2020-12-15 14:48:03 -08:00
page_io.c swap: fix swapfile read/write offset 2021-03-02 17:25:46 -07:00
page_isolation.c mm/page_isolation: do not isolate the max order page 2020-12-15 12:13:45 -08:00
page_owner.c mm/page_owner: constify dump_page_owner 2021-06-29 10:53:53 -07:00
page_poison.c mm: page_poison: print page info when corruption is caught 2021-04-30 11:20:36 -07:00
page_reporting.c mm/page_reporting: allow driver to specify reporting order 2021-06-29 10:53:47 -07:00
page_reporting.h mm/page_reporting: export reporting order as module parameter 2021-06-29 10:53:47 -07:00
page_vma_mapped.c mm: device exclusive memory access 2021-07-01 11:06:03 -07:00
page-writeback.c fs: remove noop_set_page_dirty() 2021-06-29 10:53:48 -07:00
pagewalk.c mm: pagewalk: fix walk for hugepage tables 2021-06-29 10:53:49 -07:00
percpu-internal.h mm: fix typos in comments 2021-05-07 00:26:35 -07:00
percpu-km.c mm: memcg/percpu: account percpu memory to memory cgroups 2020-08-12 10:57:55 -07:00
percpu-stats.c percpu: make pcpu_nr_empty_pop_pages per chunk type 2021-04-09 13:58:38 +00:00
percpu-vm.c mm/vmalloc: remove unmap_kernel_range 2021-04-30 11:20:40 -07:00
percpu.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
pgalloc-track.h mm: fix typos in comments 2021-05-07 00:26:35 -07:00
pgtable-generic.c mm/thp: fix __split_huge_pmd_locked() on shmem migration entry 2021-06-16 09:24:42 -07:00
process_vm_access.c mm/process_vm_access.c: remove duplicate include 2021-05-05 11:27:27 -07:00
ptdump.c mm: ptdump: fix build failure 2021-04-16 16:10:37 -07:00
readahead.c mm: Implement readahead_control pageset expansion 2021-04-23 10:14:29 +01:00
rmap.c mm: device exclusive memory access 2021-07-01 11:06:03 -07:00
rodata_test.c mm/rodata_test.c: fix missing function declaration 2020-08-21 09:52:53 -07:00
shmem.c userfaultfd/shmem: modify shmem_mfill_atomic_pte to use install_pte() 2021-06-30 20:47:27 -07:00
shuffle.c mm: eliminate "expecting prototype" kernel-doc warnings 2021-04-16 16:10:36 -07:00
shuffle.h mm/shuffle: fix section mismatch warning 2021-05-22 15:09:07 -10:00
slab_common.c mm: memcg/slab: disable cache merging for KMALLOC_NORMAL caches 2021-06-29 10:53:49 -07:00
slab.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
slab.h mm: memcg/slab: properly set up gfp flags for objcg pointer array 2021-06-29 10:53:49 -07:00
slob.c mm: Don't build mm_dump_obj() on CONFIG_PRINTK=n kernels 2021-03-08 14:18:46 -08:00
slub.c mm/slub: add taint after the errors are printed 2021-06-29 10:53:47 -07:00
sparse-vmemmap.c mm: sparsemem: split the huge PMD mapping of vmemmap pages 2021-06-30 20:47:26 -07:00
sparse.c mm: memory_hotplug: factor out bootmem core functions to bootmem_info.c 2021-06-30 20:47:25 -07:00
swap_cgroup.c mm: memcontrol: make swap tracking an integral part of memory control 2020-06-03 20:09:48 -07:00
swap_slots.c mm/swap_slots.c: delete meaningless forward declarations 2021-06-29 10:53:49 -07:00
swap_state.c swap: check mapping_empty() for swap cache before being freed 2021-06-29 10:53:49 -07:00
swap.c mm: fix typos and grammar error in comments 2021-07-01 11:06:02 -07:00
swapfile.c mm: fix spelling mistakes 2021-07-01 11:06:02 -07:00
truncate.c mm/thp: unmap_mapping_page() to fix THP truncate_cleanup_page() 2021-06-16 09:24:42 -07:00
usercopy.c mm/usercopy.c: delete duplicated word 2020-08-12 10:57:58 -07:00
userfaultfd.c userfaultfd/shmem: modify shmem_mfill_atomic_pte to use install_pte() 2021-06-30 20:47:27 -07:00
util.c mm: introduce page_offline_(begin|end|freeze|thaw) to synchronize setting PageOffline() 2021-06-30 20:47:28 -07:00
vmacache.c kernel: better document the use_mm/unuse_mm API contract 2020-06-10 19:14:18 -07:00
vmalloc.c mm/vmalloc: include header for prototype of set_iounmap_nonlazy 2021-07-01 11:06:02 -07:00
vmpressure.c
vmscan.c mm/vmscan: remove kerneldoc-like comment from isolate_lru_pages 2021-07-01 11:06:02 -07:00
vmstat.c mm/vmstat: inline NUMA event counter updates 2021-06-29 10:53:54 -07:00
workingset.c mm: workingset: define macro WORKINGSET_SHIFT 2021-06-30 20:47:28 -07:00
z3fold.c mm/z3fold: add kerneldoc fields for z3fold_pool 2021-07-01 11:06:03 -07:00
zbud.c mm/zbud: add kerneldoc fields for zbud_pool 2021-07-01 11:06:03 -07:00
zpool.c mm: fix typos in comments 2021-05-07 00:26:35 -07:00
zsmalloc.c mm/zsmalloc.c: improve readability for async_free_zspage() 2021-07-01 11:06:02 -07:00
zswap.c mm/zswap.c: fix two bugs in zswap_writeback_entry() 2021-06-30 20:47:31 -07:00