mainlining shenanigans
Go to file
Mike Kravetz 3a47c54f09 hugetlbfs: revert use i_mmap_rwsem for more pmd sharing synchronization
Commit c0d0381ade ("hugetlbfs: use i_mmap_rwsem for more pmd sharing
synchronization") added code to take i_mmap_rwsem in read mode for the
duration of fault processing.  However, this has been shown to cause
performance/scaling issues.  Revert the code and go back to only taking
the semaphore in huge_pmd_share during the fault path.

Keep the code that takes i_mmap_rwsem in write mode before calling
try_to_unmap as this is required if huge_pmd_unshare is called.

NOTE: Reverting this code does expose the following race condition.

Faulting thread                                 Unsharing thread
...                                                  ...
ptep = huge_pte_offset()
      or
ptep = huge_pte_alloc()
...
                                                i_mmap_lock_write
                                                lock page table
ptep invalid   <------------------------        huge_pmd_unshare()
Could be in a previously                        unlock_page_table
sharing process or worse                        i_mmap_unlock_write
...
ptl = huge_pte_lock(ptep)
get/update pte
set_pte_at(pte, ptep)

It is unknown if the above race was ever experienced by a user.  It was
discovered via code inspection when initially addressed.

In subsequent patches, a new synchronization mechanism will be added to
coordinate pmd sharing and eliminate this race.

Link: https://lkml.kernel.org/r/20220914221810.95771-3-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: James Houghton <jthoughton@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Prakash Sangappa <prakash.sangappa@oracle.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03 14:03:16 -07:00
arch riscv: use vma iterator for vdso 2022-09-26 19:46:26 -07:00
block block-6.0-2022-08-26 2022-08-26 11:05:54 -07:00
certs Kbuild updates for v5.20 2022-08-10 10:40:41 -07:00
crypto crypto: blake2b: effectively disable frame size warning 2022-08-10 17:59:11 -07:00
Documentation mm/huge_memory: prevent THP_ZERO_PAGE_ALLOC increased twice 2022-10-03 14:03:08 -07:00
drivers mm: hugetlb: eliminate memory-less nodes handling 2022-10-03 14:03:15 -07:00
fs hugetlbfs: revert use i_mmap_rwsem for more pmd sharing synchronization 2022-10-03 14:03:16 -07:00
include mm/filemap: make folio_put_wait_locked static 2022-10-03 14:03:15 -07:00
init Maple Tree: add new data structure 2022-09-26 19:46:13 -07:00
io_uring io_uring/net: save address for sendzc async execution 2022-08-25 07:52:30 -06:00
ipc ipc/shm: use VMA iterator instead of linked list 2022-09-26 19:46:21 -07:00
kernel uprobes: use new_folio in __replace_page() 2022-10-03 14:02:55 -07:00
lib mm/hmm/test: use char dev with struct device to get device node 2022-10-03 14:03:03 -07:00
LICENSES LICENSES/LGPL-2.1: Add LGPL-2.1-or-later as valid identifiers 2021-12-16 14:33:10 +01:00
mm hugetlbfs: revert use i_mmap_rwsem for more pmd sharing synchronization 2022-10-03 14:03:16 -07:00
net Including fixes from ipsec and netfilter (with one broken Fixes tag). 2022-08-25 14:03:58 -07:00
samples Tracing updates for 5.20 / 6.0 2022-08-05 09:41:12 -07:00
scripts asm goto: eradicate CC_HAS_ASM_GOTO 2022-08-21 10:06:28 -07:00
security hardening fixes for v6.0-rc2 2022-08-19 13:56:14 -07:00
sound sound fixes for 6.0-rc2 2022-08-19 09:46:11 -07:00
tools selftest/damon: add a test for duplicate context dirs creation 2022-10-03 14:03:06 -07:00
usr Not a lot of material this cycle. Many singleton patches against various 2022-05-27 11:22:03 -07:00
virt KVM: Drop unnecessary initialization of "ops" in kvm_ioctl_create_device() 2022-08-19 04:05:43 -04:00
.clang-format PCI/DOE: Add DOE mailbox support functions 2022-07-19 15:38:04 -07:00
.cocciconfig
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore kbuild: split the second line of *.mod into *.usyms 2022-05-08 03:16:59 +09:00
.mailmap .mailmap: update Luca Ceresoli's e-mail address 2022-08-28 14:02:46 -07:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS drm for 5.20/6.0 2022-08-03 19:52:08 -07:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS kasan: move tests to mm/kasan/ 2022-10-03 14:03:02 -07:00
Makefile Linux 6.0-rc3 2022-08-28 15:05:29 -07:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.