mainlining shenanigans
Go to file
David Matlack a3fe5dbda0 KVM: x86/mmu: Split huge pages mapped by the TDP MMU when dirty logging is enabled
When dirty logging is enabled without initially-all-set, try to split
all huge pages in the memslot down to 4KB pages so that vCPUs do not
have to take expensive write-protection faults to split huge pages.

Eager page splitting is best-effort only. This commit only adds the
support for the TDP MMU, and even there splitting may fail due to out
of memory conditions. Failures to split a huge page is fine from a
correctness standpoint because KVM will always follow up splitting by
write-protecting any remaining huge pages.

Eager page splitting moves the cost of splitting huge pages off of the
vCPU threads and onto the thread enabling dirty logging on the memslot.
This is useful because:

 1. Splitting on the vCPU thread interrupts vCPUs execution and is
    disruptive to customers whereas splitting on VM ioctl threads can
    run in parallel with vCPU execution.

 2. Splitting all huge pages at once is more efficient because it does
    not require performing VM-exit handling or walking the page table for
    every 4KiB page in the memslot, and greatly reduces the amount of
    contention on the mmu_lock.

For example, when running dirty_log_perf_test with 96 virtual CPUs, 1GiB
per vCPU, and 1GiB HugeTLB memory, the time it takes vCPUs to write to
all of their memory after dirty logging is enabled decreased by 95% from
2.94s to 0.14s.

Eager Page Splitting is over 100x more efficient than the current
implementation of splitting on fault under the read lock. For example,
taking the same workload as above, Eager Page Splitting reduced the CPU
required to split all huge pages from ~270 CPU-seconds ((2.94s - 0.14s)
* 96 vCPU threads) to only 1.55 CPU-seconds.

Eager page splitting does increase the amount of time it takes to enable
dirty logging since it has split all huge pages. For example, the time
it took to enable dirty logging in the 96GiB region of the
aforementioned test increased from 0.001s to 1.55s.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220119230739.2234394-16-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-10 13:50:42 -05:00
arch KVM: x86/mmu: Split huge pages mapped by the TDP MMU when dirty logging is enabled 2022-02-10 13:50:42 -05:00
block block: bio-integrity: Advance seed correctly for larger interval sizes 2022-02-03 21:09:24 -07:00
certs certs: Fix build error when CONFIG_MODULE_SIG_KEY is empty 2022-01-23 00:08:44 +09:00
crypto lib/crypto: blake2s: avoid indirect calls to compression function for Clang CFI 2022-02-04 19:22:32 +01:00
Documentation KVM: x86/mmu: Split huge pages mapped by the TDP MMU when dirty logging is enabled 2022-02-10 13:50:42 -05:00
drivers - Remove a bogus warning introduced by the recent PCI MSI irq affinity 2022-02-06 10:00:40 -08:00
fs Various bug fixes for ext4 fast commit and inline data handling. Also 2022-02-06 10:34:45 -08:00
include KVM: x86: Add checks for reserved-to-zero Hyper-V hypercall fields 2022-02-10 13:50:36 -05:00
init lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() 2022-01-22 08:33:37 +02:00
ipc ipc/sem: do not sleep with a spin lock held 2022-02-04 09:25:05 -08:00
kernel perf/urgent contains 3 fixups: 2022-02-06 10:11:14 -08:00
lib lib/crypto: blake2s: avoid indirect calls to compression function for Clang CFI 2022-02-04 19:22:32 +01:00
LICENSES LICENSES/LGPL-2.1: Add LGPL-2.1-or-later as valid identifiers 2021-12-16 14:33:10 +01:00
mm mm/kmemleak: avoid scanning potential huge holes 2022-02-04 09:25:05 -08:00
net A patch to make it possible to disable zero copy path in the messenger 2022-02-04 09:54:02 -08:00
samples Merge branch 'akpm' (patches from Andrew) 2022-01-20 10:41:01 +02:00
scripts ftrace: Have architectures opt-in for mcount build time sorting 2022-01-27 19:15:44 -05:00
security selinux/stable-5.17 PR 20220203 2022-02-03 16:44:12 -08:00
sound ASoC: Fixes for v5.17 2022-02-01 16:52:54 +01:00
tools perf tools fixes for v5.17: 1st batch 2022-02-06 10:18:23 -08:00
usr kbuild: remove include/linux/cyclades.h from header file check 2022-01-27 08:51:08 +01:00
virt KVM: Remove unused "kvm" of kvm_make_vcpu_request() 2022-02-10 13:47:15 -05:00
.clang-format genirq/msi: Make interrupt allocation less convoluted 2021-12-16 22:22:20 +01:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: ignore only top-level modules.builtin 2021-05-02 00:43:35 +09:00
.mailmap mailmap: update Christian Brauner's email address 2022-02-01 11:21:31 -08:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Removing Ohad from remoteproc/rpmsg maintenance 2021-12-08 10:09:40 -07:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS ata fixes for 5.17-rc3 2022-02-04 11:52:37 -08:00
Makefile Linux 5.17-rc3 2022-02-06 12:20:50 -08:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.