linux/drivers
Jason Gunthorpe 57efa1fe59 mm/gup: prevent gup_fast from racing with COW during fork
Since commit 70e806e4e6 ("mm: Do early cow for pinned pages during
fork() for ptes") pages under a FOLL_PIN will not be write protected
during COW for fork.  This means that pages returned from
pin_user_pages(FOLL_WRITE) should not become write protected while the pin
is active.

However, there is a small race where get_user_pages_fast(FOLL_PIN) can
establish a FOLL_PIN at the same time copy_present_page() is write
protecting it:

        CPU 0                             CPU 1
   get_user_pages_fast()
    internal_get_user_pages_fast()
                                       copy_page_range()
                                         pte_alloc_map_lock()
                                           copy_present_page()
                                             atomic_read(has_pinned) == 0
					     page_maybe_dma_pinned() == false
     atomic_set(has_pinned, 1);
     gup_pgd_range()
      gup_pte_range()
       pte_t pte = gup_get_pte(ptep)
       pte_access_permitted(pte)
       try_grab_compound_head()
                                             pte = pte_wrprotect(pte)
	                                     set_pte_at();
                                         pte_unmap_unlock()
      // GUP now returns with a write protected page

The first attempt to resolve this by using the write protect caused
problems (and was missing a barrrier), see commit f3c64eda3e ("mm: avoid
early COW write protect games during fork()")

Instead wrap copy_p4d_range() with the write side of a seqcount and check
the read side around gup_pgd_range().  If there is a collision then
get_user_pages_fast() fails and falls back to slow GUP.

Slow GUP is safe against this race because copy_page_range() is only
called while holding the exclusive side of the mmap_lock on the src
mm_struct.

[akpm@linux-foundation.org: coding style fixes]
  Link: https://lore.kernel.org/r/CAHk-=wi=iCnYCARbPGjkVJu9eyYeZ13N64tZYLdOB8CP5Q_PLw@mail.gmail.com

Link: https://lkml.kernel.org/r/2-v4-908497cf359a+4782-gup_fork_jgg@nvidia.com
Fixes: f3c64eda3e ("mm: avoid early COW write protect games during fork()")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: "Ahmed S. Darwish" <a.darwish@linutronix.de>	[seqcount_t parts]
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kirill Shutemov <kirill@shutemov.name>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15 12:13:39 -08:00
..
accessibility speakup: Reject setting the speakup line discipline outside of speakup 2020-11-30 09:20:32 +01:00
acpi arm64 fixes for -rc6 2020-11-27 10:44:59 -08:00
amba
android task_work: cleanup notification modes 2020-10-17 15:05:30 -06:00
ata libata-5.10-2020-10-30 2020-10-30 14:51:01 -07:00
atm atm: nicstar: Unmap DMA on send error 2020-11-18 16:42:07 -08:00
auxdisplay
base PM: runtime: Resume the device earlier in __device_release_driver() 2020-11-02 18:14:07 +01:00
bcma bcma: use semicolons rather than commas to separate statements 2020-10-01 16:23:50 +03:00
block xen: add helpers for caching grant mapping pages 2020-12-09 10:31:37 +01:00
bluetooth Bluetooth: btintel: Replace zero-length array with flexible-array member 2020-10-30 16:57:41 -05:00
bus bus: ti-sysc: suppress err msg for timers used as clockevent/source 2020-11-19 11:05:48 +02:00
cdrom
char Char/Misc driver fixes for 5.10-rc4 2020-11-15 10:15:17 -08:00
clk clk: renesas: r9a06g032: Drop __packed for portability 2020-12-07 13:58:49 -08:00
clocksource treewide: Convert macro and uses of __section(foo) to __section("foo") 2020-10-25 14:51:49 -07:00
connector
counter counter/ti-eqep: Fix regmap max_register 2020-11-01 17:17:31 +00:00
cpufreq Merge branch 'cpufreq/arm/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm 2020-11-23 12:55:01 +01:00
cpuidle cpuidle: tegra: Annotate tegra_pm_set_cpu_in_lp2() with RCU_NONIDLE 2020-11-16 13:24:32 +01:00
crypto crypto: sun8x-ce*: update entries to its documentation 2020-10-28 11:41:15 -06:00
dax device-dax/kmem: use struct_size() 2020-12-15 12:13:38 -08:00
dca
devfreq PM / devfreq: tegra30: Improve initial hardware resetting 2020-09-29 17:50:10 +09:00
dio
dma dmaengine fixes for v5.10-rc5 2020-11-20 10:23:49 -08:00
dma-buf dma-buf: use krealloc_array() 2020-12-15 12:13:37 -08:00
edac edac: ghes: use krealloc_array() 2020-12-15 12:13:37 -08:00
eisa
extcon extcon: axp288: Use module_platform_driver to simplify the code 2020-09-30 00:40:06 +09:00
firewire
firmware mm/gup: prevent gup_fast from racing with COW during fork 2020-12-15 12:13:39 -08:00
fpga fpga: Specify HAS_IOMEM dependency for FPGA_DFL 2020-12-01 18:46:24 +01:00
fsi
gnss
gpio gpio: eic-sprd: break loop when getting NULL device resource 2020-12-09 09:41:49 +01:00
gpu drm: atomic: use krealloc_array() 2020-12-15 12:13:37 -08:00
greybus
hid Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid 2020-11-22 14:36:06 -08:00
hsi
hv hyperv-fixes for 5.10-rc5 2020-11-16 15:02:33 -08:00
hwmon hwmon: (amd_energy) modify the visibility of the counters 2020-11-13 06:46:20 -08:00
hwspinlock
hwtracing hwtracing: intel: use krealloc_array() 2020-12-15 12:13:37 -08:00
i2c i2c: mlxbf: Fix the return check of devm_ioremap and ioremap 2020-12-05 14:52:35 +01:00
i3c * Fix DAA for the pre-reserved address case 2020-10-17 11:01:01 -07:00
ide ide: remove BUG_ON(in_interrupt() || irqs_disabled()) from ide_unregister() 2020-12-15 12:13:36 -08:00
idle intel_idle: Build fix 2020-12-03 10:00:23 +01:00
iio iio: accel: kxcjk1013: Add support for KIOX010A ACPI DSM for setting tablet-mode 2020-11-14 17:33:47 +00:00
infiniband RDMA/cm: Fix an attempt to use non-valid pointer when cleaning timewait 2020-12-09 15:51:35 -04:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2020-12-12 09:41:33 -08:00
interconnect interconnect: fix memory trashing in of_count_icc_providers() 2020-11-20 16:01:35 +02:00
iommu iommu fix for 5.10 2020-12-09 09:59:14 -08:00
ipack
irqchip irqchip fixes for Linux 5.10, take #2 2020-11-25 00:56:28 +01:00
isdn
leds leds: pwm: Remove platform_data support 2020-10-07 12:02:58 +02:00
lightnvm lightnvm: fix out-of-bounds write to array devices->info[] 2020-10-16 09:28:45 -06:00
macintosh powerpc updates for 5.10 2020-10-16 12:21:15 -07:00
mailbox ARM: SoC-related driver updates 2020-10-24 10:39:22 -07:00
mcb
md block-5.10-2020-12-12 2020-12-13 10:36:23 -08:00
media media: vidtv: fix some warnings 2020-12-08 08:15:49 +01:00
memory ARM: SoC-related driver updates 2020-10-24 10:39:22 -07:00
memstick Merge branch 'fixes' into next 2020-09-28 12:17:36 +02:00
message scsi: mptfusion: Fix null pointer dereferences in mptscsih_remove() 2020-10-26 16:57:18 -04:00
mfd - New Drivers 2020-10-14 15:56:58 -07:00
misc at24 fixes for v5.10 2020-12-11 23:23:30 +01:00
mmc mmc: mediatek: mark PM functions as __maybe_unused 2020-12-04 15:35:54 +01:00
most
mtd mtd: rawnand: xway: Do not force a particular software ECC engine 2020-12-11 20:10:02 +01:00
mux
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf 2020-12-10 14:29:30 -08:00
nfc nfc: s3fwrn5: use signed integer for parsing GPIO numbers 2020-11-24 15:00:53 -08:00
ntb Bug fixes for v5.10 2020-10-25 11:12:31 -07:00
nubus
nvdimm mm/memremap_pages: support multiple ranges per invocation 2020-10-13 18:38:28 -07:00
nvme nvme: fix memory leak freeing command effects 2020-11-14 09:57:55 +01:00
nvmem nvmem: core: fix possibly memleak when use nvmem_cell_info_to_nvmem_cell() 2020-09-27 14:25:48 +02:00
of of/address: Fix of_node memory leak in of_dma_is_coherent 2020-11-11 17:10:16 -06:00
opp opp: Reduce the size of critical section in _opp_table_kref_release() 2020-10-27 13:21:03 +05:30
oprofile
parisc dma-mapping: split <linux/dma-mapping.h> 2020-10-06 07:07:03 +02:00
parport
pci PCI: mvebu: Fix duplicate resource requests 2020-11-04 13:55:30 -06:00
pcmcia
perf perf: arm-cmn: Fix conversion specifiers for node type 2020-10-01 22:30:07 +01:00
phy phy: mediatek: fix spelling mistake in Kconfig "veriosn" -> "version" 2020-11-16 13:21:28 +05:30
pinctrl pinctrl: use krealloc_array() 2020-12-15 12:13:37 -08:00
platform platform/x86: touchscreen_dmi: Add info for the Irbis TW118 tablet 2020-11-26 15:49:16 +01:00
pnp PNP: fix kernel-doc markups 2020-10-27 19:23:04 +01:00
power ARM: SoC platform updates 2020-10-24 10:33:08 -07:00
powercap Merge branch 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux 2020-11-10 10:02:31 -08:00
pps
ps3
ptp ptp: clockmatrix: bug fix for idtcm_strverscmp 2020-11-25 17:24:49 -08:00
pwm pwm: sl28cpld: fix getting driver data in pwm callbacks 2020-12-03 09:57:37 -08:00
rapidio rapidio: fix the missed put_device() for rio_mport_add_riodev 2020-10-16 11:11:22 -07:00
ras RAS/CEC: Convert to DEFINE_SHOW_ATTRIBUTE() 2020-09-25 19:05:31 +02:00
regulator regulator: ti-abb: Fix array out of bound read access on the first transition 2020-11-18 17:59:24 +00:00
remoteproc remoteproc updates for v5.10 2020-10-22 12:56:33 -07:00
reset ARM: SoC-related driver updates 2020-10-24 10:39:22 -07:00
rpmsg rpmsg updates for 5.10 2020-10-22 12:58:21 -07:00
rtc RTC for 5.10 2020-10-21 11:22:08 -07:00
s390 Networking fixes for 5.10-rc6, including fixes from the WiFi driver, 2020-11-27 14:38:02 -08:00
sbus
scsi SCSI fixes on 20201212 2020-12-12 12:57:12 -08:00
sfi
sh
siox
slimbus slimbus: qcom-ngd-ctrl: disable ngd in qmi server down callback 2020-09-25 14:41:51 +02:00
soc NXP/FSL SoC driver fix for 5.10 2020-11-26 22:07:22 +01:00
soundwire soundwire updates for 5.10-rc1 2020-10-01 22:59:55 +02:00
spi spi: dw: Fix spi registration for controllers overriding CS 2020-11-25 12:54:05 +00:00
spmi
ssb
staging media fixes for v5.10-rc6 2020-11-25 10:35:44 -08:00
target SCSI fixes on 20201120 2020-11-20 16:24:28 -08:00
tc
tee ARM: SoC fixes for v5.10, part 3 2020-11-27 14:48:03 -08:00
thermal thermal: ti-soc-thermal: Disable the CPU PM notifier for OMAP4430 2020-11-12 12:30:29 +01:00
thunderbolt thunderbolt: Fix use-after-free in remove_unplugged_switch() 2020-11-19 17:44:10 +03:00
tty tty: Fix ->session locking 2020-12-04 17:39:58 +01:00
uio uio: Fix use-after-free in uio_unregister_device() 2020-11-09 18:54:30 +01:00
usb usb: gadget: f_fs: Use local copy of descriptors for userspace copy 2020-12-04 16:09:10 +01:00
vdpa vdpa: mlx5: fix vdpa/vhost dependencies 2020-12-02 04:09:56 -05:00
vfio vfio/pci: Bypass IGD init in case of -ENODEV 2020-11-03 11:07:40 -07:00
vhost vhost: vringh: use krealloc_array() 2020-12-15 12:13:37 -08:00
video hyperv-fixes for 5.10-rc6 2020-11-23 15:29:03 -08:00
virt nitro_enclaves: Fixup type and simplify logic of the poll mask setup 2020-11-09 18:20:36 +01:00
virtio vhost,vdpa,virtio: cleanups, fixes 2020-10-23 11:00:57 -07:00
visorbus
vlynq
vme
w1 w1: w1_therm: make w1_poll_completion static 2020-10-05 14:49:24 +02:00
watchdog ARM: SoC platform updates 2020-10-24 10:33:08 -07:00
xen xen: don't use page->lru for ZONE_DEVICE memory 2020-12-09 10:31:41 +01:00
zorro
Kconfig
Makefile vdpa: mlx5: fix vdpa/vhost dependencies 2020-12-02 04:09:56 -05:00