linux/include
Dan Williams 1f90a3477d mm: teach pfn_to_online_page() about ZONE_DEVICE section collisions
While pfn_to_online_page() is able to determine pfn_valid() at subsection
granularity it is not able to reliably determine if a given pfn is also
online if the section is mixes ZONE_{NORMAL,MOVABLE} with ZONE_DEVICE.
This means that pfn_to_online_page() may return invalid @page objects.
For example with a memory map like:

100000000-1fbffffff : System RAM
  142000000-143002e16 : Kernel code
  143200000-143713fff : Kernel rodata
  143800000-143b15b7f : Kernel data
  144227000-144ffffff : Kernel bss
1fc000000-2fbffffff : Persistent Memory (legacy)
  1fc000000-2fbffffff : namespace0.0

This command:

echo 0x1fc000000 > /sys/devices/system/memory/soft_offline_page

...succeeds when it should fail.  When it succeeds it touches an
uninitialized page and may crash or cause other damage (see
dissolve_free_huge_page()).

While the memory map above is contrived via the memmap=ss!nn kernel
command line option, the collision happens in practice on shipping
platforms.  The memory controller resources that decode spans of physical
address space are a limited resource.  One technique platform-firmware
uses to conserve those resources is to share a decoder across 2 devices to
keep the address range contiguous.  Unfortunately the unit of operation of
a decoder is 64MiB while the Linux section size is 128MiB.  This results
in situations where, without subsection hotplug memory mappings with
different lifetimes collide into one object that can only express one
lifetime.

Update move_pfn_range_to_zone() to flag (SECTION_TAINT_ZONE_DEVICE) a
section that mixes ZONE_DEVICE pfns with other online pfns.  With
SECTION_TAINT_ZONE_DEVICE to delineate, pfn_to_online_page() can fall back
to a slow-path check for ZONE_DEVICE pfns in an online section.  In the
fast path online_section() for a full ZONE_DEVICE section returns false.

Because the collision case is rare, and for simplicity, the
SECTION_TAINT_ZONE_DEVICE flag is never cleared once set.

[dan.j.williams@intel.com: fix CONFIG_ZONE_DEVICE=n build]
  Link: https://lkml.kernel.org/r/CAPcyv4iX+7LAgAeSqx7Zw-Zd=ZV9gBv8Bo7oTbwCOOqJoZ3+Yg@mail.gmail.com

Link: https://lkml.kernel.org/r/161058500675.1840162.7887862152161279354.stgit@dwillia2-desk3.amr.corp.intel.com
Fixes: ba72b4c8cf ("mm/sparsemem: support sub-section hotplug")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reported-by: David Hildenbrand <david@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26 09:41:00 -08:00
..
acpi IOMMU Updates for Linux v5.12 2021-02-22 10:31:29 -08:00
asm-generic Kbuild updates for v5.12 2021-02-25 10:17:31 -08:00
clocksource
crypto Keyrings miscellany 2021-02-23 16:09:23 -08:00
drm
dt-bindings Char/Misc driver patches for 5.12-rc1 2021-02-24 10:25:37 -08:00
keys
kunit
kvm
linux mm: teach pfn_to_online_page() about ZONE_DEVICE section collisions 2021-02-26 09:41:00 -08:00
math-emu
media Fixes around VM_FPNMAP and follow_pfn 2021-02-22 17:45:02 -08:00
memory
misc
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2021-02-17 13:19:24 -08:00
pcmcia
ras
rdma RDMA/ipoib: Remove racy Subnet Manager sendonly join checks 2021-02-16 14:42:58 -04:00
scsi
soc IOMMU Updates for Linux v5.12 2021-02-22 10:31:29 -08:00
sound ASoC: Updates for v5.12 2021-02-17 21:16:27 +01:00
target
trace mm/swap.c: don't pass "enum lru_list" to trace_mm_lru_insertion() 2021-02-24 13:38:33 -08:00
uapi Merge branch 'akpm' (patches from Andrew) 2021-02-24 16:20:38 -08:00
vdso
video
xen x86: 2021-02-21 13:31:43 -08:00