linux/mm
Ben Widawsky b7e3debdd0 mm/memory_hotplug.c: fix false softlockup during pfn range removal
When working with very large nodes, poisoning the struct pages (for which
there will be very many) can take a very long time.  If the system is
using voluntary preemptions, the software watchdog will not be able to
detect forward progress.  This patch addresses this issue by offering to
give up time like __remove_pages() does.  This behavior was introduced in
v5.6 with: commit d33695b16a ("mm/memory_hotplug: poison memmap in
remove_pfn_range_from_zone()")

Alternately, init_page_poison could do this cond_resched(), but it seems
to me that the caller of init_page_poison() is what actually knows whether
or not it should relax its own priority.

Based on Dan's notes, I think this is perfectly safe: commit f931ab479d
("mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done}")

Aside from fixing the lockup, it is also a friendlier thing to do on lower
core systems that might wipe out large chunks of hotplug memory (probably
not a very common case).

Fixes this kind of splat:

  watchdog: BUG: soft lockup - CPU#46 stuck for 22s! [daxctl:9922]
  irq event stamp: 138450
  hardirqs last  enabled at (138449): [<ffffffffa1001f26>] trace_hardirqs_on_thunk+0x1a/0x1c
  hardirqs last disabled at (138450): [<ffffffffa1001f42>] trace_hardirqs_off_thunk+0x1a/0x1c
  softirqs last  enabled at (138448): [<ffffffffa1e00347>] __do_softirq+0x347/0x456
  softirqs last disabled at (138443): [<ffffffffa10c416d>] irq_exit+0x7d/0xb0
  CPU: 46 PID: 9922 Comm: daxctl Not tainted 5.7.0-BEN-14238-g373c6049b336 #30
  Hardware name: Intel Corporation PURLEY/PURLEY, BIOS PLYXCRB1.86B.0578.D07.1902280810 02/28/2019
  RIP: 0010:memset_erms+0x9/0x10
  Code: c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 f3 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 <f3> aa 4c 89 c8 c3 90 49 89 fa 40 0f b6 ce 48 b8 01 01 01 01 01 01
  Call Trace:
   remove_pfn_range_from_zone+0x3a/0x380
   memunmap_pages+0x17f/0x280
   release_nodes+0x22a/0x260
   __device_release_driver+0x172/0x220
   device_driver_detach+0x3e/0xa0
   unbind_store+0x113/0x130
   kernfs_fop_write+0xdc/0x1c0
   vfs_write+0xde/0x1d0
   ksys_write+0x58/0xd0
   do_syscall_64+0x5a/0x120
   entry_SYSCALL_64_after_hwframe+0x49/0xb3
  Built 2 zonelists, mobility grouping on.  Total pages: 49050381
  Policy zone: Normal
  Built 3 zonelists, mobility grouping on.  Total pages: 49312525
  Policy zone: Normal

David said: "It really only is an issue for devmem.  Ordinary
hotplugged system memory is not affected (onlined/offlined in memory
block granularity)."

Link: http://lkml.kernel.org/r/20200619231213.1160351-1-ben.widawsky@intel.com
Fixes: commit d33695b16a ("mm/memory_hotplug: poison memmap in remove_pfn_range_from_zone()")
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: "Scargall, Steve" <steve.scargall@intel.com>
Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-26 00:27:38 -07:00
..
kasan mm: remove __ARCH_HAS_5LEVEL_HACK and include/asm-generic/5level-fixup.h 2020-06-04 19:06:21 -07:00
backing-dev.c bdi: remove the name field in struct backing_dev_info 2020-05-09 16:15:13 -06:00
balloon_compaction.c
cleancache.c
cma_debug.c mm/cma_debug.c: use DEFINE_DEBUGFS_ATTRIBUTE to define debugfs fops 2019-12-01 12:59:09 -08:00
cma.c mm: cma: NUMA node interface 2020-04-10 15:36:21 -07:00
cma.h
compaction.c mm, compaction: make capture control handling safe wrt interrupts 2020-06-26 00:27:36 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: fix build failure with powerpc 8xx 2020-06-26 00:27:37 -07:00
debug.c maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault 2020-06-17 10:57:41 -07:00
dmapool.c mm/dmapool.c: micro-optimisation remove unnecessary branch 2020-04-07 10:43:42 -07:00
early_ioremap.c mm/early_ioremap.c: use %pa to print resource_size_t variables 2020-01-31 10:30:38 -08:00
fadvise.c mm: return void from various readahead functions 2020-06-02 10:59:06 -07:00
failslab.c
filemap.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
frame_vector.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
frontswap.c mm/frontswap: fix some typos in frontswap.c 2020-06-04 19:06:24 -07:00
gup_benchmark.c mm/gup_benchmark: support pin_user_pages() and related calls 2020-04-02 09:35:27 -07:00
gup.c mm: Allow arches to provide ptep_get() 2020-06-20 22:14:53 +10:00
highmem.c mm, x86/mm: Untangle address space layout definitions from basic pgtable type definitions 2019-12-10 10:12:55 +01:00
hmm.c mmap locking API: add mmap_assert_locked() and mmap_assert_write_locked() 2020-06-09 09:39:14 -07:00
huge_memory.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
hugetlb_cgroup.c mm: use fallthrough; 2020-04-07 10:43:41 -07:00
hugetlb.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
hwpoison-inject.c mm/hwpoison-inject: use DEFINE_DEBUGFS_ATTRIBUTE to define debugfs fops 2019-12-01 12:59:09 -08:00
init-mm.c mmap locking API: add MMAP_LOCK_INITIALIZER 2020-06-09 09:39:14 -07:00
internal.h mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
interval_tree.c
Kconfig Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next 2020-06-07 17:25:29 -07:00
Kconfig.debug treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
khugepaged.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
kmemleak-test.c
kmemleak.c mm/kmemleak.c: use address-of operator on section symbols 2020-04-02 09:35:26 -07:00
ksm.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
list_lru.c mm/list_lru: fix a typo in comment "numbesr"->"numbers" 2020-06-04 19:06:24 -07:00
maccess.c maccess: rename probe_user_{read,write} to copy_{from,to}_user_nofault 2020-06-17 10:57:41 -07:00
madvise.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
Makefile The Kernel Concurrency Sanitizer (KCSAN) 2020-06-11 18:55:43 -07:00
mapping_dirty_helpers.c mm/mapping_dirty_helpers: update huge page-table entry callbacks 2020-04-02 09:35:29 -07:00
memblock.c mm/memblock: fix a typo in comment "implict"->"implicit" 2020-06-04 19:06:23 -07:00
memcontrol.c mm/memcontrol.c: prevent missed memory.low load tears 2020-06-26 00:27:37 -07:00
memfd.c
memory_hotplug.c mm/memory_hotplug.c: fix false softlockup during pfn range removal 2020-06-26 00:27:38 -07:00
memory-failure.c mm/memory-failure: send SIGBUS(BUS_MCEERR_AR) only to current thread 2020-06-11 18:17:47 -07:00
memory.c mm/memory: fix IO cost for anonymous page 2020-06-26 00:27:38 -07:00
mempolicy.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
mempool.c
memremap.c mm/memremap: set caching mode for PCI P2PDMA memory to WC 2020-04-10 15:36:21 -07:00
memtest.c
migrate.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
mincore.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
mlock.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
mm_init.c mm/mm_init.c: report kasan-tag information stored in page->flags 2020-06-02 10:59:12 -07:00
mmap.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
mmu_gather.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
mmu_notifier.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
mmzone.c
mprotect.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
mremap.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
msync.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
nommu.c mm: remove vmalloc_exec 2020-06-26 00:27:38 -07:00
oom_kill.c kernel: better document the use_mm/unuse_mm API contract 2020-06-10 19:14:18 -07:00
page_alloc.c virtio: features, fixes 2020-06-10 13:42:09 -07:00
page_counter.c mm, memcg: prevent memory.min load/store tearing 2020-04-02 09:35:29 -07:00
page_ext.c mm/page_ext.c: drop pfn_present() check when onlining 2020-04-07 10:43:40 -07:00
page_idle.c mm/page_idle.c: skip offline pages 2020-06-08 11:05:55 -07:00
page_io.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
page_isolation.c mm: Allow to offline unmovable PageOffline() pages via MEM_GOING_OFFLINE 2020-06-04 15:36:52 -04:00
page_owner.c mm: rename gfpflags_to_migratetype to gfp_migratetype for same convention 2020-06-03 20:09:45 -07:00
page_poison.c
page_reporting.c mm/page_reporting: add budget limit on how many pages can be reported per pass 2020-04-07 10:43:39 -07:00
page_reporting.h mm: introduce include/linux/pgtable.h 2020-06-09 09:39:13 -07:00
page_vma_mapped.c mm/page_vma_mapped.c: explicitly compare pfn for normal, hugetlbfs and THP page 2020-01-31 10:30:38 -08:00
page-writeback.c mm/page-writeback: fix a typo in comment "effictive"->"effective" 2020-06-04 19:06:24 -07:00
pagewalk.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
percpu-internal.h
percpu-km.c
percpu-stats.c percpu: update copyright emails to dennis@kernel.org 2020-04-01 10:09:12 -07:00
percpu-vm.c
percpu.c mm: remove the pgprot argument to __vmalloc 2020-06-02 10:59:11 -07:00
pgtable-generic.c mm: introduce include/linux/pgtable.h 2020-06-09 09:39:13 -07:00
process_vm_access.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
ptdump.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
readahead.c mm: use memalloc_nofs_save in readahead path 2020-06-02 10:59:07 -07:00
rmap.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
rodata_test.c maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault 2020-06-17 10:57:41 -07:00
shmem.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
shuffle.c mm: adjust shuffle code to allow for future coalescing 2020-04-07 10:43:38 -07:00
shuffle.h mm: adjust shuffle code to allow for future coalescing 2020-04-07 10:43:38 -07:00
slab_common.c mm/slab: use memzero_explicit() in kzfree() 2020-06-26 00:27:37 -07:00
slab.c mm/page_alloc: integrate classzone_idx and high_zoneidx 2020-06-03 20:09:44 -07:00
slab.h mm, slab: fix sign conversion problem in memcg_uncharge_slab() 2020-06-26 00:27:37 -07:00
slob.c mm/sl[uo]b: export __kmalloc_track(_node)_caller 2020-03-26 14:45:51 +01:00
slub.c slub: cure list_slab_objects() from double fix 2020-06-26 00:27:37 -07:00
sparse-vmemmap.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
sparse.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
swap_cgroup.c mm: memcontrol: make swap tracking an integral part of memory control 2020-06-03 20:09:48 -07:00
swap_slots.c mm/swap_slots.c: assign|reset cache slot by value directly 2020-04-02 09:35:27 -07:00
swap_state.c mm: fix swap cache node allocation mask 2020-06-26 00:27:37 -07:00
swap.c mm/swap: fix for "mm: workingset: age nonresident information alongside anonymous pages" 2020-06-26 00:27:38 -07:00
swapfile.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
truncate.c mm/thp: allow dropping THP from page cache 2019-10-19 06:32:33 -04:00
usercopy.c
userfaultfd.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
util.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
vmacache.c kernel: better document the use_mm/unuse_mm API contract 2020-06-10 19:14:18 -07:00
vmalloc.c mm: remove vmalloc_exec 2020-06-26 00:27:38 -07:00
vmpressure.c mm: vmpressure: use mem_cgroup_is_root API 2020-04-02 09:35:31 -07:00
vmscan.c mm: workingset: age nonresident information alongside anonymous pages 2020-06-26 00:27:37 -07:00
vmstat.c mm/vmstat.c: convert to use DEFINE_SEQ_ATTRIBUTE macro 2020-06-04 19:06:26 -07:00
workingset.c mm: workingset: age nonresident information alongside anonymous pages 2020-06-26 00:27:37 -07:00
z3fold.c mm/z3fold: silence kmemleak false positives of slots 2020-05-28 11:35:40 -07:00
zbud.c mm: use false for bool variable 2020-06-04 19:06:24 -07:00
zpool.c
zsmalloc.c mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00
zswap.c mm/zswap: allow setting default status, compressor and allocator in Kconfig 2020-04-07 10:43:41 -07:00