Anshuman Khandual
91d4ce985f
sparc/mm: enable ARCH_HAS_VM_GET_PAGE_PROT
...
This defines and exports a platform specific custom vm_get_page_prot() via
subscribing ARCH_HAS_VM_GET_PAGE_PROT. It localizes
arch_vm_get_page_prot() as sparc_vm_get_page_prot() and moves it near
vm_get_page_prot().
Link: https://lkml.kernel.org/r/20220414062125.609297-5-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:13 -07:00
Anshuman Khandual
b3aca728fb
arm64/mm: enable ARCH_HAS_VM_GET_PAGE_PROT
...
This defines and exports a platform specific custom vm_get_page_prot() via
subscribing ARCH_HAS_VM_GET_PAGE_PROT. It localizes arch_vm_get_page_prot()
and moves it near vm_get_page_prot().
Link: https://lkml.kernel.org/r/20220414062125.609297-4-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:13 -07:00
Anshuman Khandual
634093c59a
powerpc/mm: enable ARCH_HAS_VM_GET_PAGE_PROT
...
This defines and exports a platform specific custom vm_get_page_prot() via
subscribing ARCH_HAS_VM_GET_PAGE_PROT. While here, this also localizes
arch_vm_get_page_prot() as __vm_get_page_prot() and moves it near
vm_get_page_prot().
Link: https://lkml.kernel.org/r/20220414062125.609297-3-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:13 -07:00
Anshuman Khandual
67436193c2
mm/mmap: add new config ARCH_HAS_VM_GET_PAGE_PROT
...
Patch series "mm/mmap: Drop arch_vm_get_page_prot() and arch_filter_pgprot()", v7.
protection_map[] is an array based construct that translates a given
vm_flags combination into a page protection value. The array is
populated by the platform via the exported [__S000 .. __S111] and
[__P000 .. __P111] macros. The primary user of protection_map[] is
vm_get_page_prot(), which determines the page protection value for a
given vm_flags. The vm_get_page_prot() implementation can in turn call
the platform overrides arch_vm_get_page_prot() and
arch_filter_pgprot(). Some platforms also override the protection_map[]
entries originally built from __SXXX/__PXXX with different runtime
values.
Currently there are multiple layers of abstraction, i.e. the
__SXXX/__PXXX macros, protection_map[], arch_vm_get_page_prot() and
arch_filter_pgprot(), built between the platform and generic MM, which
together define vm_get_page_prot().
Hence this series proposes to drop the latter two abstraction levels
and instead move the responsibility of defining vm_get_page_prot() to
the platform itself (still utilizing the generic protection_map[]
array), making it clean and simple.
This first introduces ARCH_HAS_VM_GET_PAGE_PROT, which enables
platforms to define a custom vm_get_page_prot(). It then starts
converting the platforms that define the arch_filter_pgprot() or
arch_vm_get_page_prot() overrides, which allows those constructs to be
dropped completely.
The series was inspired by an earlier discussion with Christoph Hellwig:
https://lore.kernel.org/all/1632712920-8171-1-git-send-email-anshuman.khandual@arm.com/
This patch (of 7):
Add a new config ARCH_HAS_VM_GET_PAGE_PROT, which when subscribed enables
a given platform to define its own vm_get_page_prot() but still utilizing
the generic protection_map[] array.
Link: https://lkml.kernel.org/r/20220414062125.609297-1-anshuman.khandual@arm.com
Link: https://lkml.kernel.org/r/20220414062125.609297-2-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:12 -07:00
Miaohe Lin
c5d8a3643d
mm/mmap.c: use helper mlock_future_check()
...
Use the helper mlock_future_check() to check whether it is safe to
enlarge locked_vm, simplifying the code. Minor readability improvement.
Link: https://lkml.kernel.org/r/20220402032231.64974-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:12 -07:00
Anshuman Khandual
6c862bd059
mm/mmap: clarify protection_map[] indices
...
protection_map[] maps vm_flags access combinations to the page
protection values defined by the platform via the __PXXX and __SXXX
macros. The array indices in protection_map[] represent vm_flags access
combinations, but they are not very intuitive to derive. Make them
clear and explicit.
Link: https://lkml.kernel.org/r/20220404031840.588321-3-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:12 -07:00
Anshuman Khandual
31d17076b0
mm/debug_vm_pgtable: drop protection_map[] usage
...
Patch series "mm: protection_map[] cleanups".
This patch (of 2):
Although protection_map[] contains the platform-defined page protection
map for a given vm_flags combination, vm_get_page_prot() is the right
interface to use. This also reduces the dependency on protection_map[],
which is going to be dropped completely later on.
Link: https://lkml.kernel.org/r/20220404031840.588321-1-anshuman.khandual@arm.com
Link: https://lkml.kernel.org/r/20220404031840.588321-2-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:12 -07:00
Jianxing Wang
b191c9bc33
mm/mmu_gather: limit free batch count and add schedule point in tlb_batch_pages_flush
...
Freeing a large list of pages can cause rcu_sched starvation on
non-preemptible kernels. However, free_unref_page_list() cannot simply
cond_resched(), as it may be called in interrupt or atomic context; in
particular, atomic context cannot be detected with CONFIG_PREEMPTION=n.
The issue was detected in a guest with KVM CPU 200% overcommit; I did
not see the warning on the host running the same application. I am sure
the patch is needed for the guest kernel, but not sure about the host.
To reproduce, set up two virtual machines on one host, each VM with the
same number of CPUs as the host and half its memory, then run
ltpstress.sh in each VM; an RCU stall warning will appear. The kernel
must have preemption disabled; append 'preempt=none' to the kernel
command line if dynamic preemption is enabled. It could be detected on
a Loongson machine (32 cores, 128G memory) and a ProLiant DL380 Gen9
(x86 E5-2680, 28 cores, 64G memory).
The TLB flush batch count depends on PAGE_SIZE and is too large if
PAGE_SIZE > 4K, so limit the free batch count to 512, and add a
scheduling point in tlb_batch_pages_flush().
rcu: rcu_sched kthread starved for 5359 jiffies! g454793 f0x0
RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=19
[...]
Call Trace:
free_unref_page_list+0x19c/0x270
release_pages+0x3cc/0x498
tlb_flush_mmu_free+0x44/0x70
zap_pte_range+0x450/0x738
unmap_page_range+0x108/0x240
unmap_vmas+0x74/0xf0
unmap_region+0xb0/0x120
do_munmap+0x264/0x438
vm_munmap+0x58/0xa0
sys_munmap+0x10/0x20
syscall_common+0x24/0x38
Link: https://lkml.kernel.org/r/20220317072857.2635262-1-wangjianxing@loongson.cn
Signed-off-by: Jianxing Wang <wangjianxing@loongson.cn>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:12 -07:00
Rolf Eike Beer
325bca1fe0
mm/mmap.c: use mmap_assert_write_locked() instead of open coding it
...
In case the lock is actually not held at this point.
Link: https://lkml.kernel.org/r/5827758.TJ1SttVevJ@mobilepool36.emlix.com
Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:11 -07:00
Axel Rasmussen
241ec63a9a
selftests: vm: fix shellcheck warnings in run_vmtests.sh
...
These might not be issues yet, but they make the script more fragile.
Also by fixing them we give a better example to future readers, who might
copy/paste or otherwise re-use snippets from our script.
- Use "read -r", since we don't ever want read to be interpreting '\'
characters as escape sequences...
- Quote variables, to deal with spaces properly.
- Use $() instead of the older and harder-to-nest ``.
- Get rid of superfluous "$" prefixes inside arithmetic $(()).
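Illustrative snippets of the four fixes (hypothetical examples, not
lines taken from run_vmtests.sh itself):

```shell
# 1) read -r: keep backslashes literal instead of treating them as escapes.
line=$(printf 'a\\b\n' | { IFS= read -r l; printf '%s' "$l"; })

# 2) Quote variables so values with spaces are not word-split.
dir="my test dir"
set -- "$dir"      # one argument, not three
nargs=$#

# 3) $() instead of backticks: nests without awkward escaping.
inner=$(echo "$(echo nested)")

# 4) No "$" prefix needed on names inside arithmetic $(( )).
count=4
total=$((count + 1))
```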
Link: https://lkml.kernel.org/r/20220421224928.1848230-2-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:11 -07:00
Axel Rasmussen
b67bd55120
selftests: vm: refactor run_vmtests.sh to reduce boilerplate
...
Previously, each test printed out its own header, dealt with its own
return code, etc. By just putting this standard stuff in a function, we
can delete > 300 lines from the script.
This also makes adding future tests easier. And, it gets rid of various
inconsistencies that already exist:
- Some tests correctly deal with ksft_skip, but others don't.
- Some tests just print the executable name, others print arguments, and
yet others print some comment in the header.
- Most tests print out a header with two separator lines, but not the
HMM smoke test or the memfd_secret test, which only print one.
- We had a redundant "exit" at the end; with all the boilerplate, it
was an easy oversight.
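A minimal sketch of such a helper (the name run_test, the ksft_skip
value and the exact output format are assumptions for illustration, not
necessarily what the script uses):

```shell
ksft_skip=4
exitcode=0

# One helper owns the header, the execution, and the
# pass/fail/skip bookkeeping for every test.
run_test() {
    echo "-------------------"
    echo "running: $*"
    echo "-------------------"
    "$@" && ret=0 || ret=$?
    if [ "$ret" -eq 0 ]; then
        echo "[PASS]"
    elif [ "$ret" -eq "$ksft_skip" ]; then
        echo "[SKIP]"
    else
        echo "[FAIL]"
        exitcode=1
    fi
}

run_test true                # passes
run_test sh -c 'exit 4'      # skipped (ksft_skip)
run_test false               # fails, sets exitcode
```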
Link: https://lkml.kernel.org/r/20220421224928.1848230-1-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:11 -07:00
Gabriel Krisman Bertazi
9f3265db6a
selftests: vm: add test for Soft-Dirty PTE bit
...
This introduces three tests:
1) Sanity check soft dirty basic semantics: allocate area, clean,
dirty, check if the SD bit is flipped.
2) Check VMA reuse: validate the VM_SOFTDIRTY usage
3) Check soft-dirty on huge pages
This was motivated by Will Deacon's fix commit 912efa17e5 ("mm: proc:
Invalidate TLB after clearing soft-dirty page state"). I was tracking the
same issue that he fixed, and this test would have caught it.
Link: https://lkml.kernel.org/r/20220420084036.4101604-2-usama.anjum@collabora.com
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Will Deacon <will@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:11 -07:00
Muhammad Usama Anjum
642bc52aed
selftests: vm: bring common functions to a new file
...
Bring common functions to a new file while keeping code as much same as
possible. These functions can be used in the new tests. This helps in
avoiding code duplication.
Link: https://lkml.kernel.org/r/20220420084036.4101604-1-usama.anjum@collabora.com
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Gabriel Krisman Bertazi <krisman@collabora.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:11 -07:00
Sidhartha Kumar
62e80f2b50
tools/testing/selftests/vm/gup_test.c: clarify error statement
...
Print three possible reasons /sys/kernel/debug/gup_test cannot be opened
to help users of this test diagnose failures.
Link: https://lkml.kernel.org/r/20220405214809.3351223-1-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:10 -07:00
Muchun Song
0e5e64c0b0
mm: simplify follow_invalidate_pte()
...
The only user (DAX) of the range and pmdpp parameters of
follow_invalidate_pte() is gone, so it is safe to remove them and make
it static to simplify the code. This effectively reverts the following
commits:
0979639595 ("mm: add follow_pte_pmd()")
a4d1a88525 ("dax: update to new mmu_notifier semantic")
There is now only one caller of follow_invalidate_pte(), so just fold
it into follow_pte() and remove it.
Link: https://lkml.kernel.org/r/20220403053957.10770-7-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:10 -07:00
Muchun Song
06083a0921
dax: fix missing writeprotect the pte entry
...
Currently dax_mapping_entry_mkclean() fails to clean and write protect the
pte entry within a DAX PMD entry during an *sync operation. This can
result in data loss in the following sequence:
1) process A mmap write to DAX PMD, dirtying PMD radix tree entry and
making the pmd entry dirty and writeable.
2) process B mmap with the @offset (e.g. 4K) and @length (e.g. 4K)
write to the same file, dirtying PMD radix tree entry (already
done in 1)) and making the pte entry dirty and writeable.
3) fsync, flushing out PMD data and cleaning the radix tree entry. We
currently fail to mark the pte entry as clean and write protected
since the vma of process B is not covered in dax_entry_mkclean().
4) process B writes to the pte. These don't cause any page faults since
the pte entry is dirty and writeable. The radix tree entry remains
clean.
5) fsync, which fails to flush the dirty PMD data because the radix tree
entry was clean.
6) crash - dirty data that should have been fsync'd as part of 5) could
still have been in the processor cache, and is lost.
Use pfn_mkclean_range() to clean the pfns, fixing this issue.
Link: https://lkml.kernel.org/r/20220403053957.10770-6-songmuchun@bytedance.com
Fixes: 4b4bb46d00 ("dax: clear dirty entry tags on cache flush")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:10 -07:00
Muchun Song
6472f6d2f7
mm: pvmw: add support for walking devmap pages
...
page_vma_mapped_walk() currently cannot be used to check whether a huge
devmap page is mapped into a vma. Add support for walking huge devmap
pages so that DAX can use it in the next patch.
Link: https://lkml.kernel.org/r/20220403053957.10770-5-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:10 -07:00
Muchun Song
6a8e0596f0
mm: rmap: introduce pfn_mkclean_range() to clean PTEs
...
page_mkclean_one() is supposed to be used with a pfn that has an
associated struct page, but not all pfns (e.g. DAX) have one.
Introduce a new function, pfn_mkclean_range(), to clean the PTEs
(including PMDs) mapping a range of pfns that have no struct page
associated with them. This helper will be used by the DAX device in
the next patch to make pfns clean.
Link: https://lkml.kernel.org/r/20220403053957.10770-4-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:10 -07:00
Muchun Song
e583b5c472
dax: fix cache flush on PMD-mapped pages
...
flush_cache_page() only removes a PAGE_SIZE sized range from the cache;
it does not cover the full pages of a THP except the head page.
Replace it with flush_cache_range() to fix this issue. This is just a
documentation issue with respect to properly documenting the expected
usage of cache flushing before modifying the pmd. In practice this is
not a problem, because DAX is not available on architectures with
virtually indexed caches, per:
commit d92576f116 ("dax: does not work correctly with virtual aliasing caches")
Link: https://lkml.kernel.org/r/20220403053957.10770-3-songmuchun@bytedance.com
Fixes: f729c8c9b2 ("dax: wrprotect pmd_t in dax_mapping_entry_mkclean")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:09 -07:00
Muchun Song
7f9c9b607d
mm: rmap: fix cache flush on THP pages
...
Patch series "Fix some bugs related to rmap and dax", v7.
Patches 1-2 fix a cache flush bug; because subsequent patches depend on
those changes, they are placed in this series. Patches 3-4 are
preparation for fixing a dax bug in patch 5. Patch 6 is a code cleanup,
since the previous patch removes the usage of follow_invalidate_pte().
This patch (of 6):
flush_cache_page() only removes a PAGE_SIZE sized range from the cache;
it does not cover the full pages of a THP except the head page.
Replace it with flush_cache_range() to fix this issue. At least, no
problems have been found due to this, perhaps because few architectures
have virtually indexed caches.
Link: https://lkml.kernel.org/r/20220403053957.10770-1-songmuchun@bytedance.com
Link: https://lkml.kernel.org/r/20220403053957.10770-2-songmuchun@bytedance.com
Fixes: f27176cfc3 ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:09 -07:00
Miaohe Lin
f3b9e8cc8b
mm/madvise: fix potential pte_unmap_unlock pte error
...
We can't assume pte_offset_map_lock() will return the same orig_pte
value, so it is necessary to reacquire orig_pte, or pte_unmap_unlock()
will unmap the stale pte.
Link: https://lkml.kernel.org/r/20220416081416.23304-1-linmiaohe@huawei.com
Fixes: 9c276cc65a ("mm: introduce MADV_COLD")
Fixes: 854e9ed09d ("mm: support madvise(MADV_FREE)")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:09 -07:00
Oscar Salvador
7d6e2d9638
mm: untangle config dependencies for demote-on-reclaim
...
At the time demote-on-reclaim was introduced, it was tied to
CONFIG_HOTPLUG_CPU + CONFIG_MIGRATION, but that is not really accurate.
The only two things we need to depend on are CONFIG_NUMA +
CONFIG_MIGRATION, so clean this up. Furthermore, we only register the
hotplug memory notifier when the system has CONFIG_MEMORY_HOTPLUG.
Link: https://lkml.kernel.org/r/20220322224016.4574-1-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Suggested-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Abhishek Goel <huntbag@linux.vnet.ibm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:09 -07:00
Baolin Wang
9c42fe4e30
mm: migrate: simplify the refcount validation when migrating hugetlb mapping
...
There is no need to validate the hugetlb page's refcount before trying
to freeze it to the expected value; we can just rely on
page_ref_freeze() to do the validation, simplifying the code.
Moreover, we are always under the page lock when migrating the hugetlb
page mapping, which means nothing else can remove it from the page
cache, so we can remove the xas_load() validation under the i_pages
lock.
Link: https://lkml.kernel.org/r/eb2fbbeaef2b1714097b9dec457426d682ee0635.1649676424.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:09 -07:00
Miaohe Lin
4cd614841c
mm/migration: fix possible do_pages_stat_array racing with memory offline
...
When follow_page() peeks at a page, the page could be migrated and then
offlined while it is still being used by do_pages_stat_array(). Use
FOLL_GET to hold the page refcount to fix this potential race.
Link: https://lkml.kernel.org/r/20220318111709.60311-12-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:08 -07:00
Miaohe Lin
3f26c88bd6
mm/migration: fix potential invalid node access for reclaim-based migration
...
If we fail to set up the hotplug state callbacks for mm/demotion:online
in some corner cases, node_demotion will be left uninitialized, and an
invalid node might be returned from next_demotion_node() when doing
reclaim-based migration. Use kcalloc to allocate node_demotion to fix
the issue.
Link: https://lkml.kernel.org/r/20220318111709.60311-11-linmiaohe@huawei.com
Fixes: ac16ec8353 ("mm: migrate: support multiple target nodes demotion")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:08 -07:00
Miaohe Lin
69a041ff50
mm/migration: fix potential page refcounts leak in migrate_pages
...
In the -ENOMEM case, there might be some subpages of failed-to-migrate
THPs left in the thp_split_pages list. We should move them back to the
migration list so that they can be put back on the right list by the
caller; otherwise the page refcounts will be leaked. Also adjust
nr_failed and nr_thp_failed accordingly to make the vm events
accounting more accurate.
Link: https://lkml.kernel.org/r/20220318111709.60311-10-linmiaohe@huawei.com
Fixes: b5bade978e ("mm: migrate: fix the return value of migrate_pages()")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:08 -07:00
Miaohe Lin
f430893b01
mm/migration: remove some duplicated codes in migrate_pages
...
Remove the duplicated code in migrate_pages() to simplify it. Minor
readability improvement. No functional change intended.
Link: https://lkml.kernel.org/r/20220318111709.60311-9-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:08 -07:00
Miaohe Lin
91925ab8cc
mm/migration: avoid unneeded nodemask_t initialization
...
Avoid unneeded next_pass and this_pass initialization, as they are
always set before use, to save possible cpu cycles when there are
plenty of nodes in the system.
Link: https://lkml.kernel.org/r/20220318111709.60311-8-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:08 -07:00
Miaohe Lin
3eefb826c5
mm/migration: use helper macro min in do_pages_stat
...
We can use the helper macro min to set chunk_nr and simplify the code.
Link: https://lkml.kernel.org/r/20220318111709.60311-7-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:07 -07:00
Miaohe Lin
cb1c37b1c6
mm/migration: use helper function vma_lookup() in add_page_for_migration
...
We can use the helper function vma_lookup() to look up the needed vma
and simplify the code.
Link: https://lkml.kernel.org/r/20220318111709.60311-6-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:07 -07:00
Miaohe Lin
b75454e101
mm/migration: remove unneeded local variable page_lru
...
We can use page_is_file_lru() directly to help account the isolated pages
to simplify the code a bit.
Link: https://lkml.kernel.org/r/20220318111709.60311-4-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:07 -07:00
Miaohe Lin
5202978b48
mm/migration: remove unneeded local variable mapping_locked
...
Patch series "A few cleanup and fixup patches for migration", v2.
This series contains a few patches to remove unneeded variables and a
jump label, and to use helpers to simplify the code. We also fix some
bugs such as a page refcounts leak, invalid node access and so on.
More details can be found in the respective changelogs.
This patch (of 11):
When mapping_locked is true, TTU_RMAP_LOCKED is always set in ttu. We
can check ttu instead, so mapping_locked can be removed. And ttu is now
either 0 or TTU_RMAP_LOCKED; change '|=' to '=' to reflect this.
Link: https://lkml.kernel.org/r/20220318111709.60311-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20220318111709.60311-2-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-04-28 23:16:07 -07:00
Alistair Popple
0c2d087284
mm: add selftests for migration entries
...
Add some basic migration tests and in particular tests that will
stress both the pte and pmd migration entry wait paths.
Link: https://lkml.kernel.org/r/20220324014349.229253-1-apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com >
Cc: Hugh Dickins <hughd@google.com >
Cc: Jan Kara <jack@suse.cz >
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com >
Cc: Matthew Wilcox (Oracle) <willy@infradead.org >
Cc: Ralph Campbell <rcampbell@nvidia.com >
Cc: Muchun Song <songmuchun@bytedance.com >
Cc: John Hubbard <jhubbard@nvidia.com >
Cc: Thomas Gleixner <tglx@linutronix.de >
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de >
Cc: Shuah Khan <shuah@kernel.org >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:07 -07:00
Miaohe Lin
bc78b5ed9f
mm/mempolicy: clean up the code logic in queue_pages_pte_range
...
Since commit e5947d23ed ("mm: mempolicy: don't have to split pmd for
huge zero page"), THP is never split in queue_pages_pmd(), so 2 is never
returned now. Remove the unnecessary ret != 2 check and clean up the
relevant comment. Minor improvements in readability.
Link: https://lkml.kernel.org/r/20220419122234.45083-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com >
Reviewed-by: Yang Shi <shy828301@gmail.com >
Acked-by: David Rientjes <rientjes@google.com >
Cc: Zi Yan <ziy@nvidia.com >
Cc: Michal Hocko <mhocko@suse.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:06 -07:00
Miaohe Lin
da63dc84be
drivers/base/node.c: fix compaction sysfs file leak
...
The compaction sysfs file is created via compaction_register_node() in
register_node(), but we forgot to remove it in unregister_node(), so the
file is leaked. Use compaction_unregister_node() to fix this.
Link: https://lkml.kernel.org/r/20220401070905.43679-1-linmiaohe@huawei.com
Fixes: ed4a6d7f06 ("mm: compaction: add /sys trigger for per-node memory compaction")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com >
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org >
Cc: Rafael J. Wysocki <rafael@kernel.org >
Cc: Mel Gorman <mel@csn.ul.ie >
Cc: Minchan Kim <minchan.kim@gmail.com >
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com >
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:06 -07:00
Miaohe Lin
4af12d04e7
mm: compaction: use helper isolation_suitable()
...
Use the helper isolation_suitable() to check whether a page is suitable
to isolate, simplifying the code. Minor readability improvement.
Link: https://lkml.kernel.org/r/20220322110750.60311-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com >
Reviewed-by: Andrew Morton <akpm@linux-foundation.org >
Reviewed-by: Wei Yang <richard.weiyang@gmail.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:06 -07:00
Miaohe Lin
daf79bd8ee
mm/z3fold: remove unneeded PAGE_HEADLESS check in free_handle()
...
The only caller, z3fold_free(), never calls free_handle() in the
PAGE_HEADLESS case. Remove this unneeded check.
Link: https://lkml.kernel.org/r/20220308134311.59086-9-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com >
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:06 -07:00
Miaohe Lin
52fb90cc19
mm/z3fold: remove redundant list_del_init of zhdr->buddy in z3fold_free
...
do_compact_page() will do list_del_init(&zhdr->buddy) for us. Remove this
extra one to save some possible cpu cycles.
Link: https://lkml.kernel.org/r/20220308134311.59086-8-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com >
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:06 -07:00
Miaohe Lin
5e36c25b2c
mm/z3fold: move decrement of pool->pages_nr into __release_z3fold_page()
...
z3fold always does atomic64_dec(&pool->pages_nr) when
__release_z3fold_page() is called, so we can move the decrement of
pool->pages_nr into __release_z3fold_page() to simplify the code.
This also reduces the size of z3fold.o by ~1k.
Without this patch:
text data bss dec hex filename
15444 1376 8 16828 41bc mm/z3fold.o
With this patch:
text data bss dec hex filename
15044 1248 8 16300 3fac mm/z3fold.o
Link: https://lkml.kernel.org/r/20220308134311.59086-7-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com >
Cc: Vitaly Wool <vitaly.wool@konsulko.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:05 -07:00
Miaohe Lin
a3148b5fea
mm/z3fold: remove confusing local variable l reassignment
...
The local variable l holds the address of unbuddied[i] which won't change
after we take the pool lock. Remove it to avoid confusion.
Link: https://lkml.kernel.org/r/20220308134311.59086-6-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com >
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:05 -07:00
Miaohe Lin
8ea2f86cea
mm/z3fold: remove unneeded page_mapcount_reset and ClearPagePrivate
...
page->page_type and PagePrivate are not used in z3fold, so remove these
confusing, unneeded operations. z3fold only did them because it was
modeled on zsmalloc's migration code, which does need them.
Link: https://lkml.kernel.org/r/20220308134311.59086-5-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com >
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:05 -07:00
Miaohe Lin
ed0e5dcab3
mm/z3fold: minor clean up for z3fold_free
...
Use put_z3fold_header() to pair with get_z3fold_header(). Also fix the
incorrect comments. Minor readability improvement.
Link: https://lkml.kernel.org/r/20220308134311.59086-4-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com >
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:05 -07:00
Miaohe Lin
78da57d401
mm/z3fold: remove obsolete comment in z3fold_alloc
...
The highmem pages are supported since commit f1549cb5ab ("mm/z3fold.c:
allow __GFP_HIGHMEM in z3fold_alloc"). Remove the residual comment.
Link: https://lkml.kernel.org/r/20220308134311.59086-3-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com >
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:05 -07:00
Miaohe Lin
dc3a1f3024
mm/z3fold: declare z3fold_mount with __init
...
Patch series "A few cleanup patches for z3fold", v2.
This series contains a few patches to simplify the code, remove unneeded
code, fix obsolete comments, and so on. More details can be found in the
respective changelogs.
This patch (of 8):
z3fold_mount() is only called during init, so declare it with __init.
Link: https://lkml.kernel.org/r/20220308134311.59086-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20220308134311.59086-2-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com >
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:05 -07:00
Xianting Tian
c310e06cc4
fs/proc/task_mmu.c: remove redundant page validation of pte_page
...
pte_page() always returns a valid page, so remove the redundant page
validation, as we did in many other places.
Link: https://lkml.kernel.org/r/20220316025947.328276-1-xianting.tian@linux.alibaba.com
Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com >
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com >
Cc: Alexey Dobriyan <adobriyan@gmail.com >
Cc: Yang Shi <shy828301@gmail.com >
Cc: Sasha Levin <sashal@kernel.org >
Cc: Miaohe Lin <linmiaohe@huawei.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:04 -07:00
Miaohe Lin
b2cb6826b6
mm/vmscan: fix comment for isolate_lru_pages
...
Since commit 791b48b642 ("mm: vmscan: scan until it finds eligible
pages"), splicing any skipped pages to the tail of the LRU list won't put
the system at risk of premature OOM but will waste lots of cpu cycles.
Correct the comment accordingly.
Link: https://lkml.kernel.org/r/20220416025231.8082-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com >
Cc: Minchan Kim <minchan@kernel.org >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:04 -07:00
Miaohe Lin
5829f7dbae
mm/vmscan: fix comment for current_may_throttle
...
Since commit 6d6435811c19 ("remove bdi_congested() and wb_congested() and
related functions"), there is no congested backing device check anymore.
Correct the comment accordingly.
[akpm@linux-foundation.org: tweak grammar]
Link: https://lkml.kernel.org/r/20220414120202.30082-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:04 -07:00
Miaohe Lin
02e458d8d0
mm/vmscan: remove obsolete comment in get_scan_count
...
Since commit 1431d4d11a ("mm: base LRU balancing on an explicit cost
model"), the relative value of each set of LRU lists is based on a cost
model instead of the rotated/scanned ratio. Clean up the relevant comment.
Link: https://lkml.kernel.org/r/20220409030245.61211-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:04 -07:00
Wei Yang
8b3a899abe
mm/vmscan: sc->reclaim_idx must be a valid zone index
...
lruvec_lru_size() is only used in get_scan_count(), so the only possible
zone_idx is sc->reclaim_idx. Since sc->reclaim_idx is ensured to be a
valid zone index, we can remove the extra check for zone iteration.
Link: https://lkml.kernel.org/r/20220317234624.23358-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com >
Cc: Johannes Weiner <hannes@cmpxchg.org >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:04 -07:00
Wei Yang
bc53008eea
mm/vmscan: make sure wakeup_kswapd with managed zone
...
wakeup_kswapd() only wakes up kswapd when the zone is managed.
Two callers of wakeup_kswapd() work from the node perspective:
* wake_all_kswapds
* numamigrate_isolate_page
If they pick up a !managed zone, that is not what we expect.
This patch makes sure we pick up a managed zone for wakeup_kswapd(). It
also uses managed_zone() in migrate_balanced_pgdat() to get the proper
zone.
[richard.weiyang@gmail.com: adjust the usage in migrate_balanced_pgdat()]
Link: https://lkml.kernel.org/r/20220329010901.1654-2-richard.weiyang@gmail.com
Link: https://lkml.kernel.org/r/20220327024101.10378-2-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com >
Reviewed-by: "Huang, Ying" <ying.huang@intel.com >
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com >
Reviewed-by: David Hildenbrand <david@redhat.com >
Cc: Mel Gorman <mgorman@techsingularity.net >
Cc: Oscar Salvador <osalvador@suse.de >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2022-04-28 23:16:03 -07:00