-----BEGIN PGP SIGNATURE-----

iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZkgQYwAKCRDdBJ7gKXxA
jrdKAP9WVJdpEcXxpoub/vVE0UWGtffr8foifi9bCwrQrGh5mgEAx7Yf0+d/oBZB
nvA4E0DcPrUAFy144FNM0NTCb7u9vAw=
=V3R/
-----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull mm updates from Andrew Morton:
 "The usual shower of singleton fixes and minor series all over MM,
  documented (hopefully adequately) in the respective changelogs. Notable series include:

  - Lucas Stach has provided some page-mapping cleanup/consolidation/maintainability work in the series "mm/treewide: Remove pXd_huge() API".
- In the series "Allow migrate on protnone reference with MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's MPOL_PREFERRED_MANY mode, yielding almost doubled performance in one test. - In their series "Memory allocation profiling" Kent Overstreet and Suren Baghdasaryan have contributed a means of determining (via /proc/allocinfo) whereabouts in the kernel memory is being allocated: number of calls and amount of memory. - Matthew Wilcox has provided the series "Various significant MM patches" which does a number of rather unrelated things, but in largely similar code sites. - In his series "mm: page_alloc: freelist migratetype hygiene" Johannes Weiner has fixed the page allocator's handling of migratetype requests, with resulting improvements in compaction efficiency. - In the series "make the hugetlb migration strategy consistent" Baolin Wang has fixed a hugetlb migration issue, which should improve hugetlb allocation reliability. - Liu Shixin has hit an I/O meltdown caused by readahead in a memory-tight memcg. Addressed in the series "Fix I/O high when memory almost met memcg limit". - In the series "mm/filemap: optimize folio adding and splitting" Kairui Song has optimized pagecache insertion, yielding ~10% performance improvement in one test. - Baoquan He has cleaned up and consolidated the early zone initialization code in the series "mm/mm_init.c: refactor free_area_init_core()". - Baoquan has also redone some MM initializatio code in the series "mm/init: minor clean up and improvement". - MM helper cleanups from Christoph Hellwig in his series "remove follow_pfn". - More cleanups from Matthew Wilcox in the series "Various page->flags cleanups". - Vlastimil Babka has contributed maintainability improvements in the series "memcg_kmem hooks refactoring". - More folio conversions and cleanups in Matthew Wilcox's series: "Convert huge_zero_page to huge_zero_folio" "khugepaged folio conversions" "Remove page_idle and page_young wrappers" "Use folio APIs in procfs" "Clean up __folio_put()" "Some cleanups for memory-failure" "Remove page_mapping()" "More folio compat code removal" - David Hildenbrand chipped in with "fs/proc/task_mmu: convert hugetlb functions to work on folis". - Code consolidation and cleanup work related to GUP's handling of hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2". - Rick Edgecombe has developed some fixes to stack guard gaps in the series "Cover a guard gap corner case". - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the series "mm/ksm: fix ksm exec support for prctl". - Baolin Wang has implemented NUMA balancing for multi-size THPs. This is a simple first-cut implementation for now. The series is "support multi-size THP numa balancing". - Cleanups to vma handling helper functions from Matthew Wilcox in the series "Unify vma_address and vma_pgoff_address". - Some selftests maintenance work from Dev Jain in the series "selftests/mm: mremap_test: Optimizations and style fixes". - Improvements to the swapping of multi-size THPs from Ryan Roberts in the series "Swap-out mTHP without splitting". - Kefeng Wang has significantly optimized the handling of arm64's permission page faults in the series "arch/mm/fault: accelerate pagefault when badaccess" "mm: remove arch's private VM_FAULT_BADMAP/BADACCESS" - GUP cleanups from David Hildenbrand in "mm/gup: consistently call it GUP-fast". - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault path to use struct vm_fault". 
  - selftests build fixes from John Hubbard in the series "Fix selftests/mm build without requiring "make headers"".

  - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the series "Improved Memory Tier Creation for CPUless NUMA Nodes". Fixes the initialization code so that migration between different memory types works as intended.

  - David Hildenbrand has improved follow_pte() and fixed an errant driver in the series "mm: follow_pte() improvements and acrn follow_pte() fixes".

  - David also did some cleanup work on large folio mapcounts in his series "mm: mapcount for large folios + page_mapcount() cleanups".

  - Folio conversions in KSM in Alex Shi's series "transfer page to folio in KSM".

  - Barry Song has added some sysfs stats for monitoring multi-size THP's in the series "mm: add per-order mTHP alloc and swpout counters".

  - Some zswap cleanups from Yosry Ahmed in the series "zswap same-filled and limit checking cleanups".

  - Matthew Wilcox has been looking at buffer_head code and found the documentation to be lacking. The series is "Improve buffer head documentation".

  - Multi-size THPs get more work, this time from Lance Yang. His series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free" optimizes the freeing of these things.

  - Kemeng Shi has added more userspace-visible writeback instrumentation in the series "Improve visibility of writeback".

  - Kemeng Shi then sent some maintenance work on top in the series "Fix and cleanups to page-writeback".

  - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in the series "Improve anon_vma scalability for anon VMAs". Intel's test bot reported an improbable 3x improvement in one test.

  - SeongJae Park adds some DAMON feature work in the series "mm/damon: add a DAMOS filter type for page granularity access recheck" and "selftests/damon: add DAMOS quota goal test".

  - Also some maintenance work in the series "mm/damon/paddr: simplify page level access re-check for pageout" and "mm/damon: misc fixes and improvements".

  - David Hildenbrand has disabled some known-to-fail selftests in the series "selftests: mm: cow: flag vmsplice() hugetlb tests as XFAIL".

  - memcg metadata storage optimizations from Shakeel Butt in "memcg: reduce memory consumption by memcg stats".
  - DAX fixes and maintenance work from Vishal Verma in the series "dax/bus.c: Fixups for dax-bus locking""

* tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits)
  memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order
  selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault
  selftests: cgroup: add tests to verify the zswap writeback path
  mm: memcg: make alloc_mem_cgroup_per_node_info() return bool
  mm/damon/core: fix return value from damos_wmark_metric_value
  mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED
  selftests: cgroup: remove redundant enabling of memory controller
  Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree
  Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT
  Docs/mm/damon/design: use a list for supported filters
  Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
  Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
  selftests/damon: classify tests for functionalities and regressions
  selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None'
  selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts
  selftests/damon/_damon_sysfs: check errors from nr_schemes file reads
  mm/damon/core: initialize ->esz_bp from damos_quota_init_priv()
  selftests/damon: add a test for DAMOS quota goal
  ...
commit 61307b7be4
@@ -314,9 +314,9 @@ Date: Dec 2022
Contact: SeongJae Park <sj@kernel.org>
Description: Writing to and reading from this file sets and gets the type of
the memory of the interest. 'anon' for anonymous pages,
'memcg' for specific memory cgroup, 'addr' for address range
(an open-ended interval), or 'target' for DAMON monitoring
target can be written and read.
'memcg' for specific memory cgroup, 'young' for young pages,
'addr' for address range (an open-ended interval), or 'target'
for DAMON monitoring target can be written and read.

What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters/<F>/memcg_path
Date: Dec 2022
@@ -0,0 +1,18 @@
What: /sys/kernel/mm/transparent_hugepage/
Date: April 2024
Contact: Linux memory management mailing list <linux-mm@kvack.org>
Description:
/sys/kernel/mm/transparent_hugepage/ contains a number of files and
subdirectories,

- defrag
- enabled
- hpage_pmd_size
- khugepaged
- shmem_enabled
- use_zero_page
- subdirectories of the form hugepages-<size>kB, where <size>
is the page size of the hugepages supported by the kernel/CPU
combination.

See Documentation/admin-guide/mm/transhuge.rst for details.
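The ABI entry above only names the files; as an illustration (not part of the commit), a minimal userspace C sketch that reads one of them, hpage_pmd_size, could look like the following — the path comes straight from the entry, everything else is assumption::

  #include <stdio.h>

  /* Read the PMD THP size exported by the sysfs directory described above. */
  int main(void)
  {
      unsigned long long sz;
      FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/hpage_pmd_size", "r");

      if (!f || fscanf(f, "%llu", &sz) != 1) {
          perror("hpage_pmd_size");
          return 1;
      }
      fclose(f);
      printf("PMD THP size: %llu bytes (%llu MiB)\n", sz, sz >> 20);
      return 0;
  }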
@@ -466,6 +466,11 @@ of equal or greater size:::
#recompress idle pages larger than 2000 bytes
echo "type=idle threshold=2000" > /sys/block/zramX/recompress

It is also possible to limit the number of pages zram re-compression will
attempt to recompress:::

echo "type=huge_idle max_pages=42" > /sys/block/zramX/recompress

Recompression of idle pages requires memory tracking.

During re-compression for every page, that matches re-compression criteria,
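For illustration (not part of the commit), the same recompress request can be issued from C; zram0 and the "type=huge_idle max_pages=42" string are assumptions taken from the example above::

  #include <fcntl.h>
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>

  /* C equivalent of: echo "type=huge_idle max_pages=42" > /sys/block/zram0/recompress */
  int main(void)
  {
      const char *cmd = "type=huge_idle max_pages=42";
      int fd = open("/sys/block/zram0/recompress", O_WRONLY);

      if (fd < 0 || write(fd, cmd, strlen(cmd)) < 0) {
          perror("recompress");
          return 1;
      }
      close(fd);
      return 0;
  }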
@@ -300,14 +300,14 @@ When oom event notifier is registered, event will be delivered.

Lock order is as follows::

Page lock (PG_locked bit of page->flags)
folio_lock
mm->page_table_lock or split pte_lock
folio_memcg_lock (memcg->move_lock)
mapping->i_pages lock
lruvec->lru_lock.

Per-node-per-memcgroup LRU (cgroup's private LRU) is guarded by
lruvec->lru_lock; PG_lru bit of page->flags is cleared before
lruvec->lru_lock; the folio LRU flag is cleared before
isolating a page from its LRU under lruvec->lru_lock.

.. _cgroup-v1-memory-kernel-extension:
@@ -802,8 +802,8 @@ a page or a swap can be moved only when it is charged to the task's current
| | anonymous pages, file pages (and swaps) in the range mmapped by the task |
| | will be moved even if the task hasn't done page fault, i.e. they might |
| | not be the task's "RSS", but other task's "RSS" that maps the same file. |
| | And mapcount of the page is ignored (the page can be moved even if |
| | page_mapcount(page) > 1). You must enable Swap Extension (see 2.4) to |
| | The mapcount of the page is ignored (the page can be moved independent |
| | of the mapcount). You must enable Swap Extension (see 2.4) to |
| | enable move of swap charges. |
+---+--------------------------------------------------------------------------+

@@ -2151,6 +2151,12 @@
Format: 0 | 1
Default set by CONFIG_INIT_ON_FREE_DEFAULT_ON.

init_mlocked_on_free= [MM] Fill freed userspace memory with zeroes if
it was mlock'ed and not explicitly munlock'ed
afterwards.
Format: 0 | 1
Default set by CONFIG_INIT_MLOCKED_ON_FREE_DEFAULT_ON

init_pkru= [X86] Specify the default memory protection keys rights
register contents for all processes. 0x55555554 by
default (disallow access to all but pkey 0). Can
@@ -153,7 +153,7 @@ Users can write below commands for the kdamond to the ``state`` file.
- ``clear_schemes_tried_regions``: Clear the DAMON-based operating scheme
action tried regions directory for each DAMON-based operation scheme of the
kdamond.
- ``update_schemes_effective_bytes``: Update the contents of
- ``update_schemes_effective_quotas``: Update the contents of
``effective_bytes`` files for each DAMON-based operation scheme of the
kdamond. For more details, refer to :ref:`quotas directory <sysfs_quotas>`.
@@ -342,7 +342,7 @@ Based on the user-specified :ref:`goal <sysfs_schemes_quota_goals>`, the
effective size quota is further adjusted. Reading ``effective_bytes`` returns
the current effective size quota. The file is not updated in real time, so
users should ask DAMON sysfs interface to update the content of the file for
the stats by writing a special keyword, ``update_schemes_effective_bytes`` to
the stats by writing a special keyword, ``update_schemes_effective_quotas`` to
the relevant ``kdamonds/<N>/state`` file.

Under ``weights`` directory, three files (``sz_permil``,
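As a hedged illustration of the sequence described above (not part of the commit), a C sketch that writes the update_schemes_effective_quotas keyword to a kdamond state file and then reads effective_bytes back; the 0 indices in the paths are placeholders::

  #include <stdio.h>

  int main(void)
  {
      /* Paths follow the sysfs layout described above; the 0 indices are placeholders. */
      const char *state = "/sys/kernel/mm/damon/admin/kdamonds/0/state";
      const char *effective =
          "/sys/kernel/mm/damon/admin/kdamonds/0/contexts/0/schemes/0/quotas/effective_bytes";
      unsigned long long bytes;
      FILE *f = fopen(state, "w");

      if (!f)
          return 1;
      /* Ask DAMON to refresh the effective_bytes files of this kdamond. */
      fputs("update_schemes_effective_quotas\n", f);
      fclose(f);

      f = fopen(effective, "r");
      if (!f || fscanf(f, "%llu", &bytes) != 1)
          return 1;
      fclose(f);
      printf("effective size quota: %llu bytes\n", bytes);
      return 0;
  }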
@@ -410,19 +410,19 @@ in the numeric order.

Each filter directory contains six files, namely ``type``, ``matcing``,
``memcg_path``, ``addr_start``, ``addr_end``, and ``target_idx``. To ``type``
file, you can write one of four special keywords: ``anon`` for anonymous pages,
``memcg`` for specific memory cgroup, ``addr`` for specific address range (an
open-ended interval), or ``target`` for specific DAMON monitoring target
filtering. In case of the memory cgroup filtering, you can specify the memory
cgroup of the interest by writing the path of the memory cgroup from the
cgroups mount point to ``memcg_path`` file. In case of the address range
filtering, you can specify the start and end address of the range to
``addr_start`` and ``addr_end`` files, respectively. For the DAMON monitoring
target filtering, you can specify the index of the target between the list of
the DAMON context's monitoring targets list to ``target_idx`` file. You can
write ``Y`` or ``N`` to ``matching`` file to filter out pages that does or does
not match to the type, respectively. Then, the scheme's action will not be
applied to the pages that specified to be filtered out.
file, you can write one of five special keywords: ``anon`` for anonymous pages,
``memcg`` for specific memory cgroup, ``young`` for young pages, ``addr`` for
specific address range (an open-ended interval), or ``target`` for specific
DAMON monitoring target filtering. In case of the memory cgroup filtering, you
can specify the memory cgroup of the interest by writing the path of the memory
cgroup from the cgroups mount point to ``memcg_path`` file. In case of the
address range filtering, you can specify the start and end address of the range
to ``addr_start`` and ``addr_end`` files, respectively. For the DAMON
monitoring target filtering, you can specify the index of the target between
the list of the DAMON context's monitoring targets list to ``target_idx`` file.
You can write ``Y`` or ``N`` to ``matching`` file to filter out pages that does
or does not match to the type, respectively. Then, the scheme's action will
not be applied to the pages that specified to be filtered out.

For example, below restricts a DAMOS action to be applied to only non-anonymous
pages of all memory cgroups except ``/having_care_already``.::
@@ -434,7 +434,7 @@ pages of all memory cgroups except ``/having_care_already``.::
# # further filter out all cgroups except one at '/having_care_already'
echo memcg > 1/type
echo /having_care_already > 1/memcg_path
echo N > 1/matching
echo Y > 1/matching

Note that ``anon`` and ``memcg`` filters are currently supported only when
``paddr`` :ref:`implementation <sysfs_context>` is being used.
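A small C sketch of the same sysfs steps (illustration only, not part of the commit): it installs a ``young`` filter in filter directory 0 and sets ``matching`` to ``Y``; the kdamond/context/scheme indices are placeholder assumptions and the filter directory is assumed to already exist::

  #include <stdio.h>

  /* Write one value to a file under the scheme's filter directory. */
  static int put(const char *dir, const char *file, const char *val)
  {
      char path[256];
      FILE *f;

      snprintf(path, sizeof(path), "%s/%s", dir, file);
      f = fopen(path, "w");
      if (!f)
          return -1;
      fputs(val, f);
      return fclose(f);
  }

  int main(void)
  {
      /* Filter 0 of scheme 0; all indices are placeholder assumptions. */
      const char *d =
          "/sys/kernel/mm/damon/admin/kdamonds/0/contexts/0/schemes/0/filters/0";

      /* 'young' is one of the five type keywords listed above; matching=Y
       * means pages that match the type are filtered out. */
      if (put(d, "type", "young") || put(d, "matching", "Y"))
          return 1;
      return 0;
  }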
@@ -376,6 +376,13 @@ Note that the number of overcommit and reserve pages remain global quantities,
as we don't know until fault time, when the faulting task's mempolicy is
applied, from which node the huge page allocation will be attempted.

The hugetlb may be migrated between the per-node hugepages pool in the following
scenarios: memory offline, memory failure, longterm pinning, syscalls(mbind,
migrate_pages and move_pages), alloc_contig_range() and alloc_contig_pages().
Now only memory offline, memory failure and syscalls allow fallbacking to allocate
a new hugetlb on a different node if the current node is unable to allocate during
hugetlb migration, that means these 3 cases can break the per-node hugepages pool.

.. _using_huge_pages:

Using Huge Pages
@@ -278,7 +278,8 @@ collapsed, resulting fewer pages being collapsed into
THPs, and lower memory access performance.

``max_ptes_shared`` specifies how many pages can be shared across multiple
processes. Exceeding the number would block the collapse::
processes. khugepaged might treat pages of THPs as shared if any page of
that THP is shared. Exceeding the number would block the collapse::

/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_shared

@@ -369,7 +370,7 @@ monitor how successfully the system is providing huge pages for use.

thp_fault_alloc
is incremented every time a huge page is successfully
allocated to handle a page fault.
allocated and charged to handle a page fault.

thp_collapse_alloc
is incremented by khugepaged when it has found
@@ -377,7 +378,7 @@ thp_collapse_alloc
successfully allocated a new huge page to store the data.

thp_fault_fallback
is incremented if a page fault fails to allocate
is incremented if a page fault fails to allocate or charge
a huge page and instead falls back to using small pages.

thp_fault_fallback_charge
@@ -447,6 +448,34 @@ thp_swpout_fallback
Usually because failed to allocate some continuous swap space
for the huge page.

In /sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/stats, There are
also individual counters for each huge page size, which can be utilized to
monitor the system's effectiveness in providing huge pages for usage. Each
counter has its own corresponding file.

anon_fault_alloc
is incremented every time a huge page is successfully
allocated and charged to handle a page fault.

anon_fault_fallback
is incremented if a page fault fails to allocate or charge
a huge page and instead falls back to using huge pages with
lower orders or small pages.

anon_fault_fallback_charge
is incremented if a page fault fails to charge a huge page and
instead falls back to using huge pages with lower orders or
small pages even though the allocation was successful.

anon_swpout
is incremented every time a huge page is swapped out in one
piece without splitting.

anon_swpout_fallback
is incremented if a huge page has to be split before swapout.
Usually because failed to allocate some continuous swap space
for the huge page.

As the system ages, allocating huge pages may be expensive as the
system uses memory compaction to copy data around memory to free a
huge page for use. There are some counters in ``/proc/vmstat`` to help
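For illustration (not part of the commit), reading a few of the per-size counters described above from C; the hugepages-2048kB directory name is an assumption (the common PMD size on x86-64), substitute the sizes present on your system::

  #include <stdio.h>

  /* Read one per-size THP counter from the stats directory described above. */
  static long long read_stat(const char *name)
  {
      char path[256];
      long long v = -1;
      FILE *f;

      snprintf(path, sizeof(path),
               "/sys/kernel/mm/transparent_hugepage/hugepages-2048kB/stats/%s", name);
      f = fopen(path, "r");
      if (f) {
          if (fscanf(f, "%lld", &v) != 1)
              v = -1;
          fclose(f);
      }
      return v;
  }

  int main(void)
  {
      printf("anon_fault_alloc:    %lld\n", read_stat("anon_fault_alloc"));
      printf("anon_fault_fallback: %lld\n", read_stat("anon_fault_fallback"));
      printf("anon_swpout:         %lld\n", read_stat("anon_swpout"));
      return 0;
  }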
@@ -111,35 +111,6 @@ checked if it is a same-value filled page before compressing it. If true, the
compressed length of the page is set to zero and the pattern or same-filled
value is stored.

Same-value filled pages identification feature is enabled by default and can be
disabled at boot time by setting the ``same_filled_pages_enabled`` attribute
to 0, e.g. ``zswap.same_filled_pages_enabled=0``. It can also be enabled and
disabled at runtime using the sysfs ``same_filled_pages_enabled``
attribute, e.g.::

echo 1 > /sys/module/zswap/parameters/same_filled_pages_enabled

When zswap same-filled page identification is disabled at runtime, it will stop
checking for the same-value filled pages during store operation.
In other words, every page will be then considered non-same-value filled.
However, the existing pages which are marked as same-value filled pages remain
stored unchanged in zswap until they are either loaded or invalidated.

In some circumstances it might be advantageous to make use of just the zswap
ability to efficiently store same-filled pages without enabling the whole
compressed page storage.
In this case the handling of non-same-value pages by zswap (enabled by default)
can be disabled by setting the ``non_same_filled_pages_enabled`` attribute
to 0, e.g. ``zswap.non_same_filled_pages_enabled=0``.
It can also be enabled and disabled at runtime using the sysfs
``non_same_filled_pages_enabled`` attribute, e.g.::

echo 1 > /sys/module/zswap/parameters/non_same_filled_pages_enabled

Disabling both ``zswap.same_filled_pages_enabled`` and
``zswap.non_same_filled_pages_enabled`` effectively disables accepting any new
pages by zswap.

To prevent zswap from shrinking pool when zswap is full and there's a high
pressure on swap (this will result in flipping pages in and out zswap pool
without any real benefit but with a performance drop for the system), a
@@ -43,6 +43,7 @@ Currently, these files are in /proc/sys/vm:
- legacy_va_layout
- lowmem_reserve_ratio
- max_map_count
- mem_profiling (only if CONFIG_MEM_ALLOC_PROFILING=y)
- memory_failure_early_kill
- memory_failure_recovery
- min_free_kbytes
@@ -425,6 +426,21 @@ e.g., up to one or two maps per allocation.
The default value is 65530.


mem_profiling
==============

Enable memory profiling (when CONFIG_MEM_ALLOC_PROFILING=y)

1: Enable memory profiling.

0: Disable memory profiling.

Enabling memory profiling introduces a small performance overhead for all
memory allocations.

The default value depends on CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT.


memory_failure_early_kill:
==========================

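A hedged C sketch (not part of the commit) of driving the sysctl documented above; it assumes a kernel built with CONFIG_MEM_ALLOC_PROFILING=y and root privileges::

  #include <stdio.h>
  #include <string.h>

  int main(void)
  {
      const char *path = "/proc/sys/vm/mem_profiling";
      char cur[8] = "";
      FILE *f = fopen(path, "r");

      if (f) {
          if (!fgets(cur, sizeof(cur), f))
              cur[0] = '\0';
          fclose(f);
      }
      cur[strcspn(cur, "\n")] = '\0';
      printf("mem_profiling currently: %s\n", cur[0] ? cur : "unavailable");

      /* Enable profiling; fails if the boot parameter was "never" (read-only). */
      f = fopen(path, "w");
      if (!f || fputs("1\n", f) == EOF) {
          perror("enable mem_profiling");
          return 1;
      }
      fclose(f);
      return 0;
  }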
@@ -471,7 +471,6 @@ Use the following commands to enable zswap::
# echo deflate-iaa > /sys/module/zswap/parameters/compressor
# echo zsmalloc > /sys/module/zswap/parameters/zpool
# echo 1 > /sys/module/zswap/parameters/enabled
# echo 0 > /sys/module/zswap/parameters/same_filled_pages_enabled
# echo 100 > /proc/sys/vm/swappiness
# echo never > /sys/kernel/mm/transparent_hugepage/enabled
# echo 1 > /proc/sys/vm/overcommit_memory
@@ -621,7 +620,6 @@ the 'fixed' compression mode::
echo deflate-iaa > /sys/module/zswap/parameters/compressor
echo zsmalloc > /sys/module/zswap/parameters/zpool
echo 1 > /sys/module/zswap/parameters/enabled
echo 0 > /sys/module/zswap/parameters/same_filled_pages_enabled

echo 100 > /proc/sys/vm/swappiness
echo never > /sys/kernel/mm/transparent_hugepage/enabled
@@ -56,9 +56,6 @@ Other Functions
.. kernel-doc:: fs/namei.c
:export:

.. kernel-doc:: fs/buffer.c
:export:

.. kernel-doc:: block/bio.c
:export:

Documentation/filesystems/buffer.rst | 12 (new file)
@@ -0,0 +1,12 @@
Buffer Heads
============

Linux uses buffer heads to maintain state about individual filesystem blocks.
Buffer heads are deprecated and new filesystems should use iomap instead.

Functions
---------

.. kernel-doc:: include/linux/buffer_head.h
.. kernel-doc:: fs/buffer.c
:export:
@@ -50,6 +50,7 @@ filesystem implementations.
.. toctree::
:maxdepth: 2

buffer
journalling
fscrypt
fsverity
@@ -688,6 +688,7 @@ files are there, and which are missing.
============ ===============================================================
File Content
============ ===============================================================
allocinfo Memory allocations profiling information
apm Advanced power management info
bootconfig Kernel command line obtained from boot config,
and, if there were kernel parameters from the
@@ -953,6 +954,34 @@ also be allocatable although a lot of filesystem metadata may have to be
reclaimed to achieve this.


allocinfo
~~~~~~~~~

Provides information about memory allocations at all locations in the code
base. Each allocation in the code is identified by its source file, line
number, module (if originates from a loadable module) and the function calling
the allocation. The number of bytes allocated and number of calls at each
location are reported.

Example output.

::

> sort -rn /proc/allocinfo
127664128 31168 mm/page_ext.c:270 func:alloc_page_ext
56373248 4737 mm/slub.c:2259 func:alloc_slab_page
14880768 3633 mm/readahead.c:247 func:page_cache_ra_unbounded
14417920 3520 mm/mm_init.c:2530 func:alloc_large_system_hash
13377536 234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs
11718656 2861 mm/filemap.c:1919 func:__filemap_get_folio
9192960 2800 kernel/fork.c:307 func:alloc_thread_stack_node
4206592 4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable
4136960 1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
3940352 962 mm/memory.c:4214 func:alloc_anon_folio
2894464 22613 fs/kernfs/dir.c:615 func:__kernfs_new_node
...


meminfo
~~~~~~~

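As an illustration of the format shown above (not part of the commit), a small C program that scans /proc/allocinfo and reports the call site currently holding the most memory; it assumes the "<bytes> <calls> <location>" layout of the example and skips anything it cannot parse::

  #include <stdio.h>
  #include <string.h>

  /* Scan /proc/allocinfo and report the allocation site owning the most bytes. */
  int main(void)
  {
      char line[512], loc[256], top_loc[256] = "";
      unsigned long long bytes, calls, top_bytes = 0;
      FILE *f = fopen("/proc/allocinfo", "r");

      if (!f) {
          perror("/proc/allocinfo");
          return 1;
      }
      while (fgets(line, sizeof(line), f)) {
          if (sscanf(line, "%llu %llu %255[^\n]", &bytes, &calls, loc) != 3)
              continue;  /* header or unrecognized line */
          if (bytes > top_bytes) {
              top_bytes = bytes;
              strcpy(top_loc, loc);
          }
      }
      fclose(f);
      printf("largest consumer: %s (%llu bytes)\n", top_loc, top_bytes);
      return 0;
  }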
Documentation/mm/allocation-profiling.rst | 100 (new file)
@@ -0,0 +1,100 @@
.. SPDX-License-Identifier: GPL-2.0

===========================
MEMORY ALLOCATION PROFILING
===========================

Low overhead (suitable for production) accounting of all memory allocations,
tracked by file and line number.

Usage:
kconfig options:
- CONFIG_MEM_ALLOC_PROFILING

- CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT

- CONFIG_MEM_ALLOC_PROFILING_DEBUG
adds warnings for allocations that weren't accounted because of a
missing annotation

Boot parameter:
sysctl.vm.mem_profiling=0|1|never

When set to "never", memory allocation profiling overhead is minimized and it
cannot be enabled at runtime (sysctl becomes read-only).
When CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y, default value is "1".
When CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n, default value is "never".

sysctl:
/proc/sys/vm/mem_profiling

Runtime info:
/proc/allocinfo

Example output::

root@moria-kvm:~# sort -g /proc/allocinfo|tail|numfmt --to=iec
2.8M 22648 fs/kernfs/dir.c:615 func:__kernfs_new_node
3.8M 953 mm/memory.c:4214 func:alloc_anon_folio
4.0M 1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
4.1M 4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable
6.0M 1532 mm/filemap.c:1919 func:__filemap_get_folio
8.8M 2785 kernel/fork.c:307 func:alloc_thread_stack_node
13M 234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs
14M 3520 mm/mm_init.c:2530 func:alloc_large_system_hash
15M 3656 mm/readahead.c:247 func:page_cache_ra_unbounded
55M 4887 mm/slub.c:2259 func:alloc_slab_page
122M 31168 mm/page_ext.c:270 func:alloc_page_ext

===================
Theory of operation
===================

Memory allocation profiling builds off of code tagging, which is a library for
declaring static structs (that typically describe a file and line number in
some way, hence code tagging) and then finding and operating on them at runtime,
- i.e. iterating over them to print them in debugfs/procfs.

To add accounting for an allocation call, we replace it with a macro
invocation, alloc_hooks(), that
- declares a code tag
- stashes a pointer to it in task_struct
- calls the real allocation function
- and finally, restores the task_struct alloc tag pointer to its previous value.

This allows for alloc_hooks() calls to be nested, with the most recent one
taking effect. This is important for allocations internal to the mm/ code that
do not properly belong to the outer allocation context and should be counted
separately: for example, slab object extension vectors, or when the slab
allocates pages from the page allocator.

Thus, proper usage requires determining which function in an allocation call
stack should be tagged. There are many helper functions that essentially wrap
e.g. kmalloc() and do a little more work, then are called in multiple places;
we'll generally want the accounting to happen in the callers of these helpers,
not in the helpers themselves.

To fix up a given helper, for example foo(), do the following:
- switch its allocation call to the _noprof() version, e.g. kmalloc_noprof()

- rename it to foo_noprof()

- define a macro version of foo() like so:

#define foo(...) alloc_hooks(foo_noprof(__VA_ARGS__))

It's also possible to stash a pointer to an alloc tag in your own data structures.

Do this when you're implementing a generic data structure that does allocations
"on behalf of" some other code - for example, the rhashtable code. This way,
instead of seeing a large line in /proc/allocinfo for rhashtable.c, we can
break it out by rhashtable type.

To do so:
- Hook your data structure's init function, like any other allocation function.

- Within your init function, use the convenience macro alloc_tag_record() to
record alloc tag in your data structure.

- Then, use the following form for your allocations:
alloc_hooks_tag(ht->your_saved_tag, kmalloc_noprof(...))
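To make the helper-wrapping recipe above concrete (illustration only, not part of the commit), a sketch for a hypothetical helper; kzalloc_noprof() and alloc_hooks() are the interfaces the document describes, while struct foo and foo_alloc() are invented for the example::

  /* Hypothetical helper converted as the steps above describe: the allocation
   * is switched to the _noprof() variant, the helper gets a _noprof suffix,
   * and a macro re-creates the old name so that each caller gets its own
   * allocation tag, i.e. its own line in /proc/allocinfo. */
  #include <linux/slab.h>
  #include <linux/alloc_tag.h>

  struct foo {
      int id;
  };

  static inline struct foo *foo_alloc_noprof(gfp_t gfp)
  {
      return kzalloc_noprof(sizeof(struct foo), gfp);
  }

  /* Callers keep using foo_alloc(); the accounting lands at their call sites. */
  #define foo_alloc(...) alloc_hooks(foo_alloc_noprof(__VA_ARGS__))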
@@ -140,7 +140,8 @@ PMD Page Table Helpers
+---------------------------+--------------------------------------------------+
| pmd_swp_clear_soft_dirty | Clears a soft dirty swapped PMD |
+---------------------------+--------------------------------------------------+
| pmd_mkinvalid | Invalidates a mapped PMD [1] |
| pmd_mkinvalid | Invalidates a present PMD; do not call for |
| | non-present PMD [1] |
+---------------------------+--------------------------------------------------+
| pmd_set_huge | Creates a PMD huge mapping |
+---------------------------+--------------------------------------------------+
@@ -196,7 +197,8 @@ PUD Page Table Helpers
+---------------------------+--------------------------------------------------+
| pud_mkdevmap | Creates a ZONE_DEVICE mapped PUD |
+---------------------------+--------------------------------------------------+
| pud_mkinvalid | Invalidates a mapped PUD [1] |
| pud_mkinvalid | Invalidates a present PUD; do not call for |
| | non-present PUD [1] |
+---------------------------+--------------------------------------------------+
| pud_set_huge | Creates a PUD huge mapping |
+---------------------------+--------------------------------------------------+

@@ -461,24 +461,32 @@ number of filters for each scheme. Each filter specifies the type of target
memory, and whether it should exclude the memory of the type (filter-out), or
all except the memory of the type (filter-in).

Currently, anonymous page, memory cgroup, address range, and DAMON monitoring
target type filters are supported by the feature. Some filter target types
require additional arguments. The memory cgroup filter type asks users to
specify the file path of the memory cgroup for the filter. The address range
type asks the start and end addresses of the range. The DAMON monitoring
target type asks the index of the target from the context's monitoring targets
list. Hence, users can apply specific schemes to only anonymous pages,
non-anonymous pages, pages of specific cgroups, all pages excluding those of
specific cgroups, pages in specific address range, pages in specific DAMON
monitoring targets, and any combination of those.
For efficient handling of filters, some types of filters are handled by the
core layer, while others are handled by operations set. In the latter case,
hence, support of the filter types depends on the DAMON operations set. In
case of the core layer-handled filters, the memory regions that excluded by the
filter are not counted as the scheme has tried to the region. In contrast, if
a memory regions is filtered by an operations set layer-handled filter, it is
counted as the scheme has tried. This difference affects the statistics.

To handle filters efficiently, the address range and DAMON monitoring target
type filters are handled by the core layer, while others are handled by
operations set. If a memory region is filtered by a core layer-handled filter,
it is not counted as the scheme has tried to the region. In contrast, if a
memory regions is filtered by an operations set layer-handled filter, it is
counted as the scheme has tried. The difference in accounting leads to changes
in the statistics.
Below types of filters are currently supported.

- anonymous page
- Applied to pages that containing data that not stored in files.
- Handled by operations set layer. Supported by only ``paddr`` set.
- memory cgroup
- Applied to pages that belonging to a given cgroup.
- Handled by operations set layer. Supported by only ``paddr`` set.
- young page
- Applied to pages that are accessed after the last access check from the
scheme.
- Handled by operations set layer. Supported by only ``paddr`` set.
- address range
- Applied to pages that belonging to a given address range.
- Handled by the core logic.
- DAMON monitoring target
- Applied to pages that belonging to a given DAMON monitoring target.
- Handled by the core logic.


Application Programming Interface
@@ -20,9 +20,10 @@ management subsystem maintainer. After more sufficient tests, the patches will
be queued in mm-stable [3]_ , and finally pull-requested to the mainline by the
memory management subsystem maintainer.

Note again the patches for review should be made against the mm-unstable
tree [1]_ whenever possible. damon/next is only for preview of others' works
in progress.
Note again the patches for mm-unstable tree [1]_ are queued by the memory
management subsystem maintainer. If the patches requires some patches in
damon/next tree [2]_ which not yet merged in mm-unstable, please make sure the
requirement is clearly specified.

Submit checklist addendum
-------------------------
@@ -48,9 +49,9 @@ Review cadence
--------------

The DAMON maintainer does the work on the usual work hour (09:00 to 17:00,
Mon-Fri) in PST. The response to patches will occasionally be slow. Do not
hesitate to send a ping if you have not heard back within a week of sending a
patch.
Mon-Fri) in PT (Pacific Time). The response to patches will occasionally be
slow. Do not hesitate to send a ping if you have not heard back within a week
of sending a patch.


.. [1] https://git.kernel.org/akpm/mm/h/mm-unstable
@@ -26,6 +26,7 @@ see the :doc:`admin guide <../admin-guide/mm/index>`.
page_cache
shmfs
oom
allocation-profiling

Legacy Documentation
====================
@@ -14,7 +14,7 @@ Page table check performs extra verifications at the time when new pages become
accessible from the userspace by getting their page table entries (PTEs PMDs
etc.) added into the table.

In case of detected corruption, the kernel is crashed. There is a small
In case of most detected corruption, the kernel is crashed. There is a small
performance and memory overhead associated with the page table check. Therefore,
it is disabled by default, but can be optionally enabled on systems where the
extra hardening outweighs the performance costs. Also, because page table check
@@ -22,6 +22,13 @@ is synchronous, it can help with debugging double map memory corruption issues,
by crashing kernel at the time wrong mapping occurs instead of later which is
often the case with memory corruptions bugs.

It can also be used to do page table entry checks over various flags, dump
warnings when illegal combinations of entry flags are detected. Currently,
userfaultfd is the only user of such to sanity check wr-protect bit against
any writable flags. Illegal flag combinations will not directly cause data
corruption in this case immediately, but that will cause read-only data to
be writable, leading to corrupt when the page content is later modified.

Double mapping detection logic
==============================

@@ -116,14 +116,14 @@ pages:
succeeds on tail pages.

- map/unmap of a PMD entry for the whole THP increment/decrement
folio->_entire_mapcount and also increment/decrement
folio->_nr_pages_mapped by ENTIRELY_MAPPED when _entire_mapcount
goes from -1 to 0 or 0 to -1.
folio->_entire_mapcount, increment/decrement folio->_large_mapcount
and also increment/decrement folio->_nr_pages_mapped by ENTIRELY_MAPPED
when _entire_mapcount goes from -1 to 0 or 0 to -1.

- map/unmap of individual pages with PTE entry increment/decrement
page->_mapcount and also increment/decrement folio->_nr_pages_mapped
when page->_mapcount goes from -1 to 0 or 0 to -1 as this counts
the number of pages mapped by PTE.
page->_mapcount, increment/decrement folio->_large_mapcount and also
increment/decrement folio->_nr_pages_mapped when page->_mapcount goes
from -1 to 0 or 0 to -1 as this counts the number of pages mapped by PTE.

split_huge_page internally has to distribute the refcounts in the head
page to the tail pages before clearing all PG_head/tail bits from the page
@@ -180,27 +180,7 @@ this correctly. There is only **one** head ``struct page``, the tail
``struct page`` with ``PG_head`` are fake head ``struct page``. We need an
approach to distinguish between those two different types of ``struct page`` so
that ``compound_head()`` can return the real head ``struct page`` when the
parameter is the tail ``struct page`` but with ``PG_head``. The following code
snippet describes how to distinguish between real and fake head ``struct page``.

.. code-block:: c

if (test_bit(PG_head, &page->flags)) {
unsigned long head = READ_ONCE(page[1].compound_head);

if (head & 1) {
if (head == (unsigned long)page + 1)
/* head struct page */
else
/* tail struct page */
} else {
/* head struct page */
}
}

We can safely access the field of the **page[1]** with ``PG_head`` because the
page is a compound page composed with at least two contiguous pages.
The implementation refers to ``page_fixed_fake_head()``.
parameter is the tail ``struct page`` but with ``PG_head``.

Device DAX
==========

@@ -260,7 +260,7 @@ The HyperSparc cpu is one example of a cpu with this property.
If D-cache aliasing is not an issue, this routine can simply be defined
as a nop on that architecture.

There is a bit in page->flags (PG_arch_1) that is "architecture private". The
There is a bit in folio->flags (PG_arch_1) that is "architecture private". The
kernel guarantees that, for page cache pages, it will clear this bit when
such a page first enters the page cache.

MAINTAINERS | 18
@@ -5360,6 +5360,14 @@ S: Supported
F: Documentation/process/code-of-conduct-interpretation.rst
F: Documentation/process/code-of-conduct.rst

CODE TAGGING
M: Suren Baghdasaryan <surenb@google.com>
M: Kent Overstreet <kent.overstreet@linux.dev>
S: Maintained
F: include/asm-generic/codetag.lds.h
F: include/linux/codetag.h
F: lib/codetag.c

COMEDI DRIVERS
M: Ian Abbott <abbotti@mev.co.uk>
M: H Hartley Sweeten <hsweeten@visionengravers.com>
@@ -14335,6 +14343,16 @@ F: mm/memblock.c
F: mm/mm_init.c
F: tools/testing/memblock/

MEMORY ALLOCATION PROFILING
M: Suren Baghdasaryan <surenb@google.com>
M: Kent Overstreet <kent.overstreet@linux.dev>
L: linux-mm@kvack.org
S: Maintained
F: Documentation/mm/allocation-profiling.rst
F: include/linux/alloc_tag.h
F: include/linux/pgalloc_tag.h
F: lib/alloc_tag.c

MEMORY CONTROLLER DRIVERS
M: Krzysztof Kozlowski <krzk@kernel.org>
L: linux-kernel@vger.kernel.org
@@ -1218,14 +1218,11 @@ static unsigned long
arch_get_unmapped_area_1(unsigned long addr, unsigned long len,
unsigned long limit)
{
struct vm_unmapped_area_info info;
struct vm_unmapped_area_info info = {};

info.flags = 0;
info.length = len;
info.low_limit = addr;
info.high_limit = limit;
info.align_mask = 0;
info.align_offset = 0;
return vm_unmapped_area(&info);
}

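The change above (and the matching arc/arm hunks further down) relies on the empty-initializer idiom zeroing every member of struct vm_unmapped_area_info, so fields added later start out as zero without touching each caller. A standalone C illustration with a stand-in struct (assumption: not the kernel's actual definition)::

  #include <assert.h>

  /* Stand-in for struct vm_unmapped_area_info: with " = {}" every member,
   * including ones added later, starts zeroed, which is what the hunk above
   * relies on when it drops the explicit flags/align_mask/align_offset stores. */
  struct unmapped_area_args {
      unsigned long flags;
      unsigned long length;
      unsigned long low_limit;
      unsigned long high_limit;
      unsigned long align_mask;
      unsigned long align_offset;
      unsigned long start_gap;  /* imagine this field was added later */
  };

  int main(void)
  {
      struct unmapped_area_args info = {};  /* zero-initialize everything */

      info.length = 4096;
      info.low_limit = 0x10000;
      info.high_limit = 0x7fffffff;

      assert(info.flags == 0 && info.align_mask == 0 && info.start_gap == 0);
      return 0;
  }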
@@ -929,7 +929,7 @@ const struct dma_map_ops alpha_pci_ops = {
.dma_supported = alpha_pci_supported,
.mmap = dma_common_mmap,
.get_sgtable = dma_common_get_sgtable,
.alloc_pages = dma_common_alloc_pages,
.alloc_pages_op = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
};
EXPORT_SYMBOL(alpha_pci_ops);
@@ -15,6 +15,7 @@
#include <net/checksum.h>

#include <asm/byteorder.h>
#include <asm/checksum.h>

static inline unsigned short from64to16(unsigned long x)
{
@@ -8,6 +8,7 @@
#include <linux/compiler.h>
#include <linux/export.h>
#include <linux/preempt.h>
#include <asm/fpu.h>
#include <asm/thread_info.h>
#include <asm/fpu.h>

@@ -9,6 +9,8 @@
#ifndef _ASM_ARC_MMU_ARCV2_H
#define _ASM_ARC_MMU_ARCV2_H

#include <soc/arc/aux.h>

/*
 * TLB Management regs
 */
@@ -27,7 +27,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma;
struct vm_unmapped_area_info info;
struct vm_unmapped_area_info info = {};

/*
 * We enforce the MAP_FIXED case.
@@ -51,11 +51,9 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
return addr;
}

info.flags = 0;
info.length = len;
info.low_limit = mm->mmap_base;
info.high_limit = TASK_SIZE;
info.align_mask = 0;
info.align_offset = pgoff << PAGE_SHIFT;
return vm_unmapped_area(&info);
}
@@ -100,7 +100,7 @@ config ARM
select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) && MMU
select HAVE_EXIT_THREAD
select HAVE_FAST_GUP if ARM_LPAE
select HAVE_GUP_FAST if ARM_LPAE
select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
select HAVE_FUNCTION_ERROR_INJECTION
select HAVE_FUNCTION_GRAPH_TRACER
@@ -15,10 +15,10 @@
#include <asm/hugetlb-3level.h>
#include <asm-generic/hugetlb.h>

static inline void arch_clear_hugepage_flags(struct page *page)
static inline void arch_clear_hugetlb_flags(struct folio *folio)
{
clear_bit(PG_dcache_clean, &page->flags);
clear_bit(PG_dcache_clean, &folio->flags);
}
#define arch_clear_hugepage_flags arch_clear_hugepage_flags
#define arch_clear_hugetlb_flags arch_clear_hugetlb_flags

#endif /* _ASM_ARM_HUGETLB_H */
@@ -213,8 +213,8 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)

#define pmd_pfn(pmd) (__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))

#define pmd_leaf(pmd) (pmd_val(pmd) & 2)
#define pmd_bad(pmd) (pmd_val(pmd) & 2)
#define pmd_leaf(pmd) (pmd_val(pmd) & PMD_TYPE_SECT)
#define pmd_bad(pmd) pmd_leaf(pmd)
#define pmd_present(pmd) (pmd_val(pmd))

#define copy_pmd(pmdpd,pmdps) \
@@ -241,7 +241,6 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
 * define empty stubs for use by pin_page_for_write.
 */
#define pmd_hugewillfault(pmd) (0)
#define pmd_thp_or_huge(pmd) (0)

#endif /* __ASSEMBLY__ */

@@ -14,6 +14,7 @@
 * + Level 1/2 descriptor
 *   - common
 */
#define PUD_TABLE_BIT (_AT(pmdval_t, 1) << 1)
#define PMD_TYPE_MASK (_AT(pmdval_t, 3) << 0)
#define PMD_TYPE_FAULT (_AT(pmdval_t, 0) << 0)
#define PMD_TYPE_TABLE (_AT(pmdval_t, 3) << 0)

@@ -112,7 +112,7 @@
#ifndef __ASSEMBLY__

#define pud_none(pud) (!pud_val(pud))
#define pud_bad(pud) (!(pud_val(pud) & 2))
#define pud_bad(pud) (!(pud_val(pud) & PUD_TABLE_BIT))
#define pud_present(pud) (pud_val(pud))
#define pmd_table(pmd) ((pmd_val(pmd) & PMD_TYPE_MASK) == \
PMD_TYPE_TABLE)
@@ -137,7 +137,7 @@ static inline pmd_t *pud_pgtable(pud_t pud)
return __va(pud_val(pud) & PHYS_MASK & (s32)PAGE_MASK);
}

#define pmd_bad(pmd) (!(pmd_val(pmd) & 2))
#define pmd_bad(pmd) (!(pmd_val(pmd) & PMD_TABLE_BIT))

#define copy_pmd(pmdpd,pmdps) \
do { \
@@ -190,7 +190,6 @@ static inline pte_t pte_mkspecial(pte_t pte)
#define pmd_dirty(pmd) (pmd_isset((pmd), L_PMD_SECT_DIRTY))

#define pmd_hugewillfault(pmd) (!pmd_young(pmd) || !pmd_write(pmd))
#define pmd_thp_or_huge(pmd) (pmd_huge(pmd) || pmd_trans_huge(pmd))

#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define pmd_trans_huge(pmd) (pmd_val(pmd) && !pmd_table(pmd))
@@ -32,6 +32,7 @@
#include <linux/kallsyms.h>
#include <linux/proc_fs.h>
#include <linux/export.h>
#include <linux/vmalloc.h>

#include <asm/hardware/cache-l2x0.h>
#include <asm/hardware/cache-uniphier.h>

@@ -26,6 +26,7 @@
#include <linux/sched/debug.h>
#include <linux/sched/task_stack.h>
#include <linux/irq.h>
#include <linux/vmalloc.h>

#include <linux/atomic.h>
#include <asm/cacheflush.h>
@@ -56,10 +56,10 @@ pin_page_for_write(const void __user *_addr, pte_t **ptep, spinlock_t **ptlp)
 * to see that it's still huge and whether or not we will
 * need to fault on write.
 */
if (unlikely(pmd_thp_or_huge(*pmd))) {
if (unlikely(pmd_leaf(*pmd))) {
ptl = &current->mm->page_table_lock;
spin_lock(ptl);
if (unlikely(!pmd_thp_or_huge(*pmd)
if (unlikely(!pmd_leaf(*pmd)
|| pmd_hugewillfault(*pmd))) {
spin_unlock(ptl);
return 0;
@@ -21,7 +21,6 @@ KASAN_SANITIZE_physaddr.o := n
obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o

obj-$(CONFIG_ALIGNMENT_TRAP) += alignment.o
obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
obj-$(CONFIG_ARM_PV_FIXUP) += pv-fixup-asm.o

obj-$(CONFIG_CPU_ABRT_NOMMU) += abort-nommu.o
@@ -226,9 +226,6 @@ void do_bad_area(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
}

#ifdef CONFIG_MMU
#define VM_FAULT_BADMAP ((__force vm_fault_t)0x010000)
#define VM_FAULT_BADACCESS ((__force vm_fault_t)0x020000)

static inline bool is_permission_fault(unsigned int fsr)
{
int fs = fsr_fs(fsr);
@@ -323,7 +320,10 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)

if (!(vma->vm_flags & vm_flags)) {
vma_end_read(vma);
goto lock_mmap;
count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
fault = 0;
code = SEGV_ACCERR;
goto bad_area;
}
fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK, regs);
if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
@@ -348,7 +348,8 @@ lock_mmap:
retry:
vma = lock_mm_and_find_vma(mm, addr, regs);
if (unlikely(!vma)) {
fault = VM_FAULT_BADMAP;
fault = 0;
code = SEGV_MAPERR;
goto bad_area;
}

@@ -356,10 +357,14 @@ retry:
 * ok, we have a good vm_area for this memory access, check the
 * permissions on the VMA allow for the fault which occurred.
 */
if (!(vma->vm_flags & vm_flags))
fault = VM_FAULT_BADACCESS;
else
fault = handle_mm_fault(vma, addr & PAGE_MASK, flags, regs);
if (!(vma->vm_flags & vm_flags)) {
mmap_read_unlock(mm);
fault = 0;
code = SEGV_ACCERR;
goto bad_area;
}

fault = handle_mm_fault(vma, addr & PAGE_MASK, flags, regs);

/* If we need to retry but a fatal signal is pending, handle the
 * signal first. We do not need to release the mmap_lock because
@@ -385,12 +390,11 @@ retry:
mmap_read_unlock(mm);
done:

/*
 * Handle the "normal" case first - VM_FAULT_MAJOR
 */
if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP | VM_FAULT_BADACCESS))))
/* Handle the "normal" case first */
if (likely(!(fault & VM_FAULT_ERROR)))
return 0;

code = SEGV_MAPERR;
bad_area:
/*
 * If we are in kernel mode at this point, we
@@ -422,8 +426,6 @@ bad_area:
 * isn't in our memory map..
 */
sig = SIGSEGV;
code = fault == VM_FAULT_BADACCESS ?
SEGV_ACCERR : SEGV_MAPERR;
}

__do_user_fault(addr, fsr, sig, code, regs);
@ -1,34 +0,0 @@
|
||||
// SPDX-License-Identifier: GPL-2.0-only
|
||||
/*
|
||||
* arch/arm/mm/hugetlbpage.c
|
||||
*
|
||||
* Copyright (C) 2012 ARM Ltd.
|
||||
*
|
||||
* Based on arch/x86/include/asm/hugetlb.h and Bill Carson's patches
|
||||
*/
|
||||
|
||||
#include <linux/init.h>
|
||||
#include <linux/fs.h>
|
||||
#include <linux/mm.h>
|
||||
#include <linux/hugetlb.h>
|
||||
#include <linux/pagemap.h>
|
||||
#include <linux/err.h>
|
||||
#include <linux/sysctl.h>
|
||||
#include <asm/mman.h>
|
||||
#include <asm/tlb.h>
|
||||
#include <asm/tlbflush.h>
|
||||
|
||||
/*
|
||||
* On ARM, huge pages are backed by pmd's rather than pte's, so we do a lot
|
||||
* of type casting from pmd_t * to pte_t *.
|
||||
*/
|
||||
|
||||
int pud_huge(pud_t pud)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
int pmd_huge(pmd_t pmd)
|
||||
{
|
||||
return pmd_val(pmd) && !(pmd_val(pmd) & PMD_TABLE_BIT);
|
||||
}
|
@ -34,7 +34,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
|
||||
struct vm_area_struct *vma;
|
||||
int do_align = 0;
|
||||
int aliasing = cache_is_vipt_aliasing();
|
||||
struct vm_unmapped_area_info info;
|
||||
struct vm_unmapped_area_info info = {};
|
||||
|
||||
/*
|
||||
* We only need to do colour alignment if either the I or D
|
||||
@ -68,7 +68,6 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
|
||||
return addr;
|
||||
}
|
||||
|
||||
info.flags = 0;
|
||||
info.length = len;
|
||||
info.low_limit = mm->mmap_base;
|
||||
info.high_limit = TASK_SIZE;
|
||||
@ -87,7 +86,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
|
||||
unsigned long addr = addr0;
|
||||
int do_align = 0;
|
||||
int aliasing = cache_is_vipt_aliasing();
|
||||
struct vm_unmapped_area_info info;
|
||||
struct vm_unmapped_area_info info = {};
|
||||
|
||||
/*
|
||||
* We only need to do colour alignment if either the I or D
|
||||
|
@@ -205,7 +205,7 @@ config ARM64
select HAVE_SAMPLE_FTRACE_DIRECT
select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_FAST_GUP
select HAVE_GUP_FAST
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_TRACER
select HAVE_FUNCTION_ERROR_INJECTION

@@ -18,11 +18,11 @@
extern bool arch_hugetlb_migration_supported(struct hstate *h);
#endif

static inline void arch_clear_hugepage_flags(struct page *page)
static inline void arch_clear_hugetlb_flags(struct folio *folio)
{
clear_bit(PG_dcache_clean, &page->flags);
clear_bit(PG_dcache_clean, &folio->flags);
}
#define arch_clear_hugepage_flags arch_clear_hugepage_flags
#define arch_clear_hugetlb_flags arch_clear_hugetlb_flags

pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags);
#define arch_make_huge_pte arch_make_huge_pte

@@ -49,12 +49,6 @@
__flush_tlb_range(vma, addr, end, PUD_SIZE, false, 1)
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */

static inline bool arch_thp_swp_supported(void)
{
return !system_supports_mte();
}
#define arch_thp_swp_supported arch_thp_swp_supported

/*
* Outside of a few very special situations (e.g. hibernation), we always
* use broadcast TLB invalidation instructions, therefore a spurious page
@@ -571,8 +565,6 @@ static inline int pmd_trans_huge(pmd_t pmd)
pte_pmd(pte_swp_clear_uffd_wp(pmd_pte(pmd)))
#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */

#define pmd_thp_or_huge(pmd) (pmd_huge(pmd) || pmd_trans_huge(pmd))

#define pmd_write(pmd) pte_write(pmd_pte(pmd))

#define pmd_mkhuge(pmd) (__pmd(pmd_val(pmd) & ~PMD_TABLE_BIT))
@@ -763,7 +755,11 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
#define pud_none(pud) (!pud_val(pud))
#define pud_bad(pud) (!pud_table(pud))
#define pud_present(pud) pte_present(pud_pte(pud))
#ifndef __PAGETABLE_PMD_FOLDED
#define pud_leaf(pud) (pud_present(pud) && !pud_table(pud))
#else
#define pud_leaf(pud) false
#endif
#define pud_valid(pud) pte_valid(pud_pte(pud))
#define pud_user(pud) pte_user(pud_pte(pud))
#define pud_user_exec(pud) pte_user_exec(pud_pte(pud))
@@ -1284,6 +1280,46 @@ static inline void __wrprotect_ptes(struct mm_struct *mm, unsigned long address,
__ptep_set_wrprotect(mm, address, ptep);
}

static inline void __clear_young_dirty_pte(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep,
pte_t pte, cydp_t flags)
{
pte_t old_pte;

do {
old_pte = pte;

if (flags & CYDP_CLEAR_YOUNG)
pte = pte_mkold(pte);
if (flags & CYDP_CLEAR_DIRTY)
pte = pte_mkclean(pte);

pte_val(pte) = cmpxchg_relaxed(&pte_val(*ptep),
pte_val(old_pte), pte_val(pte));
} while (pte_val(pte) != pte_val(old_pte));
}

static inline void __clear_young_dirty_ptes(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep,
unsigned int nr, cydp_t flags)
{
pte_t pte;

for (;;) {
pte = __ptep_get(ptep);

if (flags == (CYDP_CLEAR_YOUNG | CYDP_CLEAR_DIRTY))
__set_pte(ptep, pte_mkclean(pte_mkold(pte)));
else
__clear_young_dirty_pte(vma, addr, ptep, pte, flags);

if (--nr == 0)
break;
ptep++;
addr += PAGE_SIZE;
}
}

#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define __HAVE_ARCH_PMDP_SET_WRPROTECT
static inline void pmdp_set_wrprotect(struct mm_struct *mm,
@@ -1338,12 +1374,7 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
#ifdef CONFIG_ARM64_MTE

#define __HAVE_ARCH_PREPARE_TO_SWAP
static inline int arch_prepare_to_swap(struct page *page)
{
if (system_supports_mte())
return mte_save_tags(page);
return 0;
}
extern int arch_prepare_to_swap(struct folio *folio);

#define __HAVE_ARCH_SWAP_INVALIDATE
static inline void arch_swap_invalidate_page(int type, pgoff_t offset)
@@ -1359,11 +1390,7 @@ static inline void arch_swap_invalidate_area(int type)
}

#define __HAVE_ARCH_SWAP_RESTORE
static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
{
if (system_supports_mte())
mte_restore_tags(entry, &folio->page);
}
extern void arch_swap_restore(swp_entry_t entry, struct folio *folio);

#endif /* CONFIG_ARM64_MTE */

@@ -1450,6 +1477,9 @@ extern void contpte_wrprotect_ptes(struct mm_struct *mm, unsigned long addr,
extern int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep,
pte_t entry, int dirty);
extern void contpte_clear_young_dirty_ptes(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep,
unsigned int nr, cydp_t flags);

static __always_inline void contpte_try_fold(struct mm_struct *mm,
unsigned long addr, pte_t *ptep, pte_t pte)
@@ -1674,6 +1704,17 @@ static inline int ptep_set_access_flags(struct vm_area_struct *vma,
return contpte_ptep_set_access_flags(vma, addr, ptep, entry, dirty);
}

#define clear_young_dirty_ptes clear_young_dirty_ptes
static inline void clear_young_dirty_ptes(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep,
unsigned int nr, cydp_t flags)
{
if (likely(nr == 1 && !pte_cont(__ptep_get(ptep))))
__clear_young_dirty_ptes(vma, addr, ptep, nr, flags);
else
contpte_clear_young_dirty_ptes(vma, addr, ptep, nr, flags);
}

#else /* CONFIG_ARM64_CONTPTE */

#define ptep_get __ptep_get
@@ -1693,6 +1734,7 @@ static inline int ptep_set_access_flags(struct vm_area_struct *vma,
#define wrprotect_ptes __wrprotect_ptes
#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
#define ptep_set_access_flags __ptep_set_access_flags
#define clear_young_dirty_ptes __clear_young_dirty_ptes

#endif /* CONFIG_ARM64_CONTPTE */

@@ -10,6 +10,7 @@
#include <linux/efi.h>
#include <linux/init.h>
#include <linux/screen_info.h>
#include <linux/vmalloc.h>

#include <asm/efi.h>
#include <asm/stacktrace.h>

@@ -361,6 +361,35 @@ void contpte_wrprotect_ptes(struct mm_struct *mm, unsigned long addr,
}
EXPORT_SYMBOL_GPL(contpte_wrprotect_ptes);

void contpte_clear_young_dirty_ptes(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep,
unsigned int nr, cydp_t flags)
{
/*
* We can safely clear access/dirty without needing to unfold from
* the architectures perspective, even when contpte is set. If the
* range starts or ends midway through a contpte block, we can just
* expand to include the full contpte block. While this is not
* exactly what the core-mm asked for, it tracks access/dirty per
* folio, not per page. And since we only create a contpte block
* when it is covered by a single folio, we can get away with
* clearing access/dirty for the whole block.
*/
unsigned long start = addr;
unsigned long end = start + nr;

if (pte_cont(__ptep_get(ptep + nr - 1)))
end = ALIGN(end, CONT_PTE_SIZE);

if (pte_cont(__ptep_get(ptep))) {
start = ALIGN_DOWN(start, CONT_PTE_SIZE);
ptep = contpte_align_down(ptep);
}

__clear_young_dirty_ptes(vma, start, ptep, end - start, flags);
}
EXPORT_SYMBOL_GPL(contpte_clear_young_dirty_ptes);

int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep,
pte_t entry, int dirty)

@@ -486,25 +486,6 @@ static void do_bad_area(unsigned long far, unsigned long esr,
}
}

#define VM_FAULT_BADMAP ((__force vm_fault_t)0x010000)
#define VM_FAULT_BADACCESS ((__force vm_fault_t)0x020000)

static vm_fault_t __do_page_fault(struct mm_struct *mm,
struct vm_area_struct *vma, unsigned long addr,
unsigned int mm_flags, unsigned long vm_flags,
struct pt_regs *regs)
{
/*
* Ok, we have a good vm_area for this memory access, so we can handle
* it.
* Check that the permissions on the VMA allow for the fault which
* occurred.
*/
if (!(vma->vm_flags & vm_flags))
return VM_FAULT_BADACCESS;
return handle_mm_fault(vma, addr, mm_flags, regs);
}

static bool is_el0_instruction_abort(unsigned long esr)
{
return ESR_ELx_EC(esr) == ESR_ELx_EC_IABT_LOW;
@@ -529,6 +510,7 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
unsigned int mm_flags = FAULT_FLAG_DEFAULT;
unsigned long addr = untagged_addr(far);
struct vm_area_struct *vma;
int si_code;

if (kprobe_page_fault(regs, esr))
return 0;
@@ -588,7 +570,10 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,

if (!(vma->vm_flags & vm_flags)) {
vma_end_read(vma);
goto lock_mmap;
fault = 0;
si_code = SEGV_ACCERR;
count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
goto bad_area;
}
fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
@@ -613,12 +598,19 @@ lock_mmap:
retry:
vma = lock_mm_and_find_vma(mm, addr, regs);
if (unlikely(!vma)) {
fault = VM_FAULT_BADMAP;
goto done;
fault = 0;
si_code = SEGV_MAPERR;
goto bad_area;
}

fault = __do_page_fault(mm, vma, addr, mm_flags, vm_flags, regs);
if (!(vma->vm_flags & vm_flags)) {
mmap_read_unlock(mm);
fault = 0;
si_code = SEGV_ACCERR;
goto bad_area;
}

fault = handle_mm_fault(vma, addr, mm_flags, regs);
/* Quick path to respond to signals */
if (fault_signal_pending(fault, regs)) {
if (!user_mode(regs))
@@ -637,13 +629,12 @@ retry:
mmap_read_unlock(mm);

done:
/*
* Handle the "normal" (no error) case first.
*/
if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP |
VM_FAULT_BADACCESS))))
/* Handle the "normal" (no error) case first. */
if (likely(!(fault & VM_FAULT_ERROR)))
return 0;

si_code = SEGV_MAPERR;
bad_area:
/*
* If we are in kernel mode at this point, we have no context to
* handle this fault with.
@@ -678,13 +669,8 @@ done:

arm64_force_sig_mceerr(BUS_MCEERR_AR, far, lsb, inf->name);
} else {
/*
* Something tried to access memory that isn't in our memory
* map.
*/
arm64_force_sig_fault(SIGSEGV,
fault == VM_FAULT_BADACCESS ? SEGV_ACCERR : SEGV_MAPERR,
far, inf->name);
/* Something tried to access memory that out of memory map */
arm64_force_sig_fault(SIGSEGV, si_code, far, inf->name);
}

return 0;

@@ -79,20 +79,6 @@ bool arch_hugetlb_migration_supported(struct hstate *h)
}
#endif

int pmd_huge(pmd_t pmd)
{
return pmd_val(pmd) && !(pmd_val(pmd) & PMD_TABLE_BIT);
}

int pud_huge(pud_t pud)
{
#ifndef __PAGETABLE_PMD_FOLDED
return pud_val(pud) && !(pud_val(pud) & PUD_TABLE_BIT);
#else
return 0;
#endif
}

static int find_num_contig(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, size_t *pgsize)
{
@@ -328,7 +314,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
if (sz != PUD_SIZE && pud_none(pud))
return NULL;
/* hugepage or swap? */
if (pud_huge(pud) || !pud_present(pud))
if (pud_leaf(pud) || !pud_present(pud))
return (pte_t *)pudp;
/* table; check the next level */

@@ -340,7 +326,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
if (!(sz == PMD_SIZE || sz == CONT_PMD_SIZE) &&
pmd_none(pmd))
return NULL;
if (pmd_huge(pmd) || !pmd_present(pmd))
if (pmd_leaf(pmd) || !pmd_present(pmd))
return (pte_t *)pmdp;

if (sz == CONT_PTE_SIZE)

@@ -68,6 +68,13 @@ void mte_invalidate_tags(int type, pgoff_t offset)
mte_free_tag_storage(tags);
}

static inline void __mte_invalidate_tags(struct page *page)
{
swp_entry_t entry = page_swap_entry(page);

mte_invalidate_tags(swp_type(entry), swp_offset(entry));
}

void mte_invalidate_tags_area(int type)
{
swp_entry_t entry = swp_entry(type, 0);
@@ -83,3 +90,41 @@ void mte_invalidate_tags_area(int type)
}
xa_unlock(&mte_pages);
}

int arch_prepare_to_swap(struct folio *folio)
{
long i, nr;
int err;

if (!system_supports_mte())
return 0;

nr = folio_nr_pages(folio);

for (i = 0; i < nr; i++) {
err = mte_save_tags(folio_page(folio, i));
if (err)
goto out;
}
return 0;

out:
while (i--)
__mte_invalidate_tags(folio_page(folio, i));
return err;
}

void arch_swap_restore(swp_entry_t entry, struct folio *folio)
{
long i, nr;

if (!system_supports_mte())
return;

nr = folio_nr_pages(folio);

for (i = 0; i < nr; i++) {
mte_restore_tags(entry, folio_page(folio, i));
entry.val++;
}
}

@ -28,7 +28,12 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
|
||||
struct mm_struct *mm = current->mm;
|
||||
struct vm_area_struct *vma;
|
||||
int do_align = 0;
|
||||
struct vm_unmapped_area_info info;
|
||||
struct vm_unmapped_area_info info = {
|
||||
.length = len,
|
||||
.low_limit = mm->mmap_base,
|
||||
.high_limit = TASK_SIZE,
|
||||
.align_offset = pgoff << PAGE_SHIFT
|
||||
};
|
||||
|
||||
/*
|
||||
* We only need to do colour alignment if either the I or D
|
||||
@ -61,11 +66,6 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
|
||||
return addr;
|
||||
}
|
||||
|
||||
info.flags = 0;
|
||||
info.length = len;
|
||||
info.low_limit = mm->mmap_base;
|
||||
info.high_limit = TASK_SIZE;
|
||||
info.align_mask = do_align ? (PAGE_MASK & (SHMLBA - 1)) : 0;
|
||||
info.align_offset = pgoff << PAGE_SHIFT;
|
||||
return vm_unmapped_area(&info);
|
||||
}
|
||||
|
@ -119,7 +119,7 @@ config LOONGARCH
|
||||
select HAVE_EBPF_JIT
|
||||
select HAVE_EFFICIENT_UNALIGNED_ACCESS if !ARCH_STRICT_ALIGN
|
||||
select HAVE_EXIT_THREAD
|
||||
select HAVE_FAST_GUP
|
||||
select HAVE_GUP_FAST
|
||||
select HAVE_FTRACE_MCOUNT_RECORD
|
||||
select HAVE_FUNCTION_ARG_ACCESS_API
|
||||
select HAVE_FUNCTION_ERROR_INJECTION
|
||||
|
@ -10,6 +10,7 @@
|
||||
#define _ASM_LOONGARCH_KFENCE_H
|
||||
|
||||
#include <linux/kfence.h>
|
||||
#include <linux/vmalloc.h>
|
||||
#include <asm/pgtable.h>
|
||||
#include <asm/tlb.h>
|
||||
|
||||
|
@ -50,21 +50,11 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr,
|
||||
return (pte_t *) pmd;
|
||||
}
|
||||
|
||||
int pmd_huge(pmd_t pmd)
|
||||
{
|
||||
return (pmd_val(pmd) & _PAGE_HUGE) != 0;
|
||||
}
|
||||
|
||||
int pud_huge(pud_t pud)
|
||||
{
|
||||
return (pud_val(pud) & _PAGE_HUGE) != 0;
|
||||
}
|
||||
|
||||
uint64_t pmd_to_entrylo(unsigned long pmd_val)
|
||||
{
|
||||
uint64_t val;
|
||||
/* PMD as PTE. Must be huge page */
|
||||
if (!pmd_huge(__pmd(pmd_val)))
|
||||
if (!pmd_leaf(__pmd(pmd_val)))
|
||||
panic("%s", __func__);
|
||||
|
||||
val = pmd_val ^ _PAGE_HUGE;
|
||||
|
@ -25,7 +25,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
|
||||
struct vm_area_struct *vma;
|
||||
unsigned long addr = addr0;
|
||||
int do_color_align;
|
||||
struct vm_unmapped_area_info info;
|
||||
struct vm_unmapped_area_info info = {};
|
||||
|
||||
if (unlikely(len > TASK_SIZE))
|
||||
return -ENOMEM;
|
||||
@ -83,7 +83,6 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
|
||||
*/
|
||||
}
|
||||
|
||||
info.flags = 0;
|
||||
info.low_limit = mm->mmap_base;
|
||||
info.high_limit = TASK_SIZE;
|
||||
return vm_unmapped_area(&info);
|
||||
|
@ -68,7 +68,7 @@ config MIPS
|
||||
select HAVE_DYNAMIC_FTRACE
|
||||
select HAVE_EBPF_JIT if !CPU_MICROMIPS
|
||||
select HAVE_EXIT_THREAD
|
||||
select HAVE_FAST_GUP
|
||||
select HAVE_GUP_FAST
|
||||
select HAVE_FTRACE_MCOUNT_RECORD
|
||||
select HAVE_FUNCTION_GRAPH_TRACER
|
||||
select HAVE_FUNCTION_TRACER
|
||||
|
@ -129,7 +129,7 @@ static inline int pmd_none(pmd_t pmd)
|
||||
static inline int pmd_bad(pmd_t pmd)
|
||||
{
|
||||
#ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT
|
||||
/* pmd_huge(pmd) but inline */
|
||||
/* pmd_leaf(pmd) but inline */
|
||||
if (unlikely(pmd_val(pmd) & _PAGE_HUGE))
|
||||
return 0;
|
||||
#endif
|
||||
|
@ -245,7 +245,7 @@ static inline int pmd_none(pmd_t pmd)
|
||||
static inline int pmd_bad(pmd_t pmd)
|
||||
{
|
||||
#ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT
|
||||
/* pmd_huge(pmd) but inline */
|
||||
/* pmd_leaf(pmd) but inline */
|
||||
if (unlikely(pmd_val(pmd) & _PAGE_HUGE))
|
||||
return 0;
|
||||
#endif
|
||||
|
@ -617,7 +617,7 @@ const struct dma_map_ops jazz_dma_ops = {
|
||||
.sync_sg_for_device = jazz_dma_sync_sg_for_device,
|
||||
.mmap = dma_common_mmap,
|
||||
.get_sgtable = dma_common_get_sgtable,
|
||||
.alloc_pages = dma_common_alloc_pages,
|
||||
.alloc_pages_op = dma_common_alloc_pages,
|
||||
.free_pages = dma_common_free_pages,
|
||||
};
|
||||
EXPORT_SYMBOL(jazz_dma_ops);
|
||||
|
@ -57,13 +57,3 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr,
|
||||
}
|
||||
return (pte_t *) pmd;
|
||||
}
|
||||
|
||||
int pmd_huge(pmd_t pmd)
|
||||
{
|
||||
return (pmd_val(pmd) & _PAGE_HUGE) != 0;
|
||||
}
|
||||
|
||||
int pud_huge(pud_t pud)
|
||||
{
|
||||
return (pud_val(pud) & _PAGE_HUGE) != 0;
|
||||
}
|
||||
|
@ -34,7 +34,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
|
||||
struct vm_area_struct *vma;
|
||||
unsigned long addr = addr0;
|
||||
int do_color_align;
|
||||
struct vm_unmapped_area_info info;
|
||||
struct vm_unmapped_area_info info = {};
|
||||
|
||||
if (unlikely(len > TASK_SIZE))
|
||||
return -ENOMEM;
|
||||
@ -92,7 +92,6 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
|
||||
*/
|
||||
}
|
||||
|
||||
info.flags = 0;
|
||||
info.low_limit = mm->mmap_base;
|
||||
info.high_limit = TASK_SIZE;
|
||||
return vm_unmapped_area(&info);
|
||||
|
@ -326,7 +326,7 @@ void __update_tlb(struct vm_area_struct * vma, unsigned long address, pte_t pte)
|
||||
idx = read_c0_index();
|
||||
#ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT
|
||||
/* this could be a huge page */
|
||||
if (pmd_huge(*pmdp)) {
|
||||
if (pmd_leaf(*pmdp)) {
|
||||
unsigned long lo;
|
||||
write_c0_pagemask(PM_HUGE_MASK);
|
||||
ptep = (pte_t *)pmdp;
|
||||
|
@ -104,7 +104,9 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
|
||||
struct vm_area_struct *vma, *prev;
|
||||
unsigned long filp_pgoff;
|
||||
int do_color_align;
|
||||
struct vm_unmapped_area_info info;
|
||||
struct vm_unmapped_area_info info = {
|
||||
.length = len
|
||||
};
|
||||
|
||||
if (unlikely(len > TASK_SIZE))
|
||||
return -ENOMEM;
|
||||
@ -139,7 +141,6 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
|
||||
return addr;
|
||||
}
|
||||
|
||||
info.length = len;
|
||||
info.align_mask = do_color_align ? (PAGE_MASK & (SHM_COLOUR - 1)) : 0;
|
||||
info.align_offset = shared_align_offset(filp_pgoff, pgoff);
|
||||
|
||||
@ -160,7 +161,6 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
|
||||
*/
|
||||
}
|
||||
|
||||
info.flags = 0;
|
||||
info.low_limit = mm->mmap_base;
|
||||
info.high_limit = mmap_upper_limit(NULL);
|
||||
return vm_unmapped_area(&info);
|
||||
|
@ -180,14 +180,3 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
|
||||
}
|
||||
return changed;
|
||||
}
|
||||
|
||||
|
||||
int pmd_huge(pmd_t pmd)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
int pud_huge(pud_t pud)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
@ -237,7 +237,7 @@ config PPC
|
||||
select HAVE_DYNAMIC_FTRACE_WITH_REGS if ARCH_USING_PATCHABLE_FUNCTION_ENTRY || MPROFILE_KERNEL || PPC32
|
||||
select HAVE_EBPF_JIT
|
||||
select HAVE_EFFICIENT_UNALIGNED_ACCESS
|
||||
select HAVE_FAST_GUP
|
||||
select HAVE_GUP_FAST
|
||||
select HAVE_FTRACE_MCOUNT_RECORD
|
||||
select HAVE_FUNCTION_ARG_ACCESS_API
|
||||
select HAVE_FUNCTION_DESCRIPTORS if PPC64_ELF_ABI_V1
|
||||
|
@ -6,26 +6,6 @@
|
||||
*/
|
||||
#ifndef __ASSEMBLY__
|
||||
#ifdef CONFIG_HUGETLB_PAGE
|
||||
static inline int pmd_huge(pmd_t pmd)
|
||||
{
|
||||
/*
|
||||
* leaf pte for huge page
|
||||
*/
|
||||
if (radix_enabled())
|
||||
return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE));
|
||||
return 0;
|
||||
}
|
||||
|
||||
static inline int pud_huge(pud_t pud)
|
||||
{
|
||||
/*
|
||||
* leaf pte for huge page
|
||||
*/
|
||||
if (radix_enabled())
|
||||
return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* With radix , we have hugepage ptes in the pud and pmd entries. We don't
|
||||
* need to setup hugepage directory for them. Our pte and page directory format
|
||||
|
@ -4,31 +4,6 @@
|
||||
|
||||
#ifndef __ASSEMBLY__
|
||||
#ifdef CONFIG_HUGETLB_PAGE
|
||||
/*
|
||||
* We have PGD_INDEX_SIZ = 12 and PTE_INDEX_SIZE = 8, so that we can have
|
||||
* 16GB hugepage pte in PGD and 16MB hugepage pte at PMD;
|
||||
*
|
||||
* Defined in such a way that we can optimize away code block at build time
|
||||
* if CONFIG_HUGETLB_PAGE=n.
|
||||
*
|
||||
* returns true for pmd migration entries, THP, devmap, hugetlb
|
||||
* But compile time dependent on CONFIG_HUGETLB_PAGE
|
||||
*/
|
||||
static inline int pmd_huge(pmd_t pmd)
|
||||
{
|
||||
/*
|
||||
* leaf pte for huge page
|
||||
*/
|
||||
return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE));
|
||||
}
|
||||
|
||||
static inline int pud_huge(pud_t pud)
|
||||
{
|
||||
/*
|
||||
* leaf pte for huge page
|
||||
*/
|
||||
return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
|
||||
}
|
||||
|
||||
/*
|
||||
* With 64k page size, we have hugepage ptes in the pgd and pmd entries. We don't
|
||||
|
@ -262,6 +262,18 @@ extern unsigned long __kernel_io_end;
|
||||
|
||||
extern struct page *vmemmap;
|
||||
extern unsigned long pci_io_base;
|
||||
|
||||
#define pmd_leaf pmd_leaf
|
||||
static inline bool pmd_leaf(pmd_t pmd)
|
||||
{
|
||||
return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE));
|
||||
}
|
||||
|
||||
#define pud_leaf pud_leaf
|
||||
static inline bool pud_leaf(pud_t pud)
|
||||
{
|
||||
return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
|
||||
}
|
||||
#endif /* __ASSEMBLY__ */
|
||||
|
||||
#include <asm/book3s/64/hash.h>
|
||||
@ -1426,20 +1438,5 @@ static inline bool is_pte_rw_upgrade(unsigned long old_val, unsigned long new_va
|
||||
return false;
|
||||
}
|
||||
|
||||
/*
|
||||
* Like pmd_huge(), but works regardless of config options
|
||||
*/
|
||||
#define pmd_leaf pmd_leaf
|
||||
static inline bool pmd_leaf(pmd_t pmd)
|
||||
{
|
||||
return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE));
|
||||
}
|
||||
|
||||
#define pud_leaf pud_leaf
|
||||
static inline bool pud_leaf(pud_t pud)
|
||||
{
|
||||
return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
|
||||
}
|
||||
|
||||
#endif /* __ASSEMBLY__ */
|
||||
#endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */
|
||||
|
@ -406,9 +406,5 @@ extern void *abatron_pteptrs[2];
|
||||
#include <asm/nohash/mmu.h>
|
||||
#endif
|
||||
|
||||
#if defined(CONFIG_FA_DUMP) || defined(CONFIG_PRESERVE_FA_DUMP)
|
||||
#define __HAVE_ARCH_RESERVED_KERNEL_PAGES
|
||||
#endif
|
||||
|
||||
#endif /* __KERNEL__ */
|
||||
#endif /* _ASM_POWERPC_MMU_H_ */
|
||||
|
@ -351,16 +351,6 @@ static inline int hugepd_ok(hugepd_t hpd)
|
||||
#endif
|
||||
}
|
||||
|
||||
static inline int pmd_huge(pmd_t pmd)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
static inline int pud_huge(pud_t pud)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
#define is_hugepd(hpd) (hugepd_ok(hpd))
|
||||
#endif
|
||||
|
||||
|
@ -216,6 +216,6 @@ const struct dma_map_ops dma_iommu_ops = {
|
||||
.get_required_mask = dma_iommu_get_required_mask,
|
||||
.mmap = dma_common_mmap,
|
||||
.get_sgtable = dma_common_get_sgtable,
|
||||
.alloc_pages = dma_common_alloc_pages,
|
||||
.alloc_pages_op = dma_common_alloc_pages,
|
||||
.free_pages = dma_common_free_pages,
|
||||
};
|
||||
|
@ -1883,8 +1883,3 @@ static void __init fadump_reserve_crash_area(u64 base)
|
||||
memblock_reserve(mstart, msize);
|
||||
}
|
||||
}
|
||||
|
||||
unsigned long __init arch_reserved_kernel_pages(void)
|
||||
{
|
||||
return memblock_reserved_size() / PAGE_SIZE;
|
||||
}
|
||||
|
@ -26,6 +26,7 @@
|
||||
#include <linux/iommu.h>
|
||||
#include <linux/sched.h>
|
||||
#include <linux/debugfs.h>
|
||||
#include <linux/vmalloc.h>
|
||||
#include <asm/io.h>
|
||||
#include <asm/iommu.h>
|
||||
#include <asm/pci-bridge.h>
|
||||
|
@ -170,6 +170,7 @@ pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
|
||||
{
|
||||
unsigned long old_pmd;
|
||||
|
||||
VM_WARN_ON_ONCE(!pmd_present(*pmdp));
|
||||
old_pmd = pmd_hugepage_update(vma->vm_mm, address, pmdp, _PAGE_PRESENT, _PAGE_INVALID);
|
||||
flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
|
||||
return __pmd(old_pmd);
|
||||
|
@ -282,12 +282,10 @@ static unsigned long slice_find_area_bottomup(struct mm_struct *mm,
|
||||
{
|
||||
int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
|
||||
unsigned long found, next_end;
|
||||
struct vm_unmapped_area_info info;
|
||||
|
||||
info.flags = 0;
|
||||
info.length = len;
|
||||
info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
|
||||
info.align_offset = 0;
|
||||
struct vm_unmapped_area_info info = {
|
||||
.length = len,
|
||||
.align_mask = PAGE_MASK & ((1ul << pshift) - 1),
|
||||
};
|
||||
/*
|
||||
* Check till the allow max value for this mmap request
|
||||
*/
|
||||
@ -326,13 +324,13 @@ static unsigned long slice_find_area_topdown(struct mm_struct *mm,
|
||||
{
|
||||
int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
|
||||
unsigned long found, prev;
|
||||
struct vm_unmapped_area_info info;
|
||||
struct vm_unmapped_area_info info = {
|
||||
.flags = VM_UNMAPPED_AREA_TOPDOWN,
|
||||
.length = len,
|
||||
.align_mask = PAGE_MASK & ((1ul << pshift) - 1),
|
||||
};
|
||||
unsigned long min_addr = max(PAGE_SIZE, mmap_min_addr);
|
||||
|
||||
info.flags = VM_UNMAPPED_AREA_TOPDOWN;
|
||||
info.length = len;
|
||||
info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
|
||||
info.align_offset = 0;
|
||||
/*
|
||||
* If we are trying to allocate above DEFAULT_MAP_WINDOW
|
||||
* Add the different to the mmap_base.
|
||||
|
@ -71,23 +71,26 @@ static noinline int bad_area_nosemaphore(struct pt_regs *regs, unsigned long add
|
||||
return __bad_area_nosemaphore(regs, address, SEGV_MAPERR);
|
||||
}
|
||||
|
||||
static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code)
|
||||
static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code,
|
||||
struct mm_struct *mm, struct vm_area_struct *vma)
|
||||
{
|
||||
struct mm_struct *mm = current->mm;
|
||||
|
||||
/*
|
||||
* Something tried to access memory that isn't in our memory map..
|
||||
* Fix it, but check if it's kernel or user first..
|
||||
*/
|
||||
mmap_read_unlock(mm);
|
||||
if (mm)
|
||||
mmap_read_unlock(mm);
|
||||
else
|
||||
vma_end_read(vma);
|
||||
|
||||
return __bad_area_nosemaphore(regs, address, si_code);
|
||||
}
|
||||
|
||||
static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
|
||||
struct mm_struct *mm,
|
||||
struct vm_area_struct *vma)
|
||||
{
|
||||
struct mm_struct *mm = current->mm;
|
||||
int pkey;
|
||||
|
||||
/*
|
||||
@ -109,7 +112,10 @@ static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
|
||||
*/
|
||||
pkey = vma_pkey(vma);
|
||||
|
||||
mmap_read_unlock(mm);
|
||||
if (mm)
|
||||
mmap_read_unlock(mm);
|
||||
else
|
||||
vma_end_read(vma);
|
||||
|
||||
/*
|
||||
* If we are in kernel mode, bail out with a SEGV, this will
|
||||
@ -124,9 +130,10 @@ static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
|
||||
return 0;
|
||||
}
|
||||
|
||||
static noinline int bad_access(struct pt_regs *regs, unsigned long address)
|
||||
static noinline int bad_access(struct pt_regs *regs, unsigned long address,
|
||||
struct mm_struct *mm, struct vm_area_struct *vma)
|
||||
{
|
||||
return __bad_area(regs, address, SEGV_ACCERR);
|
||||
return __bad_area(regs, address, SEGV_ACCERR, mm, vma);
|
||||
}
|
||||
|
||||
static int do_sigbus(struct pt_regs *regs, unsigned long address,
|
||||
@ -479,13 +486,13 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
|
||||
|
||||
if (unlikely(access_pkey_error(is_write, is_exec,
|
||||
(error_code & DSISR_KEYFAULT), vma))) {
|
||||
vma_end_read(vma);
|
||||
goto lock_mmap;
|
||||
count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
|
||||
return bad_access_pkey(regs, address, NULL, vma);
|
||||
}
|
||||
|
||||
if (unlikely(access_error(is_write, is_exec, vma))) {
|
||||
vma_end_read(vma);
|
||||
goto lock_mmap;
|
||||
count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
|
||||
return bad_access(regs, address, NULL, vma);
|
||||
}
|
||||
|
||||
fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
|
||||
@ -521,10 +528,10 @@ retry:
|
||||
|
||||
if (unlikely(access_pkey_error(is_write, is_exec,
|
||||
(error_code & DSISR_KEYFAULT), vma)))
|
||||
return bad_access_pkey(regs, address, vma);
|
||||
return bad_access_pkey(regs, address, mm, vma);
|
||||
|
||||
if (unlikely(access_error(is_write, is_exec, vma)))
|
||||
return bad_access(regs, address);
|
||||
return bad_access(regs, address, mm, vma);
|
||||
|
||||
/*
|
||||
* If for any reason at all we couldn't handle the fault,
|
||||
|
@ -17,6 +17,7 @@
|
||||
#include <linux/suspend.h>
|
||||
#include <linux/dma-direct.h>
|
||||
#include <linux/execmem.h>
|
||||
#include <linux/vmalloc.h>
|
||||
|
||||
#include <asm/swiotlb.h>
|
||||
#include <asm/machdep.h>
|
||||
|
@ -102,7 +102,7 @@ struct page *p4d_page(p4d_t p4d)
|
||||
{
|
||||
if (p4d_leaf(p4d)) {
|
||||
if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
|
||||
VM_WARN_ON(!p4d_huge(p4d));
|
||||
VM_WARN_ON(!p4d_leaf(p4d));
|
||||
return pte_page(p4d_pte(p4d));
|
||||
}
|
||||
return virt_to_page(p4d_pgtable(p4d));
|
||||
@ -113,7 +113,7 @@ struct page *pud_page(pud_t pud)
|
||||
{
|
||||
if (pud_leaf(pud)) {
|
||||
if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
|
||||
VM_WARN_ON(!pud_huge(pud));
|
||||
VM_WARN_ON(!pud_leaf(pud));
|
||||
return pte_page(pud_pte(pud));
|
||||
}
|
||||
return virt_to_page(pud_pgtable(pud));
|
||||
@ -132,7 +132,7 @@ struct page *pmd_page(pmd_t pmd)
|
||||
* enabled so these checks can't be used.
|
||||
*/
|
||||
if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
|
||||
VM_WARN_ON(!(pmd_leaf(pmd) || pmd_huge(pmd)));
|
||||
VM_WARN_ON(!pmd_leaf(pmd));
|
||||
return pte_page(pmd_pte(pmd));
|
||||
}
|
||||
return virt_to_page(pmd_page_vaddr(pmd));
|
||||
|
@ -695,7 +695,7 @@ static const struct dma_map_ops ps3_sb_dma_ops = {
|
||||
.unmap_page = ps3_unmap_page,
|
||||
.mmap = dma_common_mmap,
|
||||
.get_sgtable = dma_common_get_sgtable,
|
||||
.alloc_pages = dma_common_alloc_pages,
|
||||
.alloc_pages_op = dma_common_alloc_pages,
|
||||
.free_pages = dma_common_free_pages,
|
||||
};
|
||||
|
||||
@ -709,7 +709,7 @@ static const struct dma_map_ops ps3_ioc0_dma_ops = {
|
||||
.unmap_page = ps3_unmap_page,
|
||||
.mmap = dma_common_mmap,
|
||||
.get_sgtable = dma_common_get_sgtable,
|
||||
.alloc_pages = dma_common_alloc_pages,
|
||||
.alloc_pages_op = dma_common_alloc_pages,
|
||||
.free_pages = dma_common_free_pages,
|
||||
};
|
||||
|
||||
|
@ -611,7 +611,7 @@ static const struct dma_map_ops vio_dma_mapping_ops = {
|
||||
.get_required_mask = dma_iommu_get_required_mask,
|
||||
.mmap = dma_common_mmap,
|
||||
.get_sgtable = dma_common_get_sgtable,
|
||||
.alloc_pages = dma_common_alloc_pages,
|
||||
.alloc_pages_op = dma_common_alloc_pages,
|
||||
.free_pages = dma_common_free_pages,
|
||||
};
|
||||
|
||||
|
@ -132,7 +132,7 @@ config RISCV
|
||||
select HAVE_FUNCTION_GRAPH_RETVAL if HAVE_FUNCTION_GRAPH_TRACER
|
||||
select HAVE_FUNCTION_TRACER if !XIP_KERNEL && !PREEMPTION
|
||||
select HAVE_EBPF_JIT if MMU
|
||||
select HAVE_FAST_GUP if MMU
|
||||
select HAVE_GUP_FAST if MMU
|
||||
select HAVE_FUNCTION_ARG_ACCESS_API
|
||||
select HAVE_FUNCTION_ERROR_INJECTION
|
||||
select HAVE_GCC_PLUGINS
|
||||
|
@ -5,11 +5,11 @@
|
||||
#include <asm/cacheflush.h>
|
||||
#include <asm/page.h>
|
||||
|
||||
static inline void arch_clear_hugepage_flags(struct page *page)
|
||||
static inline void arch_clear_hugetlb_flags(struct folio *folio)
|
||||
{
|
||||
clear_bit(PG_dcache_clean, &page->flags);
|
||||
clear_bit(PG_dcache_clean, &folio->flags);
|
||||
}
|
||||
#define arch_clear_hugepage_flags arch_clear_hugepage_flags
|
||||
#define arch_clear_hugetlb_flags arch_clear_hugetlb_flags
|
||||
|
||||
#ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
|
||||
bool arch_hugetlb_migration_supported(struct hstate *h);
|
||||
|
@ -651,6 +651,7 @@ static inline unsigned long pmd_pfn(pmd_t pmd)
|
||||
|
||||
#define __pud_to_phys(pud) (__page_val_to_pfn(pud_val(pud)) << PAGE_SHIFT)
|
||||
|
||||
#define pud_pfn pud_pfn
|
||||
static inline unsigned long pud_pfn(pud_t pud)
|
||||
{
|
||||
return ((__pud_to_phys(pud) & PUD_MASK) >> PAGE_SHIFT);
|
||||
|
@ -19,6 +19,7 @@
|
||||
#include <linux/libfdt.h>
|
||||
#include <linux/types.h>
|
||||
#include <linux/memblock.h>
|
||||
#include <linux/vmalloc.h>
|
||||
#include <asm/setup.h>
|
||||
|
||||
int arch_kimage_file_post_load_cleanup(struct kimage *image)
|
||||
|
@ -6,6 +6,7 @@
|
||||
#include <linux/extable.h>
|
||||
#include <linux/slab.h>
|
||||
#include <linux/stop_machine.h>
|
||||
#include <linux/vmalloc.h>
|
||||
#include <asm/ptrace.h>
|
||||
#include <linux/uaccess.h>
|
||||
#include <asm/sections.h>
|
||||
|
@ -292,7 +292,10 @@ void handle_page_fault(struct pt_regs *regs)
|
||||
|
||||
if (unlikely(access_error(cause, vma))) {
|
||||
vma_end_read(vma);
|
||||
goto lock_mmap;
|
||||
count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
|
||||
tsk->thread.bad_cause = cause;
|
||||
bad_area_nosemaphore(regs, SEGV_ACCERR, addr);
|
||||
return;
|
||||
}
|
||||
|
||||
fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK, regs);
|
||||
|
@ -399,16 +399,6 @@ static bool is_napot_size(unsigned long size)
|
||||
|
||||
#endif /*CONFIG_RISCV_ISA_SVNAPOT*/
|
||||
|
||||
int pud_huge(pud_t pud)
|
||||
{
|
||||
return pud_leaf(pud);
|
||||
}
|
||||
|
||||
int pmd_huge(pmd_t pmd)
|
||||
{
|
||||
return pmd_leaf(pmd);
|
||||
}
|
||||
|
||||
static bool __hugetlb_valid_size(unsigned long size)
|
||||
{
|
||||
if (size == HPAGE_SIZE)
|
||||
|
@ -177,7 +177,7 @@ config S390
|
||||
select HAVE_DYNAMIC_FTRACE_WITH_REGS
|
||||
select HAVE_EBPF_JIT if HAVE_MARCH_Z196_FEATURES
|
||||
select HAVE_EFFICIENT_UNALIGNED_ACCESS
|
||||
select HAVE_FAST_GUP
|
||||
select HAVE_GUP_FAST
|
||||
select HAVE_FENTRY
|
||||
select HAVE_FTRACE_MCOUNT_RECORD
|
||||
select HAVE_FUNCTION_ARG_ACCESS_API
|
||||
|
@ -39,11 +39,11 @@ static inline int prepare_hugepage_range(struct file *file,
|
||||
return 0;
|
||||
}
|
||||
|
||||
static inline void arch_clear_hugepage_flags(struct page *page)
|
||||
static inline void arch_clear_hugetlb_flags(struct folio *folio)
|
||||
{
|
||||
clear_bit(PG_arch_1, &page->flags);
|
||||
clear_bit(PG_arch_1, &folio->flags);
|
||||
}
|
||||
#define arch_clear_hugepage_flags arch_clear_hugepage_flags
|
||||
#define arch_clear_hugetlb_flags arch_clear_hugetlb_flags
|
||||
|
||||
static inline void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
|
||||
pte_t *ptep, unsigned long sz)
|
||||
|
@ -1421,6 +1421,7 @@ static inline unsigned long pud_deref(pud_t pud)
|
||||
return (unsigned long)__va(pud_val(pud) & origin_mask);
|
||||
}
|
||||
|
||||
#define pud_pfn pud_pfn
|
||||
static inline unsigned long pud_pfn(pud_t pud)
|
||||
{
|
||||
return __pa(pud_deref(pud)) >> PAGE_SHIFT;
|
||||
@ -1784,8 +1785,10 @@ static inline pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma,
|
||||
static inline pmd_t pmdp_invalidate(struct vm_area_struct *vma,
|
||||
unsigned long addr, pmd_t *pmdp)
|
||||
{
|
||||
pmd_t pmd = __pmd(pmd_val(*pmdp) | _SEGMENT_ENTRY_INVALID);
|
||||
pmd_t pmd;
|
||||
|
||||
VM_WARN_ON_ONCE(!pmd_present(*pmdp));
|
||||
pmd = __pmd(pmd_val(*pmdp) | _SEGMENT_ENTRY_INVALID);
|
||||
return pmdp_xchg_direct(vma->vm_mm, addr, pmdp, pmd);
|
||||
}
|
||||
|
||||
|
@ -21,6 +21,7 @@
|
||||
#include <linux/seq_file.h>
|
||||
#include <linux/slab.h>
|
||||
#include <linux/sysfs.h>
|
||||
#include <linux/vmalloc.h>
|
||||
#include <crypto/sha2.h>
|
||||
#include <keys/user-type.h>
|
||||
#include <asm/debug.h>
|
||||
|
@ -20,6 +20,7 @@
|
||||
#include <linux/gfp.h>
|
||||
#include <linux/crash_dump.h>
|
||||
#include <linux/debug_locks.h>
|
||||
#include <linux/vmalloc.h>
|
||||
#include <asm/asm-extable.h>
|
||||
#include <asm/diag.h>
|
||||
#include <asm/ipl.h>
|
||||
|
@ -325,7 +325,8 @@ static void do_exception(struct pt_regs *regs, int access)
|
||||
goto lock_mmap;
|
||||
if (!(vma->vm_flags & access)) {
|
||||
vma_end_read(vma);
|
||||
goto lock_mmap;
|
||||
count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
|
||||
return handle_fault_error_nolock(regs, SEGV_ACCERR);
|
||||
}
|
||||
fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
|
||||
if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
|
||||
|
@ -233,16 +233,6 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
|
||||
return (pte_t *) pmdp;
|
||||
}
|
||||
|
||||
int pmd_huge(pmd_t pmd)
|
||||
{
|
||||
return pmd_leaf(pmd);
|
||||
}
|
||||
|
||||
int pud_huge(pud_t pud)
|
||||
{
|
||||
return pud_leaf(pud);
|
||||
}
|
||||
|
||||
bool __init arch_hugetlb_valid_size(unsigned long size)
|
||||
{
|
||||
if (MACHINE_HAS_EDAT1 && size == PMD_SIZE)
|
||||
@ -258,14 +248,12 @@ static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *file,
|
||||
unsigned long pgoff, unsigned long flags)
|
||||
{
|
||||
struct hstate *h = hstate_file(file);
|
||||
struct vm_unmapped_area_info info;
|
||||
struct vm_unmapped_area_info info = {};
|
||||
|
||||
info.flags = 0;
|
||||
info.length = len;
|
||||
info.low_limit = current->mm->mmap_base;
|
||||
info.high_limit = TASK_SIZE;
|
||||
info.align_mask = PAGE_MASK & ~huge_page_mask(h);
|
||||
info.align_offset = 0;
|
||||
return vm_unmapped_area(&info);
|
||||
}
|
||||
|
||||
@ -274,7 +262,7 @@ static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file,
|
||||
unsigned long pgoff, unsigned long flags)
|
||||
{
|
||||
struct hstate *h = hstate_file(file);
|
||||
struct vm_unmapped_area_info info;
|
||||
struct vm_unmapped_area_info info = {};
|
||||
unsigned long addr;
|
||||
|
||||
info.flags = VM_UNMAPPED_AREA_TOPDOWN;
|
||||
@ -282,7 +270,6 @@ static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file,
|
||||
info.low_limit = PAGE_SIZE;
|
||||
info.high_limit = current->mm->mmap_base;
|
||||
info.align_mask = PAGE_MASK & ~huge_page_mask(h);
|
||||
info.align_offset = 0;
|
||||
addr = vm_unmapped_area(&info);
|
||||
|
||||
/*
|
||||
@ -328,7 +315,7 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
|
||||
goto check_asce_limit;
|
||||
}
|
||||
|
||||
if (mm->get_unmapped_area == arch_get_unmapped_area)
|
||||
if (!test_bit(MMF_TOPDOWN, &mm->flags))
|
||||
addr = hugetlb_get_unmapped_area_bottomup(file, addr, len,
|
||||
pgoff, flags);
|
||||
else
|
||||
|
@ -86,7 +86,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
|
||||
{
|
||||
struct mm_struct *mm = current->mm;
|
||||
struct vm_area_struct *vma;
|
||||
struct vm_unmapped_area_info info;
|
||||
struct vm_unmapped_area_info info = {};
|
||||
|
||||
if (len > TASK_SIZE - mmap_min_addr)
|
||||
return -ENOMEM;
|
||||
@ -102,7 +102,6 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
|
||||
goto check_asce_limit;
|
||||
}
|
||||
|
||||
info.flags = 0;
|
||||
info.length = len;
|
||||
info.low_limit = mm->mmap_base;
|
||||
info.high_limit = TASK_SIZE;
|
||||
@ -122,7 +121,7 @@ unsigned long arch_get_unmapped_area_topdown(struct file *filp, unsigned long ad
|
||||
{
|
||||
struct vm_area_struct *vma;
|
||||
struct mm_struct *mm = current->mm;
|
||||
struct vm_unmapped_area_info info;
|
||||
struct vm_unmapped_area_info info = {};
|
||||
|
||||
/* requested length too big for entire address space */
|
||||
if (len > TASK_SIZE - mmap_min_addr)
|
||||
@ -185,10 +184,10 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
|
||||
*/
|
||||
if (mmap_is_legacy(rlim_stack)) {
|
||||
mm->mmap_base = mmap_base_legacy(random_factor);
|
||||
mm->get_unmapped_area = arch_get_unmapped_area;
|
||||
clear_bit(MMF_TOPDOWN, &mm->flags);
|
||||
} else {
|
||||
mm->mmap_base = mmap_base(random_factor, rlim_stack);
|
||||
mm->get_unmapped_area = arch_get_unmapped_area_topdown;
|
||||
set_bit(MMF_TOPDOWN, &mm->flags);
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -169,7 +169,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
|
||||
if (!(vma->vm_flags & VM_WRITE))
|
||||
goto out_unlock_mmap;
|
||||
|
||||
ret = follow_pte(vma->vm_mm, mmio_addr, &ptep, &ptl);
|
||||
ret = follow_pte(vma, mmio_addr, &ptep, &ptl);
|
||||
if (ret)
|
||||
goto out_unlock_mmap;
|
||||
|
||||
@ -308,7 +308,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
|
||||
if (!(vma->vm_flags & VM_WRITE))
|
||||
goto out_unlock_mmap;
|
||||
|
||||
ret = follow_pte(vma->vm_mm, mmio_addr, &ptep, &ptl);
|
||||
ret = follow_pte(vma, mmio_addr, &ptep, &ptl);
|
||||
if (ret)
|
||||
goto out_unlock_mmap;
|
||||
|
||||
|
@ -38,7 +38,7 @@ config SUPERH
|
||||
select HAVE_DEBUG_BUGVERBOSE
|
||||
select HAVE_DEBUG_KMEMLEAK
|
||||
select HAVE_DYNAMIC_FTRACE
|
||||
select HAVE_FAST_GUP if MMU
|
||||
select HAVE_GUP_FAST if MMU
|
||||
select HAVE_FUNCTION_GRAPH_TRACER
|
||||
select HAVE_FUNCTION_TRACER
|
||||
select HAVE_FTRACE_MCOUNT_RECORD
|
||||
|
@ -27,11 +27,11 @@ static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
|
||||
return *ptep;
|
||||
}
|
||||
|
||||
static inline void arch_clear_hugepage_flags(struct page *page)
|
||||
static inline void arch_clear_hugetlb_flags(struct folio *folio)
|
||||
{
|
||||
clear_bit(PG_dcache_clean, &page->flags);
|
||||
clear_bit(PG_dcache_clean, &folio->flags);
|
||||
}
|
||||
#define arch_clear_hugepage_flags arch_clear_hugepage_flags
|
||||
#define arch_clear_hugetlb_flags arch_clear_hugetlb_flags
|
||||
|
||||
#include <asm-generic/hugetlb.h>
|
||||
|
||||
|
@ -241,13 +241,14 @@ static void sh4_flush_cache_page(void *args)
|
||||
if ((vma->vm_mm == current->active_mm))
|
||||
vaddr = NULL;
|
||||
else {
|
||||
struct folio *folio = page_folio(page);
|
||||
/*
|
||||
* Use kmap_coherent or kmap_atomic to do flushes for
|
||||
* another ASID than the current one.
|
||||
*/
|
||||
map_coherent = (current_cpu_data.dcache.n_aliases &&
|
||||
test_bit(PG_dcache_clean, &page->flags) &&
|
||||
page_mapcount(page));
|
||||
test_bit(PG_dcache_clean, folio_flags(folio, 0)) &&
|
||||
page_mapped(page));
|
||||
if (map_coherent)
|
||||
vaddr = kmap_coherent(page, address);
|
||||
else
|
||||
|
Some files were not shown because too many files have changed in this diff.