Merge tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull mm updates from Andrew Morton:
 "The usual shower of singleton fixes and minor series all over MM,
  documented (hopefully adequately) in the respective changelogs.
  Notable series include:

   - Lucas Stach has provided some page-mapping cleanup/consolidation/
     maintainability work in the series "mm/treewide: Remove pXd_huge()
     API".

   - In the series "Allow migrate on protnone reference with
     MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
     MPOL_PREFERRED_MANY mode, yielding almost doubled performance in
     one test.

   - In their series "Memory allocation profiling" Kent Overstreet and
     Suren Baghdasaryan have contributed a means of determining (via
     /proc/allocinfo) whereabouts in the kernel memory is being
     allocated: number of calls and amount of memory.

   - Matthew Wilcox has provided the series "Various significant MM
     patches" which does a number of rather unrelated things, but in
     largely similar code sites.

   - In his series "mm: page_alloc: freelist migratetype hygiene"
     Johannes Weiner has fixed the page allocator's handling of
     migratetype requests, with resulting improvements in compaction
     efficiency.

   - In the series "make the hugetlb migration strategy consistent"
     Baolin Wang has fixed a hugetlb migration issue, which should
     improve hugetlb allocation reliability.

   - Liu Shixin has hit an I/O meltdown caused by readahead in a
     memory-tight memcg. Addressed in the series "Fix I/O high when
     memory almost met memcg limit".

   - In the series "mm/filemap: optimize folio adding and splitting"
     Kairui Song has optimized pagecache insertion, yielding ~10%
     performance improvement in one test.

   - Baoquan He has cleaned up and consolidated the early zone
     initialization code in the series "mm/mm_init.c: refactor
     free_area_init_core()".

   - Baoquan has also redone some MM initialization code in the series
     "mm/init: minor clean up and improvement".

   - MM helper cleanups from Christoph Hellwig in his series "remove
     follow_pfn".

   - More cleanups from Matthew Wilcox in the series "Various
     page->flags cleanups".

   - Vlastimil Babka has contributed maintainability improvements in the
     series "memcg_kmem hooks refactoring".

   - More folio conversions and cleanups in Matthew Wilcox's series:
	"Convert huge_zero_page to huge_zero_folio"
	"khugepaged folio conversions"
	"Remove page_idle and page_young wrappers"
	"Use folio APIs in procfs"
	"Clean up __folio_put()"
	"Some cleanups for memory-failure"
	"Remove page_mapping()"
	"More folio compat code removal"

   - David Hildenbrand chipped in with "fs/proc/task_mmu: convert
     hugetlb functions to work on folios".

   - Code consolidation and cleanup work related to GUP's handling of
     hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".

   - Rick Edgecombe has developed some fixes to stack guard gaps in the
     series "Cover a guard gap corner case".

   - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the
     series "mm/ksm: fix ksm exec support for prctl".

   - Baolin Wang has implemented NUMA balancing for multi-size THPs.
     This is a simple first-cut implementation for now. The series is
     "support multi-size THP numa balancing".

   - Cleanups to vma handling helper functions from Matthew Wilcox in
     the series "Unify vma_address and vma_pgoff_address".

   - Some selftests maintenance work from Dev Jain in the series
     "selftests/mm: mremap_test: Optimizations and style fixes".

   - Improvements to the swapping of multi-size THPs from Ryan Roberts
     in the series "Swap-out mTHP without splitting".

   - Kefeng Wang has significantly optimized the handling of arm64's
     permission page faults in the series
	"arch/mm/fault: accelerate pagefault when badaccess"
	"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"

   - GUP cleanups from David Hildenbrand in "mm/gup: consistently call
     it GUP-fast".

   - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault
     path to use struct vm_fault".

   - selftests build fixes from John Hubbard in the series "Fix
     selftests/mm build without requiring "make headers"".

   - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
     series "Improved Memory Tier Creation for CPUless NUMA Nodes".
     Fixes the initialization code so that migration between different
     memory types works as intended.

   - David Hildenbrand has improved follow_pte() and fixed an errant
     driver in the series "mm: follow_pte() improvements and acrn
     follow_pte() fixes".

   - David also did some cleanup work on large folio mapcounts in his
     series "mm: mapcount for large folios + page_mapcount() cleanups".

   - Folio conversions in KSM in Alex Shi's series "transfer page to
     folio in KSM".

   - Barry Song has added some sysfs stats for monitoring multi-size
     THPs in the series "mm: add per-order mTHP alloc and swpout
     counters".

   - Some zswap cleanups from Yosry Ahmed in the series "zswap
     same-filled and limit checking cleanups".

   - Matthew Wilcox has been looking at buffer_head code and found the
     documentation to be lacking. The series is "Improve buffer head
     documentation".

   - Multi-size THPs get more work, this time from Lance Yang. His
     series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free"
     optimizes the freeing of these things.

   - Kemeng Shi has added more userspace-visible writeback
     instrumentation in the series "Improve visibility of writeback".

   - Kemeng Shi then sent some maintenance work on top in the series
     "Fix and cleanups to page-writeback".

   - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in
     the series "Improve anon_vma scalability for anon VMAs". Intel's
     test bot reported an improbable 3x improvement in one test.

   - SeongJae Park adds some DAMON feature work in the series
	"mm/damon: add a DAMOS filter type for page granularity access recheck"
	"selftests/damon: add DAMOS quota goal test"

   - Also some maintenance work in the series
	"mm/damon/paddr: simplify page level access re-check for pageout"
	"mm/damon: misc fixes and improvements"

   - David Hildenbrand has disabled some known-to-fail selftests in the
     series "selftests: mm: cow: flag vmsplice() hugetlb tests as
     XFAIL".

   - memcg metadata storage optimizations from Shakeel Butt in "memcg:
     reduce memory consumption by memcg stats".

   - DAX fixes and maintenance work from Vishal Verma in the series
     "dax/bus.c: Fixups for dax-bus locking""

* tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits)
  memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order
  selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault
  selftests: cgroup: add tests to verify the zswap writeback path
  mm: memcg: make alloc_mem_cgroup_per_node_info() return bool
  mm/damon/core: fix return value from damos_wmark_metric_value
  mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED
  selftests: cgroup: remove redundant enabling of memory controller
  Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree
  Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT
  Docs/mm/damon/design: use a list for supported filters
  Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
  Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
  selftests/damon: classify tests for functionalities and regressions
  selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None'
  selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts
  selftests/damon/_damon_sysfs: check errors from nr_schemes file reads
  mm/damon/core: initialize ->esz_bp from damos_quota_init_priv()
  selftests/damon: add a test for DAMOS quota goal
  ...
Merged by Linus Torvalds on 2024-05-19 09:21:03 -07:00 in commit 61307b7be4.
387 changed files with 9728 additions and 5760 deletions.

View File

@ -314,9 +314,9 @@ Date: Dec 2022
 Contact:	SeongJae Park <sj@kernel.org>
 Description:	Writing to and reading from this file sets and gets the type of
 		the memory of the interest. 'anon' for anonymous pages,
-		'memcg' for specific memory cgroup, 'addr' for address range
-		(an open-ended interval), or 'target' for DAMON monitoring
-		target can be written and read.
+		'memcg' for specific memory cgroup, 'young' for young pages,
+		'addr' for address range (an open-ended interval), or 'target'
+		for DAMON monitoring target can be written and read.

 What:		/sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters/<F>/memcg_path
 Date:		Dec 2022

View File

@ -0,0 +1,18 @@
What: /sys/kernel/mm/transparent_hugepage/
Date: April 2024
Contact: Linux memory management mailing list <linux-mm@kvack.org>
Description:
/sys/kernel/mm/transparent_hugepage/ contains a number of files and
subdirectories,
- defrag
- enabled
- hpage_pmd_size
- khugepaged
- shmem_enabled
- use_zero_page
- subdirectories of the form hugepages-<size>kB, where <size>
is the page size of the hugepages supported by the kernel/CPU
combination.
See Documentation/admin-guide/mm/transhuge.rst for details.

View File

@ -466,6 +466,11 @@ of equal or greater size:::
#recompress idle pages larger than 2000 bytes #recompress idle pages larger than 2000 bytes
echo "type=idle threshold=2000" > /sys/block/zramX/recompress echo "type=idle threshold=2000" > /sys/block/zramX/recompress
It is also possible to limit the number of pages zram re-compression will
attempt to recompress:::
echo "type=huge_idle max_pages=42" > /sys/block/zramX/recompress
Recompression of idle pages requires memory tracking. Recompression of idle pages requires memory tracking.
During re-compression for every page, that matches re-compression criteria, During re-compression for every page, that matches re-compression criteria,

View File

@ -300,14 +300,14 @@ When oom event notifier is registered, event will be delivered.
 Lock order is as follows::

-  Page lock (PG_locked bit of page->flags)
+  folio_lock
     mm->page_table_lock or split pte_lock
       folio_memcg_lock (memcg->move_lock)
         mapping->i_pages lock
           lruvec->lru_lock.

 Per-node-per-memcgroup LRU (cgroup's private LRU) is guarded by
-lruvec->lru_lock; PG_lru bit of page->flags is cleared before
+lruvec->lru_lock; the folio LRU flag is cleared before
 isolating a page from its LRU under lruvec->lru_lock.

 .. _cgroup-v1-memory-kernel-extension:

@ -802,8 +802,8 @@ a page or a swap can be moved only when it is charged to the task's current
 |   | anonymous pages, file pages (and swaps) in the range mmapped by the task |
 |   | will be moved even if the task hasn't done page fault, i.e. they might   |
 |   | not be the task's "RSS", but other task's "RSS" that maps the same file. |
-|   | And mapcount of the page is ignored (the page can be moved even if       |
-|   | page_mapcount(page) > 1). You must enable Swap Extension (see 2.4) to    |
+|   | The mapcount of the page is ignored (the page can be moved independent   |
+|   | of the mapcount). You must enable Swap Extension (see 2.4) to            |
 |   | enable move of swap charges.                                             |
 +---+--------------------------------------------------------------------------+

View File

@ -2151,6 +2151,12 @@
Format: 0 | 1 Format: 0 | 1
Default set by CONFIG_INIT_ON_FREE_DEFAULT_ON. Default set by CONFIG_INIT_ON_FREE_DEFAULT_ON.
init_mlocked_on_free= [MM] Fill freed userspace memory with zeroes if
it was mlock'ed and not explicitly munlock'ed
afterwards.
Format: 0 | 1
Default set by CONFIG_INIT_MLOCKED_ON_FREE_DEFAULT_ON
init_pkru= [X86] Specify the default memory protection keys rights init_pkru= [X86] Specify the default memory protection keys rights
register contents for all processes. 0x55555554 by register contents for all processes. 0x55555554 by
default (disallow access to all but pkey 0). Can default (disallow access to all but pkey 0). Can

View File

@ -153,7 +153,7 @@ Users can write below commands for the kdamond to the ``state`` file.
 - ``clear_schemes_tried_regions``: Clear the DAMON-based operating scheme
   action tried regions directory for each DAMON-based operation scheme of the
   kdamond.
-- ``update_schemes_effective_bytes``: Update the contents of
+- ``update_schemes_effective_quotas``: Update the contents of
   ``effective_bytes`` files for each DAMON-based operation scheme of the
   kdamond. For more details, refer to :ref:`quotas directory <sysfs_quotas>`.

@ -342,7 +342,7 @@ Based on the user-specified :ref:`goal <sysfs_schemes_quota_goals>`, the
 effective size quota is further adjusted. Reading ``effective_bytes`` returns
 the current effective size quota. The file is not updated in real time, so
 users should ask DAMON sysfs interface to update the content of the file for
-the stats by writing a special keyword, ``update_schemes_effective_bytes`` to
+the stats by writing a special keyword, ``update_schemes_effective_quotas`` to
 the relevant ``kdamonds/<N>/state`` file.

 Under ``weights`` directory, three files (``sz_permil``,

@ -410,19 +410,19 @@ in the numeric order.
 Each filter directory contains six files, namely ``type``, ``matcing``,
 ``memcg_path``, ``addr_start``, ``addr_end``, and ``target_idx``. To ``type``
-file, you can write one of four special keywords: ``anon`` for anonymous pages,
-``memcg`` for specific memory cgroup, ``addr`` for specific address range (an
-open-ended interval), or ``target`` for specific DAMON monitoring target
-filtering. In case of the memory cgroup filtering, you can specify the memory
-cgroup of the interest by writing the path of the memory cgroup from the
-cgroups mount point to ``memcg_path`` file. In case of the address range
-filtering, you can specify the start and end address of the range to
-``addr_start`` and ``addr_end`` files, respectively. For the DAMON monitoring
-target filtering, you can specify the index of the target between the list of
-the DAMON context's monitoring targets list to ``target_idx`` file. You can
-write ``Y`` or ``N`` to ``matching`` file to filter out pages that does or does
-not match to the type, respectively. Then, the scheme's action will not be
-applied to the pages that specified to be filtered out.
+file, you can write one of five special keywords: ``anon`` for anonymous pages,
+``memcg`` for specific memory cgroup, ``young`` for young pages, ``addr`` for
+specific address range (an open-ended interval), or ``target`` for specific
+DAMON monitoring target filtering. In case of the memory cgroup filtering, you
+can specify the memory cgroup of the interest by writing the path of the memory
+cgroup from the cgroups mount point to ``memcg_path`` file. In case of the
+address range filtering, you can specify the start and end address of the range
+to ``addr_start`` and ``addr_end`` files, respectively. For the DAMON
+monitoring target filtering, you can specify the index of the target between
+the list of the DAMON context's monitoring targets list to ``target_idx`` file.
+You can write ``Y`` or ``N`` to ``matching`` file to filter out pages that does
+or does not match to the type, respectively. Then, the scheme's action will
+not be applied to the pages that specified to be filtered out.

 For example, below restricts a DAMOS action to be applied to only non-anonymous
 pages of all memory cgroups except ``/having_care_already``.::

@ -434,7 +434,7 @@ pages of all memory cgroups except ``/having_care_already``.::
     # # further filter out all cgroups except one at '/having_care_already'
     echo memcg > 1/type
     echo /having_care_already > 1/memcg_path
-    echo N > 1/matching
+    echo Y > 1/matching

 Note that ``anon`` and ``memcg`` filters are currently supported only when
 ``paddr`` :ref:`implementation <sysfs_context>` is being used.

View File

@ -376,6 +376,13 @@ Note that the number of overcommit and reserve pages remain global quantities,
as we don't know until fault time, when the faulting task's mempolicy is as we don't know until fault time, when the faulting task's mempolicy is
applied, from which node the huge page allocation will be attempted. applied, from which node the huge page allocation will be attempted.
The hugetlb may be migrated between the per-node hugepages pool in the following
scenarios: memory offline, memory failure, longterm pinning, syscalls(mbind,
migrate_pages and move_pages), alloc_contig_range() and alloc_contig_pages().
Now only memory offline, memory failure and syscalls allow fallbacking to allocate
a new hugetlb on a different node if the current node is unable to allocate during
hugetlb migration, that means these 3 cases can break the per-node hugepages pool.
.. _using_huge_pages: .. _using_huge_pages:
Using Huge Pages Using Huge Pages

View File

@ -278,7 +278,8 @@ collapsed, resulting fewer pages being collapsed into
THPs, and lower memory access performance. THPs, and lower memory access performance.
``max_ptes_shared`` specifies how many pages can be shared across multiple ``max_ptes_shared`` specifies how many pages can be shared across multiple
processes. Exceeding the number would block the collapse:: processes. khugepaged might treat pages of THPs as shared if any page of
that THP is shared. Exceeding the number would block the collapse::
/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_shared /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_shared
@ -369,7 +370,7 @@ monitor how successfully the system is providing huge pages for use.
thp_fault_alloc thp_fault_alloc
is incremented every time a huge page is successfully is incremented every time a huge page is successfully
allocated to handle a page fault. allocated and charged to handle a page fault.
thp_collapse_alloc thp_collapse_alloc
is incremented by khugepaged when it has found is incremented by khugepaged when it has found
@ -377,7 +378,7 @@ thp_collapse_alloc
successfully allocated a new huge page to store the data. successfully allocated a new huge page to store the data.
thp_fault_fallback thp_fault_fallback
is incremented if a page fault fails to allocate is incremented if a page fault fails to allocate or charge
a huge page and instead falls back to using small pages. a huge page and instead falls back to using small pages.
thp_fault_fallback_charge thp_fault_fallback_charge
@ -447,6 +448,34 @@ thp_swpout_fallback
Usually because failed to allocate some continuous swap space Usually because failed to allocate some continuous swap space
for the huge page. for the huge page.
In /sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/stats, There are
also individual counters for each huge page size, which can be utilized to
monitor the system's effectiveness in providing huge pages for usage. Each
counter has its own corresponding file.
anon_fault_alloc
is incremented every time a huge page is successfully
allocated and charged to handle a page fault.
anon_fault_fallback
is incremented if a page fault fails to allocate or charge
a huge page and instead falls back to using huge pages with
lower orders or small pages.
anon_fault_fallback_charge
is incremented if a page fault fails to charge a huge page and
instead falls back to using huge pages with lower orders or
small pages even though the allocation was successful.
anon_swpout
is incremented every time a huge page is swapped out in one
piece without splitting.
anon_swpout_fallback
is incremented if a huge page has to be split before swapout.
Usually because failed to allocate some continuous swap space
for the huge page.
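As a quick usage sketch (not part of the patch itself; ``hugepages-2048kB`` is just the common PMD size on x86-64 and will differ elsewhere), these per-size counters can be read like any other sysfs file::

    # Read one counter for 2 MiB pages, then compare another across all sizes.
    cat /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/stats/anon_fault_alloc
    grep . /sys/kernel/mm/transparent_hugepage/hugepages-*kB/stats/anon_swpout_fallback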
As the system ages, allocating huge pages may be expensive as the As the system ages, allocating huge pages may be expensive as the
system uses memory compaction to copy data around memory to free a system uses memory compaction to copy data around memory to free a
huge page for use. There are some counters in ``/proc/vmstat`` to help huge page for use. There are some counters in ``/proc/vmstat`` to help

View File

@ -111,35 +111,6 @@ checked if it is a same-value filled page before compressing it. If true, the
compressed length of the page is set to zero and the pattern or same-filled compressed length of the page is set to zero and the pattern or same-filled
value is stored. value is stored.
Same-value filled pages identification feature is enabled by default and can be
disabled at boot time by setting the ``same_filled_pages_enabled`` attribute
to 0, e.g. ``zswap.same_filled_pages_enabled=0``. It can also be enabled and
disabled at runtime using the sysfs ``same_filled_pages_enabled``
attribute, e.g.::
echo 1 > /sys/module/zswap/parameters/same_filled_pages_enabled
When zswap same-filled page identification is disabled at runtime, it will stop
checking for the same-value filled pages during store operation.
In other words, every page will be then considered non-same-value filled.
However, the existing pages which are marked as same-value filled pages remain
stored unchanged in zswap until they are either loaded or invalidated.
In some circumstances it might be advantageous to make use of just the zswap
ability to efficiently store same-filled pages without enabling the whole
compressed page storage.
In this case the handling of non-same-value pages by zswap (enabled by default)
can be disabled by setting the ``non_same_filled_pages_enabled`` attribute
to 0, e.g. ``zswap.non_same_filled_pages_enabled=0``.
It can also be enabled and disabled at runtime using the sysfs
``non_same_filled_pages_enabled`` attribute, e.g.::
echo 1 > /sys/module/zswap/parameters/non_same_filled_pages_enabled
Disabling both ``zswap.same_filled_pages_enabled`` and
``zswap.non_same_filled_pages_enabled`` effectively disables accepting any new
pages by zswap.
To prevent zswap from shrinking pool when zswap is full and there's a high To prevent zswap from shrinking pool when zswap is full and there's a high
pressure on swap (this will result in flipping pages in and out zswap pool pressure on swap (this will result in flipping pages in and out zswap pool
without any real benefit but with a performance drop for the system), a without any real benefit but with a performance drop for the system), a

View File

@ -43,6 +43,7 @@ Currently, these files are in /proc/sys/vm:
- legacy_va_layout - legacy_va_layout
- lowmem_reserve_ratio - lowmem_reserve_ratio
- max_map_count - max_map_count
- mem_profiling (only if CONFIG_MEM_ALLOC_PROFILING=y)
- memory_failure_early_kill - memory_failure_early_kill
- memory_failure_recovery - memory_failure_recovery
- min_free_kbytes - min_free_kbytes
@ -425,6 +426,21 @@ e.g., up to one or two maps per allocation.
The default value is 65530. The default value is 65530.
mem_profiling
==============
Enable memory profiling (when CONFIG_MEM_ALLOC_PROFILING=y)
1: Enable memory profiling.
0: Disable memory profiling.
Enabling memory profiling introduces a small performance overhead for all
memory allocations.
The default value depends on CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT.
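A minimal runtime sketch (assuming CONFIG_MEM_ALLOC_PROFILING=y and a kernel not booted with ``sysctl.vm.mem_profiling=never``)::

    # Turn profiling on, then list the heaviest allocation sites seen so far.
    echo 1 > /proc/sys/vm/mem_profiling
    sort -g /proc/allocinfo | tail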
memory_failure_early_kill: memory_failure_early_kill:
========================== ==========================

View File

@ -471,7 +471,6 @@ Use the following commands to enable zswap::
# echo deflate-iaa > /sys/module/zswap/parameters/compressor # echo deflate-iaa > /sys/module/zswap/parameters/compressor
# echo zsmalloc > /sys/module/zswap/parameters/zpool # echo zsmalloc > /sys/module/zswap/parameters/zpool
# echo 1 > /sys/module/zswap/parameters/enabled # echo 1 > /sys/module/zswap/parameters/enabled
# echo 0 > /sys/module/zswap/parameters/same_filled_pages_enabled
# echo 100 > /proc/sys/vm/swappiness # echo 100 > /proc/sys/vm/swappiness
# echo never > /sys/kernel/mm/transparent_hugepage/enabled # echo never > /sys/kernel/mm/transparent_hugepage/enabled
# echo 1 > /proc/sys/vm/overcommit_memory # echo 1 > /proc/sys/vm/overcommit_memory
@ -621,7 +620,6 @@ the 'fixed' compression mode::
echo deflate-iaa > /sys/module/zswap/parameters/compressor echo deflate-iaa > /sys/module/zswap/parameters/compressor
echo zsmalloc > /sys/module/zswap/parameters/zpool echo zsmalloc > /sys/module/zswap/parameters/zpool
echo 1 > /sys/module/zswap/parameters/enabled echo 1 > /sys/module/zswap/parameters/enabled
echo 0 > /sys/module/zswap/parameters/same_filled_pages_enabled
echo 100 > /proc/sys/vm/swappiness echo 100 > /proc/sys/vm/swappiness
echo never > /sys/kernel/mm/transparent_hugepage/enabled echo never > /sys/kernel/mm/transparent_hugepage/enabled

View File

@ -56,9 +56,6 @@ Other Functions
.. kernel-doc:: fs/namei.c .. kernel-doc:: fs/namei.c
:export: :export:
.. kernel-doc:: fs/buffer.c
:export:
.. kernel-doc:: block/bio.c .. kernel-doc:: block/bio.c
:export: :export:

View File

@ -0,0 +1,12 @@
Buffer Heads
============
Linux uses buffer heads to maintain state about individual filesystem blocks.
Buffer heads are deprecated and new filesystems should use iomap instead.
Functions
---------
.. kernel-doc:: include/linux/buffer_head.h
.. kernel-doc:: fs/buffer.c
:export:

View File

@ -50,6 +50,7 @@ filesystem implementations.
.. toctree:: .. toctree::
:maxdepth: 2 :maxdepth: 2
buffer
journalling journalling
fscrypt fscrypt
fsverity fsverity

View File

@ -688,6 +688,7 @@ files are there, and which are missing.
============ =============================================================== ============ ===============================================================
File Content File Content
============ =============================================================== ============ ===============================================================
allocinfo Memory allocations profiling information
apm Advanced power management info apm Advanced power management info
bootconfig Kernel command line obtained from boot config, bootconfig Kernel command line obtained from boot config,
and, if there were kernel parameters from the and, if there were kernel parameters from the
@ -953,6 +954,34 @@ also be allocatable although a lot of filesystem metadata may have to be
reclaimed to achieve this. reclaimed to achieve this.
allocinfo
~~~~~~~~~
Provides information about memory allocations at all locations in the code
base. Each allocation in the code is identified by its source file, line
number, module (if originates from a loadable module) and the function calling
the allocation. The number of bytes allocated and number of calls at each
location are reported.
Example output.
::
> sort -rn /proc/allocinfo
127664128 31168 mm/page_ext.c:270 func:alloc_page_ext
56373248 4737 mm/slub.c:2259 func:alloc_slab_page
14880768 3633 mm/readahead.c:247 func:page_cache_ra_unbounded
14417920 3520 mm/mm_init.c:2530 func:alloc_large_system_hash
13377536 234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs
11718656 2861 mm/filemap.c:1919 func:__filemap_get_folio
9192960 2800 kernel/fork.c:307 func:alloc_thread_stack_node
4206592 4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable
4136960 1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
3940352 962 mm/memory.c:4214 func:alloc_anon_folio
2894464 22613 fs/kernfs/dir.c:615 func:__kernfs_new_node
...
meminfo meminfo
~~~~~~~ ~~~~~~~

View File

@ -0,0 +1,100 @@
.. SPDX-License-Identifier: GPL-2.0
===========================
MEMORY ALLOCATION PROFILING
===========================
Low overhead (suitable for production) accounting of all memory allocations,
tracked by file and line number.
Usage:
kconfig options:
- CONFIG_MEM_ALLOC_PROFILING
- CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
- CONFIG_MEM_ALLOC_PROFILING_DEBUG
adds warnings for allocations that weren't accounted because of a
missing annotation
Boot parameter:
sysctl.vm.mem_profiling=0|1|never
When set to "never", memory allocation profiling overhead is minimized and it
cannot be enabled at runtime (sysctl becomes read-only).
When CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y, default value is "1".
When CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n, default value is "never".
sysctl:
/proc/sys/vm/mem_profiling
Runtime info:
/proc/allocinfo
Example output::
root@moria-kvm:~# sort -g /proc/allocinfo|tail|numfmt --to=iec
2.8M 22648 fs/kernfs/dir.c:615 func:__kernfs_new_node
3.8M 953 mm/memory.c:4214 func:alloc_anon_folio
4.0M 1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
4.1M 4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable
6.0M 1532 mm/filemap.c:1919 func:__filemap_get_folio
8.8M 2785 kernel/fork.c:307 func:alloc_thread_stack_node
13M 234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs
14M 3520 mm/mm_init.c:2530 func:alloc_large_system_hash
15M 3656 mm/readahead.c:247 func:page_cache_ra_unbounded
55M 4887 mm/slub.c:2259 func:alloc_slab_page
122M 31168 mm/page_ext.c:270 func:alloc_page_ext
===================
Theory of operation
===================
Memory allocation profiling builds off of code tagging, which is a library for
declaring static structs (that typically describe a file and line number in
some way, hence code tagging) and then finding and operating on them at runtime,
- i.e. iterating over them to print them in debugfs/procfs.
To add accounting for an allocation call, we replace it with a macro
invocation, alloc_hooks(), that
- declares a code tag
- stashes a pointer to it in task_struct
- calls the real allocation function
- and finally, restores the task_struct alloc tag pointer to its previous value.
This allows for alloc_hooks() calls to be nested, with the most recent one
taking effect. This is important for allocations internal to the mm/ code that
do not properly belong to the outer allocation context and should be counted
separately: for example, slab object extension vectors, or when the slab
allocates pages from the page allocator.
Thus, proper usage requires determining which function in an allocation call
stack should be tagged. There are many helper functions that essentially wrap
e.g. kmalloc() and do a little more work, then are called in multiple places;
we'll generally want the accounting to happen in the callers of these helpers,
not in the helpers themselves.
To fix up a given helper, for example foo(), do the following:
- switch its allocation call to the _noprof() version, e.g. kmalloc_noprof()
- rename it to foo_noprof()
- define a macro version of foo() like so:
#define foo(...) alloc_hooks(foo_noprof(__VA_ARGS__))
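As a concrete sketch of those three steps (``my_buf_alloc()`` is a hypothetical helper invented for illustration, not something added by this series)::

    #include <linux/alloc_tag.h>
    #include <linux/slab.h>

    /* Steps 1 and 2: call the _noprof() allocator and rename the helper. */
    static inline void *my_buf_alloc_noprof(size_t len, gfp_t gfp)
    {
            return kmalloc_noprof(len, gfp);   /* was: kmalloc(len, gfp) */
    }

    /*
     * Step 3: the macro wrapper.  alloc_hooks() declares a code tag at each
     * call site, so /proc/allocinfo attributes the memory to the callers of
     * my_buf_alloc() rather than to this helper.
     */
    #define my_buf_alloc(...)   alloc_hooks(my_buf_alloc_noprof(__VA_ARGS__))

Existing callers keep calling my_buf_alloc() unchanged; only the accounting moves to them.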
It's also possible to stash a pointer to an alloc tag in your own data structures.
Do this when you're implementing a generic data structure that does allocations
"on behalf of" some other code - for example, the rhashtable code. This way,
instead of seeing a large line in /proc/allocinfo for rhashtable.c, we can
break it out by rhashtable type.
To do so:
- Hook your data structure's init function, like any other allocation function.
- Within your init function, use the convenience macro alloc_tag_record() to
record alloc tag in your data structure.
- Then, use the following form for your allocations:
alloc_hooks_tag(ht->your_saved_tag, kmalloc_noprof(...))

View File

@ -140,7 +140,8 @@ PMD Page Table Helpers
+---------------------------+--------------------------------------------------+ +---------------------------+--------------------------------------------------+
| pmd_swp_clear_soft_dirty | Clears a soft dirty swapped PMD | | pmd_swp_clear_soft_dirty | Clears a soft dirty swapped PMD |
+---------------------------+--------------------------------------------------+ +---------------------------+--------------------------------------------------+
| pmd_mkinvalid | Invalidates a mapped PMD [1] | | pmd_mkinvalid | Invalidates a present PMD; do not call for |
| | non-present PMD [1] |
+---------------------------+--------------------------------------------------+ +---------------------------+--------------------------------------------------+
| pmd_set_huge | Creates a PMD huge mapping | | pmd_set_huge | Creates a PMD huge mapping |
+---------------------------+--------------------------------------------------+ +---------------------------+--------------------------------------------------+
@ -196,7 +197,8 @@ PUD Page Table Helpers
+---------------------------+--------------------------------------------------+ +---------------------------+--------------------------------------------------+
| pud_mkdevmap | Creates a ZONE_DEVICE mapped PUD | | pud_mkdevmap | Creates a ZONE_DEVICE mapped PUD |
+---------------------------+--------------------------------------------------+ +---------------------------+--------------------------------------------------+
| pud_mkinvalid | Invalidates a mapped PUD [1] | | pud_mkinvalid | Invalidates a present PUD; do not call for |
| | non-present PUD [1] |
+---------------------------+--------------------------------------------------+ +---------------------------+--------------------------------------------------+
| pud_set_huge | Creates a PUD huge mapping | | pud_set_huge | Creates a PUD huge mapping |
+---------------------------+--------------------------------------------------+ +---------------------------+--------------------------------------------------+

View File

@ -461,24 +461,32 @@ number of filters for each scheme. Each filter specifies the type of target
 memory, and whether it should exclude the memory of the type (filter-out), or
 all except the memory of the type (filter-in).

-Currently, anonymous page, memory cgroup, address range, and DAMON monitoring
-target type filters are supported by the feature. Some filter target types
-require additional arguments. The memory cgroup filter type asks users to
-specify the file path of the memory cgroup for the filter. The address range
-type asks the start and end addresses of the range. The DAMON monitoring
-target type asks the index of the target from the context's monitoring targets
-list. Hence, users can apply specific schemes to only anonymous pages,
-non-anonymous pages, pages of specific cgroups, all pages excluding those of
-specific cgroups, pages in specific address range, pages in specific DAMON
-monitoring targets, and any combination of those.
-
-To handle filters efficiently, the address range and DAMON monitoring target
-type filters are handled by the core layer, while others are handled by
-operations set. If a memory region is filtered by a core layer-handled filter,
-it is not counted as the scheme has tried to the region. In contrast, if a
-memory regions is filtered by an operations set layer-handled filter, it is
-counted as the scheme has tried. The difference in accounting leads to changes
-in the statistics.
+For efficient handling of filters, some types of filters are handled by the
+core layer, while others are handled by operations set. In the latter case,
+hence, support of the filter types depends on the DAMON operations set. In
+case of the core layer-handled filters, the memory regions that excluded by the
+filter are not counted as the scheme has tried to the region. In contrast, if
+a memory regions is filtered by an operations set layer-handled filter, it is
+counted as the scheme has tried. This difference affects the statistics.
+
+Below types of filters are currently supported.
+
+- anonymous page
+    - Applied to pages that containing data that not stored in files.
+    - Handled by operations set layer. Supported by only ``paddr`` set.
+- memory cgroup
+    - Applied to pages that belonging to a given cgroup.
+    - Handled by operations set layer. Supported by only ``paddr`` set.
+- young page
+    - Applied to pages that are accessed after the last access check from the
+      scheme.
+    - Handled by operations set layer. Supported by only ``paddr`` set.
+- address range
+    - Applied to pages that belonging to a given address range.
+    - Handled by the core logic.
+- DAMON monitoring target
+    - Applied to pages that belonging to a given DAMON monitoring target.
+    - Handled by the core logic.

 Application Programming Interface

View File

@ -20,9 +20,10 @@ management subsystem maintainer. After more sufficient tests, the patches will
 be queued in mm-stable [3]_ , and finally pull-requested to the mainline by the
 memory management subsystem maintainer.

-Note again the patches for review should be made against the mm-unstable
-tree [1]_ whenever possible. damon/next is only for preview of others' works
-in progress.
+Note again the patches for mm-unstable tree [1]_ are queued by the memory
+management subsystem maintainer. If the patches requires some patches in
+damon/next tree [2]_ which not yet merged in mm-unstable, please make sure the
+requirement is clearly specified.

 Submit checklist addendum
 -------------------------

@ -48,9 +49,9 @@ Review cadence
 --------------

 The DAMON maintainer does the work on the usual work hour (09:00 to 17:00,
-Mon-Fri) in PST. The response to patches will occasionally be slow. Do not
-hesitate to send a ping if you have not heard back within a week of sending a
-patch.
+Mon-Fri) in PT (Pacific Time). The response to patches will occasionally be
+slow. Do not hesitate to send a ping if you have not heard back within a week
+of sending a patch.

 .. [1] https://git.kernel.org/akpm/mm/h/mm-unstable

View File

@ -26,6 +26,7 @@ see the :doc:`admin guide <../admin-guide/mm/index>`.
page_cache page_cache
shmfs shmfs
oom oom
allocation-profiling
Legacy Documentation Legacy Documentation
==================== ====================

View File

@ -14,7 +14,7 @@ Page table check performs extra verifications at the time when new pages become
 accessible from the userspace by getting their page table entries (PTEs PMDs
 etc.) added into the table.

-In case of detected corruption, the kernel is crashed. There is a small
+In case of most detected corruption, the kernel is crashed. There is a small
 performance and memory overhead associated with the page table check. Therefore,
 it is disabled by default, but can be optionally enabled on systems where the
 extra hardening outweighs the performance costs. Also, because page table check

@ -22,6 +22,13 @@ is synchronous, it can help with debugging double map memory corruption issues,
 by crashing kernel at the time wrong mapping occurs instead of later which is
 often the case with memory corruptions bugs.

+It can also be used to do page table entry checks over various flags, dump
+warnings when illegal combinations of entry flags are detected. Currently,
+userfaultfd is the only user of such to sanity check wr-protect bit against
+any writable flags. Illegal flag combinations will not directly cause data
+corruption in this case immediately, but that will cause read-only data to
+be writable, leading to corrupt when the page content is later modified.
+
 Double mapping detection logic
 ==============================

View File

@ -116,14 +116,14 @@ pages:
   succeeds on tail pages.

 - map/unmap of a PMD entry for the whole THP increment/decrement
-  folio->_entire_mapcount and also increment/decrement
-  folio->_nr_pages_mapped by ENTIRELY_MAPPED when _entire_mapcount
-  goes from -1 to 0 or 0 to -1.
+  folio->_entire_mapcount, increment/decrement folio->_large_mapcount
+  and also increment/decrement folio->_nr_pages_mapped by ENTIRELY_MAPPED
+  when _entire_mapcount goes from -1 to 0 or 0 to -1.

 - map/unmap of individual pages with PTE entry increment/decrement
-  page->_mapcount and also increment/decrement folio->_nr_pages_mapped
-  when page->_mapcount goes from -1 to 0 or 0 to -1 as this counts
-  the number of pages mapped by PTE.
+  page->_mapcount, increment/decrement folio->_large_mapcount and also
+  increment/decrement folio->_nr_pages_mapped when page->_mapcount goes
+  from -1 to 0 or 0 to -1 as this counts the number of pages mapped by PTE.

 split_huge_page internally has to distribute the refcounts in the head
 page to the tail pages before clearing all PG_head/tail bits from the page

View File

@ -180,27 +180,7 @@ this correctly. There is only **one** head ``struct page``, the tail
``struct page`` with ``PG_head`` are fake head ``struct page``. We need an ``struct page`` with ``PG_head`` are fake head ``struct page``. We need an
approach to distinguish between those two different types of ``struct page`` so approach to distinguish between those two different types of ``struct page`` so
that ``compound_head()`` can return the real head ``struct page`` when the that ``compound_head()`` can return the real head ``struct page`` when the
parameter is the tail ``struct page`` but with ``PG_head``. The following code parameter is the tail ``struct page`` but with ``PG_head``.
snippet describes how to distinguish between real and fake head ``struct page``.
.. code-block:: c
if (test_bit(PG_head, &page->flags)) {
unsigned long head = READ_ONCE(page[1].compound_head);
if (head & 1) {
if (head == (unsigned long)page + 1)
/* head struct page */
else
/* tail struct page */
} else {
/* head struct page */
}
}
We can safely access the field of the **page[1]** with ``PG_head`` because the
page is a compound page composed with at least two contiguous pages.
The implementation refers to ``page_fixed_fake_head()``.
Device DAX Device DAX
========== ==========

View File

@ -260,7 +260,7 @@ HyperSparc cpu就是这样一个具有这种属性的cpu。
如果D-cache别名不是一个问题这个程序可以简单地定义为该架构上 如果D-cache别名不是一个问题这个程序可以简单地定义为该架构上
的nop。 的nop。
page->flags (PG_arch_1)中有一个位是“架构私有”。内核保证, folio->flags (PG_arch_1)中有一个位是“架构私有”。内核保证,
对于分页缓存的页面,当这样的页面第一次进入分页缓存时,它将清除 对于分页缓存的页面,当这样的页面第一次进入分页缓存时,它将清除
这个位。 这个位。

View File

@ -5360,6 +5360,14 @@ S: Supported
F: Documentation/process/code-of-conduct-interpretation.rst F: Documentation/process/code-of-conduct-interpretation.rst
F: Documentation/process/code-of-conduct.rst F: Documentation/process/code-of-conduct.rst
CODE TAGGING
M: Suren Baghdasaryan <surenb@google.com>
M: Kent Overstreet <kent.overstreet@linux.dev>
S: Maintained
F: include/asm-generic/codetag.lds.h
F: include/linux/codetag.h
F: lib/codetag.c
COMEDI DRIVERS COMEDI DRIVERS
M: Ian Abbott <abbotti@mev.co.uk> M: Ian Abbott <abbotti@mev.co.uk>
M: H Hartley Sweeten <hsweeten@visionengravers.com> M: H Hartley Sweeten <hsweeten@visionengravers.com>
@ -14335,6 +14343,16 @@ F: mm/memblock.c
F: mm/mm_init.c F: mm/mm_init.c
F: tools/testing/memblock/ F: tools/testing/memblock/
MEMORY ALLOCATION PROFILING
M: Suren Baghdasaryan <surenb@google.com>
M: Kent Overstreet <kent.overstreet@linux.dev>
L: linux-mm@kvack.org
S: Maintained
F: Documentation/mm/allocation-profiling.rst
F: include/linux/alloc_tag.h
F: include/linux/pgalloc_tag.h
F: lib/alloc_tag.c
MEMORY CONTROLLER DRIVERS MEMORY CONTROLLER DRIVERS
M: Krzysztof Kozlowski <krzk@kernel.org> M: Krzysztof Kozlowski <krzk@kernel.org>
L: linux-kernel@vger.kernel.org L: linux-kernel@vger.kernel.org

View File

@ -1218,14 +1218,11 @@ static unsigned long
arch_get_unmapped_area_1(unsigned long addr, unsigned long len, arch_get_unmapped_area_1(unsigned long addr, unsigned long len,
unsigned long limit) unsigned long limit)
{ {
struct vm_unmapped_area_info info; struct vm_unmapped_area_info info = {};
info.flags = 0;
info.length = len; info.length = len;
info.low_limit = addr; info.low_limit = addr;
info.high_limit = limit; info.high_limit = limit;
info.align_mask = 0;
info.align_offset = 0;
return vm_unmapped_area(&info); return vm_unmapped_area(&info);
} }

View File

@ -929,7 +929,7 @@ const struct dma_map_ops alpha_pci_ops = {
.dma_supported = alpha_pci_supported, .dma_supported = alpha_pci_supported,
.mmap = dma_common_mmap, .mmap = dma_common_mmap,
.get_sgtable = dma_common_get_sgtable, .get_sgtable = dma_common_get_sgtable,
.alloc_pages = dma_common_alloc_pages, .alloc_pages_op = dma_common_alloc_pages,
.free_pages = dma_common_free_pages, .free_pages = dma_common_free_pages,
}; };
EXPORT_SYMBOL(alpha_pci_ops); EXPORT_SYMBOL(alpha_pci_ops);

View File

@ -15,6 +15,7 @@
#include <net/checksum.h> #include <net/checksum.h>
#include <asm/byteorder.h> #include <asm/byteorder.h>
#include <asm/checksum.h>
static inline unsigned short from64to16(unsigned long x) static inline unsigned short from64to16(unsigned long x)
{ {

View File

@ -8,6 +8,7 @@
#include <linux/compiler.h> #include <linux/compiler.h>
#include <linux/export.h> #include <linux/export.h>
#include <linux/preempt.h> #include <linux/preempt.h>
#include <asm/fpu.h>
#include <asm/thread_info.h> #include <asm/thread_info.h>
#include <asm/fpu.h> #include <asm/fpu.h>

View File

@ -9,6 +9,8 @@
#ifndef _ASM_ARC_MMU_ARCV2_H #ifndef _ASM_ARC_MMU_ARCV2_H
#define _ASM_ARC_MMU_ARCV2_H #define _ASM_ARC_MMU_ARCV2_H
#include <soc/arc/aux.h>
/* /*
* TLB Management regs * TLB Management regs
*/ */

View File

@ -27,7 +27,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
{ {
struct mm_struct *mm = current->mm; struct mm_struct *mm = current->mm;
struct vm_area_struct *vma; struct vm_area_struct *vma;
struct vm_unmapped_area_info info; struct vm_unmapped_area_info info = {};
/* /*
* We enforce the MAP_FIXED case. * We enforce the MAP_FIXED case.
@ -51,11 +51,9 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
return addr; return addr;
} }
info.flags = 0;
info.length = len; info.length = len;
info.low_limit = mm->mmap_base; info.low_limit = mm->mmap_base;
info.high_limit = TASK_SIZE; info.high_limit = TASK_SIZE;
info.align_mask = 0;
info.align_offset = pgoff << PAGE_SHIFT; info.align_offset = pgoff << PAGE_SHIFT;
return vm_unmapped_area(&info); return vm_unmapped_area(&info);
} }

View File

@ -100,7 +100,7 @@ config ARM
select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) && MMU select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) && MMU
select HAVE_EXIT_THREAD select HAVE_EXIT_THREAD
select HAVE_FAST_GUP if ARM_LPAE select HAVE_GUP_FAST if ARM_LPAE
select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
select HAVE_FUNCTION_ERROR_INJECTION select HAVE_FUNCTION_ERROR_INJECTION
select HAVE_FUNCTION_GRAPH_TRACER select HAVE_FUNCTION_GRAPH_TRACER

View File

@ -15,10 +15,10 @@
#include <asm/hugetlb-3level.h> #include <asm/hugetlb-3level.h>
#include <asm-generic/hugetlb.h> #include <asm-generic/hugetlb.h>
static inline void arch_clear_hugepage_flags(struct page *page) static inline void arch_clear_hugetlb_flags(struct folio *folio)
{ {
clear_bit(PG_dcache_clean, &page->flags); clear_bit(PG_dcache_clean, &folio->flags);
} }
#define arch_clear_hugepage_flags arch_clear_hugepage_flags #define arch_clear_hugetlb_flags arch_clear_hugetlb_flags
#endif /* _ASM_ARM_HUGETLB_H */ #endif /* _ASM_ARM_HUGETLB_H */


@@ -213,8 +213,8 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
 #define pmd_pfn(pmd) (__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))
-#define pmd_leaf(pmd) (pmd_val(pmd) & 2)
-#define pmd_bad(pmd) (pmd_val(pmd) & 2)
+#define pmd_leaf(pmd) (pmd_val(pmd) & PMD_TYPE_SECT)
+#define pmd_bad(pmd) pmd_leaf(pmd)
 #define pmd_present(pmd) (pmd_val(pmd))
 #define copy_pmd(pmdpd,pmdps) \
@@ -241,7 +241,6 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
 * define empty stubs for use by pin_page_for_write.
 */
 #define pmd_hugewillfault(pmd) (0)
-#define pmd_thp_or_huge(pmd) (0)
 #endif /* __ASSEMBLY__ */


@@ -14,6 +14,7 @@
 * + Level 1/2 descriptor
 * - common
 */
+#define PUD_TABLE_BIT (_AT(pmdval_t, 1) << 1)
 #define PMD_TYPE_MASK (_AT(pmdval_t, 3) << 0)
 #define PMD_TYPE_FAULT (_AT(pmdval_t, 0) << 0)
 #define PMD_TYPE_TABLE (_AT(pmdval_t, 3) << 0)


@@ -112,7 +112,7 @@
 #ifndef __ASSEMBLY__
 #define pud_none(pud) (!pud_val(pud))
-#define pud_bad(pud) (!(pud_val(pud) & 2))
+#define pud_bad(pud) (!(pud_val(pud) & PUD_TABLE_BIT))
 #define pud_present(pud) (pud_val(pud))
 #define pmd_table(pmd) ((pmd_val(pmd) & PMD_TYPE_MASK) == \
 PMD_TYPE_TABLE)
@@ -137,7 +137,7 @@ static inline pmd_t *pud_pgtable(pud_t pud)
 return __va(pud_val(pud) & PHYS_MASK & (s32)PAGE_MASK);
 }
-#define pmd_bad(pmd) (!(pmd_val(pmd) & 2))
+#define pmd_bad(pmd) (!(pmd_val(pmd) & PMD_TABLE_BIT))
 #define copy_pmd(pmdpd,pmdps) \
 do { \
@@ -190,7 +190,6 @@ static inline pte_t pte_mkspecial(pte_t pte)
 #define pmd_dirty(pmd) (pmd_isset((pmd), L_PMD_SECT_DIRTY))
 #define pmd_hugewillfault(pmd) (!pmd_young(pmd) || !pmd_write(pmd))
-#define pmd_thp_or_huge(pmd) (pmd_huge(pmd) || pmd_trans_huge(pmd))
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #define pmd_trans_huge(pmd) (pmd_val(pmd) && !pmd_table(pmd))


@@ -32,6 +32,7 @@
 #include <linux/kallsyms.h>
 #include <linux/proc_fs.h>
 #include <linux/export.h>
+#include <linux/vmalloc.h>
 #include <asm/hardware/cache-l2x0.h>
 #include <asm/hardware/cache-uniphier.h>


@@ -26,6 +26,7 @@
 #include <linux/sched/debug.h>
 #include <linux/sched/task_stack.h>
 #include <linux/irq.h>
+#include <linux/vmalloc.h>
 #include <linux/atomic.h>
 #include <asm/cacheflush.h>


@@ -56,10 +56,10 @@ pin_page_for_write(const void __user *_addr, pte_t **ptep, spinlock_t **ptlp)
 * to see that it's still huge and whether or not we will
 * need to fault on write.
 */
-if (unlikely(pmd_thp_or_huge(*pmd))) {
+if (unlikely(pmd_leaf(*pmd))) {
 ptl = &current->mm->page_table_lock;
 spin_lock(ptl);
-if (unlikely(!pmd_thp_or_huge(*pmd)
+if (unlikely(!pmd_leaf(*pmd)
 || pmd_hugewillfault(*pmd))) {
 spin_unlock(ptl);
 return 0;


@@ -21,7 +21,6 @@ KASAN_SANITIZE_physaddr.o := n
 obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o
 obj-$(CONFIG_ALIGNMENT_TRAP) += alignment.o
-obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
 obj-$(CONFIG_ARM_PV_FIXUP) += pv-fixup-asm.o
 obj-$(CONFIG_CPU_ABRT_NOMMU) += abort-nommu.o


@@ -226,9 +226,6 @@ void do_bad_area(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 }
 #ifdef CONFIG_MMU
-#define VM_FAULT_BADMAP ((__force vm_fault_t)0x010000)
-#define VM_FAULT_BADACCESS ((__force vm_fault_t)0x020000)
 static inline bool is_permission_fault(unsigned int fsr)
 {
 int fs = fsr_fs(fsr);
@@ -323,7 +320,10 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
 if (!(vma->vm_flags & vm_flags)) {
 vma_end_read(vma);
-goto lock_mmap;
+count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+fault = 0;
+code = SEGV_ACCERR;
+goto bad_area;
 }
 fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK, regs);
 if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
@@ -348,7 +348,8 @@ lock_mmap:
 retry:
 vma = lock_mm_and_find_vma(mm, addr, regs);
 if (unlikely(!vma)) {
-fault = VM_FAULT_BADMAP;
+fault = 0;
+code = SEGV_MAPERR;
 goto bad_area;
 }
@@ -356,9 +357,13 @@ retry:
 * ok, we have a good vm_area for this memory access, check the
 * permissions on the VMA allow for the fault which occurred.
 */
-if (!(vma->vm_flags & vm_flags))
-fault = VM_FAULT_BADACCESS;
-else
+if (!(vma->vm_flags & vm_flags)) {
+mmap_read_unlock(mm);
+fault = 0;
+code = SEGV_ACCERR;
+goto bad_area;
+}
 fault = handle_mm_fault(vma, addr & PAGE_MASK, flags, regs);
 /* If we need to retry but a fatal signal is pending, handle the
@@ -385,12 +390,11 @@ retry:
 mmap_read_unlock(mm);
 done:
-/*
- * Handle the "normal" case first - VM_FAULT_MAJOR
- */
-if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP | VM_FAULT_BADACCESS))))
+/* Handle the "normal" case first */
+if (likely(!(fault & VM_FAULT_ERROR)))
 return 0;
+code = SEGV_MAPERR;
 bad_area:
 /*
 * If we are in kernel mode at this point, we
@@ -422,8 +426,6 @@ bad_area:
 * isn't in our memory map..
 */
 sig = SIGSEGV;
-code = fault == VM_FAULT_BADACCESS ?
-SEGV_ACCERR : SEGV_MAPERR;
 }
 __do_user_fault(addr, fsr, sig, code, regs);


@@ -1,34 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * arch/arm/mm/hugetlbpage.c
- *
- * Copyright (C) 2012 ARM Ltd.
- *
- * Based on arch/x86/include/asm/hugetlb.h and Bill Carson's patches
- */
-#include <linux/init.h>
-#include <linux/fs.h>
-#include <linux/mm.h>
-#include <linux/hugetlb.h>
-#include <linux/pagemap.h>
-#include <linux/err.h>
-#include <linux/sysctl.h>
-#include <asm/mman.h>
-#include <asm/tlb.h>
-#include <asm/tlbflush.h>
-/*
- * On ARM, huge pages are backed by pmd's rather than pte's, so we do a lot
- * of type casting from pmd_t * to pte_t *.
- */
-int pud_huge(pud_t pud)
-{
-return 0;
-}
-int pmd_huge(pmd_t pmd)
-{
-return pmd_val(pmd) && !(pmd_val(pmd) & PMD_TABLE_BIT);
-}


@@ -34,7 +34,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 struct vm_area_struct *vma;
 int do_align = 0;
 int aliasing = cache_is_vipt_aliasing();
-struct vm_unmapped_area_info info;
+struct vm_unmapped_area_info info = {};
 /*
 * We only need to do colour alignment if either the I or D
@@ -68,7 +68,6 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 return addr;
 }
-info.flags = 0;
 info.length = len;
 info.low_limit = mm->mmap_base;
 info.high_limit = TASK_SIZE;
@@ -87,7 +86,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 unsigned long addr = addr0;
 int do_align = 0;
 int aliasing = cache_is_vipt_aliasing();
-struct vm_unmapped_area_info info;
+struct vm_unmapped_area_info info = {};
 /*
 * We only need to do colour alignment if either the I or D


@@ -205,7 +205,7 @@ config ARM64
 select HAVE_SAMPLE_FTRACE_DIRECT
 select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
 select HAVE_EFFICIENT_UNALIGNED_ACCESS
-select HAVE_FAST_GUP
+select HAVE_GUP_FAST
 select HAVE_FTRACE_MCOUNT_RECORD
 select HAVE_FUNCTION_TRACER
 select HAVE_FUNCTION_ERROR_INJECTION


@@ -18,11 +18,11 @@
 extern bool arch_hugetlb_migration_supported(struct hstate *h);
 #endif
-static inline void arch_clear_hugepage_flags(struct page *page)
+static inline void arch_clear_hugetlb_flags(struct folio *folio)
 {
-clear_bit(PG_dcache_clean, &page->flags);
+clear_bit(PG_dcache_clean, &folio->flags);
 }
-#define arch_clear_hugepage_flags arch_clear_hugepage_flags
+#define arch_clear_hugetlb_flags arch_clear_hugetlb_flags
 pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags);
 #define arch_make_huge_pte arch_make_huge_pte


@@ -49,12 +49,6 @@
 __flush_tlb_range(vma, addr, end, PUD_SIZE, false, 1)
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
-static inline bool arch_thp_swp_supported(void)
-{
-return !system_supports_mte();
-}
-#define arch_thp_swp_supported arch_thp_swp_supported
 /*
 * Outside of a few very special situations (e.g. hibernation), we always
 * use broadcast TLB invalidation instructions, therefore a spurious page
@@ -571,8 +565,6 @@ static inline int pmd_trans_huge(pmd_t pmd)
 pte_pmd(pte_swp_clear_uffd_wp(pmd_pte(pmd)))
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
-#define pmd_thp_or_huge(pmd) (pmd_huge(pmd) || pmd_trans_huge(pmd))
 #define pmd_write(pmd) pte_write(pmd_pte(pmd))
 #define pmd_mkhuge(pmd) (__pmd(pmd_val(pmd) & ~PMD_TABLE_BIT))
@@ -763,7 +755,11 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
 #define pud_none(pud) (!pud_val(pud))
 #define pud_bad(pud) (!pud_table(pud))
 #define pud_present(pud) pte_present(pud_pte(pud))
+#ifndef __PAGETABLE_PMD_FOLDED
 #define pud_leaf(pud) (pud_present(pud) && !pud_table(pud))
+#else
+#define pud_leaf(pud) false
+#endif
 #define pud_valid(pud) pte_valid(pud_pte(pud))
 #define pud_user(pud) pte_user(pud_pte(pud))
 #define pud_user_exec(pud) pte_user_exec(pud_pte(pud))
@@ -1284,6 +1280,46 @@ static inline void __wrprotect_ptes(struct mm_struct *mm, unsigned long address,
 __ptep_set_wrprotect(mm, address, ptep);
 }
+static inline void __clear_young_dirty_pte(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep,
+pte_t pte, cydp_t flags)
+{
+pte_t old_pte;
+do {
+old_pte = pte;
+if (flags & CYDP_CLEAR_YOUNG)
+pte = pte_mkold(pte);
+if (flags & CYDP_CLEAR_DIRTY)
+pte = pte_mkclean(pte);
+pte_val(pte) = cmpxchg_relaxed(&pte_val(*ptep),
+pte_val(old_pte), pte_val(pte));
+} while (pte_val(pte) != pte_val(old_pte));
+}
+static inline void __clear_young_dirty_ptes(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep,
+unsigned int nr, cydp_t flags)
+{
+pte_t pte;
+for (;;) {
+pte = __ptep_get(ptep);
+if (flags == (CYDP_CLEAR_YOUNG | CYDP_CLEAR_DIRTY))
+__set_pte(ptep, pte_mkclean(pte_mkold(pte)));
+else
+__clear_young_dirty_pte(vma, addr, ptep, pte, flags);
+if (--nr == 0)
+break;
+ptep++;
+addr += PAGE_SIZE;
+}
+}
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #define __HAVE_ARCH_PMDP_SET_WRPROTECT
 static inline void pmdp_set_wrprotect(struct mm_struct *mm,
@@ -1338,12 +1374,7 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
 #ifdef CONFIG_ARM64_MTE
 #define __HAVE_ARCH_PREPARE_TO_SWAP
-static inline int arch_prepare_to_swap(struct page *page)
-{
-if (system_supports_mte())
-return mte_save_tags(page);
-return 0;
-}
+extern int arch_prepare_to_swap(struct folio *folio);
 #define __HAVE_ARCH_SWAP_INVALIDATE
 static inline void arch_swap_invalidate_page(int type, pgoff_t offset)
@@ -1359,11 +1390,7 @@ static inline void arch_swap_invalidate_area(int type)
 }
 #define __HAVE_ARCH_SWAP_RESTORE
-static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
-{
-if (system_supports_mte())
-mte_restore_tags(entry, &folio->page);
-}
+extern void arch_swap_restore(swp_entry_t entry, struct folio *folio);
 #endif /* CONFIG_ARM64_MTE */
@@ -1450,6 +1477,9 @@ extern void contpte_wrprotect_ptes(struct mm_struct *mm, unsigned long addr,
 extern int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
 unsigned long addr, pte_t *ptep,
 pte_t entry, int dirty);
+extern void contpte_clear_young_dirty_ptes(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep,
+unsigned int nr, cydp_t flags);
 static __always_inline void contpte_try_fold(struct mm_struct *mm,
 unsigned long addr, pte_t *ptep, pte_t pte)
@@ -1674,6 +1704,17 @@ static inline int ptep_set_access_flags(struct vm_area_struct *vma,
 return contpte_ptep_set_access_flags(vma, addr, ptep, entry, dirty);
 }
+#define clear_young_dirty_ptes clear_young_dirty_ptes
+static inline void clear_young_dirty_ptes(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep,
+unsigned int nr, cydp_t flags)
+{
+if (likely(nr == 1 && !pte_cont(__ptep_get(ptep))))
+__clear_young_dirty_ptes(vma, addr, ptep, nr, flags);
+else
+contpte_clear_young_dirty_ptes(vma, addr, ptep, nr, flags);
+}
 #else /* CONFIG_ARM64_CONTPTE */
 #define ptep_get __ptep_get
@@ -1693,6 +1734,7 @@ static inline int ptep_set_access_flags(struct vm_area_struct *vma,
 #define wrprotect_ptes __wrprotect_ptes
 #define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
 #define ptep_set_access_flags __ptep_set_access_flags
+#define clear_young_dirty_ptes __clear_young_dirty_ptes
 #endif /* CONFIG_ARM64_CONTPTE */


@@ -10,6 +10,7 @@
 #include <linux/efi.h>
 #include <linux/init.h>
 #include <linux/screen_info.h>
+#include <linux/vmalloc.h>
 #include <asm/efi.h>
 #include <asm/stacktrace.h>


@@ -361,6 +361,35 @@ void contpte_wrprotect_ptes(struct mm_struct *mm, unsigned long addr,
 }
 EXPORT_SYMBOL_GPL(contpte_wrprotect_ptes);
+void contpte_clear_young_dirty_ptes(struct vm_area_struct *vma,
+unsigned long addr, pte_t *ptep,
+unsigned int nr, cydp_t flags)
+{
+/*
+ * We can safely clear access/dirty without needing to unfold from
+ * the architectures perspective, even when contpte is set. If the
+ * range starts or ends midway through a contpte block, we can just
+ * expand to include the full contpte block. While this is not
+ * exactly what the core-mm asked for, it tracks access/dirty per
+ * folio, not per page. And since we only create a contpte block
+ * when it is covered by a single folio, we can get away with
+ * clearing access/dirty for the whole block.
+ */
+unsigned long start = addr;
+unsigned long end = start + nr;
+if (pte_cont(__ptep_get(ptep + nr - 1)))
+end = ALIGN(end, CONT_PTE_SIZE);
+if (pte_cont(__ptep_get(ptep))) {
+start = ALIGN_DOWN(start, CONT_PTE_SIZE);
+ptep = contpte_align_down(ptep);
+}
+__clear_young_dirty_ptes(vma, start, ptep, end - start, flags);
+}
+EXPORT_SYMBOL_GPL(contpte_clear_young_dirty_ptes);
 int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
 unsigned long addr, pte_t *ptep,
 pte_t entry, int dirty)


@@ -486,25 +486,6 @@ static void do_bad_area(unsigned long far, unsigned long esr,
 }
 }
-#define VM_FAULT_BADMAP ((__force vm_fault_t)0x010000)
-#define VM_FAULT_BADACCESS ((__force vm_fault_t)0x020000)
-static vm_fault_t __do_page_fault(struct mm_struct *mm,
-struct vm_area_struct *vma, unsigned long addr,
-unsigned int mm_flags, unsigned long vm_flags,
-struct pt_regs *regs)
-{
-/*
- * Ok, we have a good vm_area for this memory access, so we can handle
- * it.
- * Check that the permissions on the VMA allow for the fault which
- * occurred.
- */
-if (!(vma->vm_flags & vm_flags))
-return VM_FAULT_BADACCESS;
-return handle_mm_fault(vma, addr, mm_flags, regs);
-}
 static bool is_el0_instruction_abort(unsigned long esr)
 {
 return ESR_ELx_EC(esr) == ESR_ELx_EC_IABT_LOW;
@@ -529,6 +510,7 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 unsigned int mm_flags = FAULT_FLAG_DEFAULT;
 unsigned long addr = untagged_addr(far);
 struct vm_area_struct *vma;
+int si_code;
 if (kprobe_page_fault(regs, esr))
 return 0;
@@ -588,7 +570,10 @@
 if (!(vma->vm_flags & vm_flags)) {
 vma_end_read(vma);
-goto lock_mmap;
+fault = 0;
+si_code = SEGV_ACCERR;
+count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+goto bad_area;
 }
 fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
 if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
@@ -613,12 +598,19 @@ lock_mmap:
 retry:
 vma = lock_mm_and_find_vma(mm, addr, regs);
 if (unlikely(!vma)) {
-fault = VM_FAULT_BADMAP;
-goto done;
+fault = 0;
+si_code = SEGV_MAPERR;
+goto bad_area;
 }
-fault = __do_page_fault(mm, vma, addr, mm_flags, vm_flags, regs);
+if (!(vma->vm_flags & vm_flags)) {
+mmap_read_unlock(mm);
+fault = 0;
+si_code = SEGV_ACCERR;
+goto bad_area;
+}
+fault = handle_mm_fault(vma, addr, mm_flags, regs);
 /* Quick path to respond to signals */
 if (fault_signal_pending(fault, regs)) {
 if (!user_mode(regs))
@@ -637,13 +629,12 @@ retry:
 mmap_read_unlock(mm);
 done:
-/*
- * Handle the "normal" (no error) case first.
- */
-if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP |
-VM_FAULT_BADACCESS))))
+/* Handle the "normal" (no error) case first. */
+if (likely(!(fault & VM_FAULT_ERROR)))
 return 0;
+si_code = SEGV_MAPERR;
+bad_area:
 /*
 * If we are in kernel mode at this point, we have no context to
 * handle this fault with.
@@ -678,13 +669,8 @@ done:
 arm64_force_sig_mceerr(BUS_MCEERR_AR, far, lsb, inf->name);
 } else {
-/*
- * Something tried to access memory that isn't in our memory
- * map.
- */
-arm64_force_sig_fault(SIGSEGV,
-fault == VM_FAULT_BADACCESS ? SEGV_ACCERR : SEGV_MAPERR,
-far, inf->name);
+/* Something tried to access memory that out of memory map */
+arm64_force_sig_fault(SIGSEGV, si_code, far, inf->name);
 }
 return 0;


@@ -79,20 +79,6 @@ bool arch_hugetlb_migration_supported(struct hstate *h)
 }
 #endif
-int pmd_huge(pmd_t pmd)
-{
-return pmd_val(pmd) && !(pmd_val(pmd) & PMD_TABLE_BIT);
-}
-int pud_huge(pud_t pud)
-{
-#ifndef __PAGETABLE_PMD_FOLDED
-return pud_val(pud) && !(pud_val(pud) & PUD_TABLE_BIT);
-#else
-return 0;
-#endif
-}
 static int find_num_contig(struct mm_struct *mm, unsigned long addr,
 pte_t *ptep, size_t *pgsize)
 {
@@ -328,7 +314,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
 if (sz != PUD_SIZE && pud_none(pud))
 return NULL;
 /* hugepage or swap? */
-if (pud_huge(pud) || !pud_present(pud))
+if (pud_leaf(pud) || !pud_present(pud))
 return (pte_t *)pudp;
 /* table; check the next level */
@@ -340,7 +326,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
 if (!(sz == PMD_SIZE || sz == CONT_PMD_SIZE) &&
 pmd_none(pmd))
 return NULL;
-if (pmd_huge(pmd) || !pmd_present(pmd))
+if (pmd_leaf(pmd) || !pmd_present(pmd))
 return (pte_t *)pmdp;
 if (sz == CONT_PTE_SIZE)


@@ -68,6 +68,13 @@ void mte_invalidate_tags(int type, pgoff_t offset)
 mte_free_tag_storage(tags);
 }
+static inline void __mte_invalidate_tags(struct page *page)
+{
+swp_entry_t entry = page_swap_entry(page);
+mte_invalidate_tags(swp_type(entry), swp_offset(entry));
+}
 void mte_invalidate_tags_area(int type)
 {
 swp_entry_t entry = swp_entry(type, 0);
@@ -83,3 +90,41 @@ void mte_invalidate_tags_area(int type)
 }
 xa_unlock(&mte_pages);
 }
+int arch_prepare_to_swap(struct folio *folio)
+{
+long i, nr;
+int err;
+if (!system_supports_mte())
+return 0;
+nr = folio_nr_pages(folio);
+for (i = 0; i < nr; i++) {
+err = mte_save_tags(folio_page(folio, i));
+if (err)
+goto out;
+}
+return 0;
+out:
+while (i--)
+__mte_invalidate_tags(folio_page(folio, i));
+return err;
+}
+void arch_swap_restore(swp_entry_t entry, struct folio *folio)
+{
+long i, nr;
+if (!system_supports_mte())
+return;
+nr = folio_nr_pages(folio);
+for (i = 0; i < nr; i++) {
+mte_restore_tags(entry, folio_page(folio, i));
+entry.val++;
+}
+}


@@ -28,7 +28,12 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 struct mm_struct *mm = current->mm;
 struct vm_area_struct *vma;
 int do_align = 0;
-struct vm_unmapped_area_info info;
+struct vm_unmapped_area_info info = {
+.length = len,
+.low_limit = mm->mmap_base,
+.high_limit = TASK_SIZE,
+.align_offset = pgoff << PAGE_SHIFT
+};
 /*
 * We only need to do colour alignment if either the I or D
@@ -61,11 +66,6 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 return addr;
 }
-info.flags = 0;
-info.length = len;
-info.low_limit = mm->mmap_base;
-info.high_limit = TASK_SIZE;
 info.align_mask = do_align ? (PAGE_MASK & (SHMLBA - 1)) : 0;
-info.align_offset = pgoff << PAGE_SHIFT;
 return vm_unmapped_area(&info);
 }


@@ -119,7 +119,7 @@ config LOONGARCH
 select HAVE_EBPF_JIT
 select HAVE_EFFICIENT_UNALIGNED_ACCESS if !ARCH_STRICT_ALIGN
 select HAVE_EXIT_THREAD
-select HAVE_FAST_GUP
+select HAVE_GUP_FAST
 select HAVE_FTRACE_MCOUNT_RECORD
 select HAVE_FUNCTION_ARG_ACCESS_API
 select HAVE_FUNCTION_ERROR_INJECTION


@@ -10,6 +10,7 @@
 #define _ASM_LOONGARCH_KFENCE_H
 #include <linux/kfence.h>
+#include <linux/vmalloc.h>
 #include <asm/pgtable.h>
 #include <asm/tlb.h>


@@ -50,21 +50,11 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr,
 return (pte_t *) pmd;
 }
-int pmd_huge(pmd_t pmd)
-{
-return (pmd_val(pmd) & _PAGE_HUGE) != 0;
-}
-int pud_huge(pud_t pud)
-{
-return (pud_val(pud) & _PAGE_HUGE) != 0;
-}
 uint64_t pmd_to_entrylo(unsigned long pmd_val)
 {
 uint64_t val;
 /* PMD as PTE. Must be huge page */
-if (!pmd_huge(__pmd(pmd_val)))
+if (!pmd_leaf(__pmd(pmd_val)))
 panic("%s", __func__);
 val = pmd_val ^ _PAGE_HUGE;


@@ -25,7 +25,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
 struct vm_area_struct *vma;
 unsigned long addr = addr0;
 int do_color_align;
-struct vm_unmapped_area_info info;
+struct vm_unmapped_area_info info = {};
 if (unlikely(len > TASK_SIZE))
 return -ENOMEM;
@@ -83,7 +83,6 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
 */
 }
-info.flags = 0;
 info.low_limit = mm->mmap_base;
 info.high_limit = TASK_SIZE;
 return vm_unmapped_area(&info);


@@ -68,7 +68,7 @@ config MIPS
 select HAVE_DYNAMIC_FTRACE
 select HAVE_EBPF_JIT if !CPU_MICROMIPS
 select HAVE_EXIT_THREAD
-select HAVE_FAST_GUP
+select HAVE_GUP_FAST
 select HAVE_FTRACE_MCOUNT_RECORD
 select HAVE_FUNCTION_GRAPH_TRACER
 select HAVE_FUNCTION_TRACER


@@ -129,7 +129,7 @@ static inline int pmd_none(pmd_t pmd)
 static inline int pmd_bad(pmd_t pmd)
 {
 #ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT
-/* pmd_huge(pmd) but inline */
+/* pmd_leaf(pmd) but inline */
 if (unlikely(pmd_val(pmd) & _PAGE_HUGE))
 return 0;
 #endif


@@ -245,7 +245,7 @@ static inline int pmd_none(pmd_t pmd)
 static inline int pmd_bad(pmd_t pmd)
 {
 #ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT
-/* pmd_huge(pmd) but inline */
+/* pmd_leaf(pmd) but inline */
 if (unlikely(pmd_val(pmd) & _PAGE_HUGE))
 return 0;
 #endif


@@ -617,7 +617,7 @@ const struct dma_map_ops jazz_dma_ops = {
 .sync_sg_for_device = jazz_dma_sync_sg_for_device,
 .mmap = dma_common_mmap,
 .get_sgtable = dma_common_get_sgtable,
-.alloc_pages = dma_common_alloc_pages,
+.alloc_pages_op = dma_common_alloc_pages,
 .free_pages = dma_common_free_pages,
 };
 EXPORT_SYMBOL(jazz_dma_ops);


@@ -57,13 +57,3 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr,
 }
 return (pte_t *) pmd;
 }
-int pmd_huge(pmd_t pmd)
-{
-return (pmd_val(pmd) & _PAGE_HUGE) != 0;
-}
-int pud_huge(pud_t pud)
-{
-return (pud_val(pud) & _PAGE_HUGE) != 0;
-}


@@ -34,7 +34,7 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
 struct vm_area_struct *vma;
 unsigned long addr = addr0;
 int do_color_align;
-struct vm_unmapped_area_info info;
+struct vm_unmapped_area_info info = {};
 if (unlikely(len > TASK_SIZE))
 return -ENOMEM;
@@ -92,7 +92,6 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
 */
 }
-info.flags = 0;
 info.low_limit = mm->mmap_base;
 info.high_limit = TASK_SIZE;
 return vm_unmapped_area(&info);


@@ -326,7 +326,7 @@ void __update_tlb(struct vm_area_struct * vma, unsigned long address, pte_t pte)
 idx = read_c0_index();
 #ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT
 /* this could be a huge page  */
-if (pmd_huge(*pmdp)) {
+if (pmd_leaf(*pmdp)) {
 unsigned long lo;
 write_c0_pagemask(PM_HUGE_MASK);
 ptep = (pte_t *)pmdp;


@@ -104,7 +104,9 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
 struct vm_area_struct *vma, *prev;
 unsigned long filp_pgoff;
 int do_color_align;
-struct vm_unmapped_area_info info;
+struct vm_unmapped_area_info info = {
+.length = len
+};
 if (unlikely(len > TASK_SIZE))
 return -ENOMEM;
@@ -139,7 +141,6 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
 return addr;
 }
-info.length = len;
 info.align_mask = do_color_align ? (PAGE_MASK & (SHM_COLOUR - 1)) : 0;
 info.align_offset = shared_align_offset(filp_pgoff, pgoff);
@@ -160,7 +161,6 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
 */
 }
-info.flags = 0;
 info.low_limit = mm->mmap_base;
 info.high_limit = mmap_upper_limit(NULL);
 return vm_unmapped_area(&info);


@@ -180,14 +180,3 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 }
 return changed;
 }
-int pmd_huge(pmd_t pmd)
-{
-return 0;
-}
-int pud_huge(pud_t pud)
-{
-return 0;
-}


@@ -237,7 +237,7 @@ config PPC
 select HAVE_DYNAMIC_FTRACE_WITH_REGS if ARCH_USING_PATCHABLE_FUNCTION_ENTRY || MPROFILE_KERNEL || PPC32
 select HAVE_EBPF_JIT
 select HAVE_EFFICIENT_UNALIGNED_ACCESS
-select HAVE_FAST_GUP
+select HAVE_GUP_FAST
 select HAVE_FTRACE_MCOUNT_RECORD
 select HAVE_FUNCTION_ARG_ACCESS_API
 select HAVE_FUNCTION_DESCRIPTORS if PPC64_ELF_ABI_V1


@@ -6,26 +6,6 @@
 */
 #ifndef __ASSEMBLY__
 #ifdef CONFIG_HUGETLB_PAGE
-static inline int pmd_huge(pmd_t pmd)
-{
-/*
- * leaf pte for huge page
- */
-if (radix_enabled())
-return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE));
-return 0;
-}
-static inline int pud_huge(pud_t pud)
-{
-/*
- * leaf pte for huge page
- */
-if (radix_enabled())
-return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
-return 0;
-}
 /*
 * With radix , we have hugepage ptes in the pud and pmd entries. We don't
 * need to setup hugepage directory for them. Our pte and page directory format


@@ -4,31 +4,6 @@
 #ifndef __ASSEMBLY__
 #ifdef CONFIG_HUGETLB_PAGE
-/*
- * We have PGD_INDEX_SIZ = 12 and PTE_INDEX_SIZE = 8, so that we can have
- * 16GB hugepage pte in PGD and 16MB hugepage pte at PMD;
- *
- * Defined in such a way that we can optimize away code block at build time
- * if CONFIG_HUGETLB_PAGE=n.
- *
- * returns true for pmd migration entries, THP, devmap, hugetlb
- * But compile time dependent on CONFIG_HUGETLB_PAGE
- */
-static inline int pmd_huge(pmd_t pmd)
-{
-/*
- * leaf pte for huge page
- */
-return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE));
-}
-static inline int pud_huge(pud_t pud)
-{
-/*
- * leaf pte for huge page
- */
-return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
-}
 /*
 * With 64k page size, we have hugepage ptes in the pgd and pmd entries. We don't


@@ -262,6 +262,18 @@ extern unsigned long __kernel_io_end;
 extern struct page *vmemmap;
 extern unsigned long pci_io_base;
+#define pmd_leaf pmd_leaf
+static inline bool pmd_leaf(pmd_t pmd)
+{
+return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE));
+}
+#define pud_leaf pud_leaf
+static inline bool pud_leaf(pud_t pud)
+{
+return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
+}
 #endif /* __ASSEMBLY__ */
 #include <asm/book3s/64/hash.h>
@@ -1426,20 +1438,5 @@ static inline bool is_pte_rw_upgrade(unsigned long old_val, unsigned long new_va
 return false;
 }
-/*
- * Like pmd_huge(), but works regardless of config options
- */
-#define pmd_leaf pmd_leaf
-static inline bool pmd_leaf(pmd_t pmd)
-{
-return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE));
-}
-#define pud_leaf pud_leaf
-static inline bool pud_leaf(pud_t pud)
-{
-return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
-}
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */


@@ -406,9 +406,5 @@ extern void *abatron_pteptrs[2];
 #include <asm/nohash/mmu.h>
 #endif
-#if defined(CONFIG_FA_DUMP) || defined(CONFIG_PRESERVE_FA_DUMP)
-#define __HAVE_ARCH_RESERVED_KERNEL_PAGES
-#endif
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_MMU_H_ */


@@ -351,16 +351,6 @@ static inline int hugepd_ok(hugepd_t hpd)
 #endif
 }
-static inline int pmd_huge(pmd_t pmd)
-{
-return 0;
-}
-static inline int pud_huge(pud_t pud)
-{
-return 0;
-}
 #define is_hugepd(hpd) (hugepd_ok(hpd))
 #endif


@@ -216,6 +216,6 @@ const struct dma_map_ops dma_iommu_ops = {
 .get_required_mask = dma_iommu_get_required_mask,
 .mmap = dma_common_mmap,
 .get_sgtable = dma_common_get_sgtable,
-.alloc_pages = dma_common_alloc_pages,
+.alloc_pages_op = dma_common_alloc_pages,
 .free_pages = dma_common_free_pages,
 };


@@ -1883,8 +1883,3 @@ static void __init fadump_reserve_crash_area(u64 base)
 memblock_reserve(mstart, msize);
 }
 }
-unsigned long __init arch_reserved_kernel_pages(void)
-{
-return memblock_reserved_size() / PAGE_SIZE;
-}


@@ -26,6 +26,7 @@
 #include <linux/iommu.h>
 #include <linux/sched.h>
 #include <linux/debugfs.h>
+#include <linux/vmalloc.h>
 #include <asm/io.h>
 #include <asm/iommu.h>
 #include <asm/pci-bridge.h>


@@ -170,6 +170,7 @@ pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
 {
 unsigned long old_pmd;
+VM_WARN_ON_ONCE(!pmd_present(*pmdp));
 old_pmd = pmd_hugepage_update(vma->vm_mm, address, pmdp, _PAGE_PRESENT, _PAGE_INVALID);
 flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
 return __pmd(old_pmd);


@@ -282,12 +282,10 @@ static unsigned long slice_find_area_bottomup(struct mm_struct *mm,
 {
 int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
 unsigned long found, next_end;
-struct vm_unmapped_area_info info;
+struct vm_unmapped_area_info info = {
+.length = len,
+.align_mask = PAGE_MASK & ((1ul << pshift) - 1),
+};
-info.flags = 0;
-info.length = len;
-info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
-info.align_offset = 0;
 /*
 * Check till the allow max value for this mmap request
 */
@@ -326,13 +324,13 @@ static unsigned long slice_find_area_topdown(struct mm_struct *mm,
 {
 int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
 unsigned long found, prev;
-struct vm_unmapped_area_info info;
+struct vm_unmapped_area_info info = {
+.flags = VM_UNMAPPED_AREA_TOPDOWN,
+.length = len,
+.align_mask = PAGE_MASK & ((1ul << pshift) - 1),
+};
 unsigned long min_addr = max(PAGE_SIZE, mmap_min_addr);
-info.flags = VM_UNMAPPED_AREA_TOPDOWN;
-info.length = len;
-info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
-info.align_offset = 0;
 /*
 * If we are trying to allocate above DEFAULT_MAP_WINDOW
 * Add the different to the mmap_base.


@@ -71,23 +71,26 @@ static noinline int bad_area_nosemaphore(struct pt_regs *regs, unsigned long add
 return __bad_area_nosemaphore(regs, address, SEGV_MAPERR);
 }
-static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code)
+static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code,
+struct mm_struct *mm, struct vm_area_struct *vma)
 {
-struct mm_struct *mm = current->mm;
 /*
 * Something tried to access memory that isn't in our memory map..
 * Fix it, but check if it's kernel or user first..
 */
+if (mm)
 mmap_read_unlock(mm);
+else
+vma_end_read(vma);
 return __bad_area_nosemaphore(regs, address, si_code);
 }
 static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
+struct mm_struct *mm,
 struct vm_area_struct *vma)
 {
-struct mm_struct *mm = current->mm;
 int pkey;
 /*
@@ -109,7 +112,10 @@ static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
 */
 pkey = vma_pkey(vma);
+if (mm)
 mmap_read_unlock(mm);
+else
+vma_end_read(vma);
 /*
 * If we are in kernel mode, bail out with a SEGV, this will
@@ -124,9 +130,10 @@ static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
 return 0;
 }
-static noinline int bad_access(struct pt_regs *regs, unsigned long address)
+static noinline int bad_access(struct pt_regs *regs, unsigned long address,
+struct mm_struct *mm, struct vm_area_struct *vma)
 {
-return __bad_area(regs, address, SEGV_ACCERR);
+return __bad_area(regs, address, SEGV_ACCERR, mm, vma);
 }
 static int do_sigbus(struct pt_regs *regs, unsigned long address,
@@ -479,13 +486,13 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
 if (unlikely(access_pkey_error(is_write, is_exec,
 (error_code & DSISR_KEYFAULT), vma))) {
-vma_end_read(vma);
-goto lock_mmap;
+count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+return bad_access_pkey(regs, address, NULL, vma);
 }
 if (unlikely(access_error(is_write, is_exec, vma))) {
-vma_end_read(vma);
-goto lock_mmap;
+count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+return bad_access(regs, address, NULL, vma);
 }
 fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
@@ -521,10 +528,10 @@ retry:
 if (unlikely(access_pkey_error(is_write, is_exec,
 (error_code & DSISR_KEYFAULT), vma)))
-return bad_access_pkey(regs, address, vma);
+return bad_access_pkey(regs, address, mm, vma);
 if (unlikely(access_error(is_write, is_exec, vma)))
-return bad_access(regs, address);
+return bad_access(regs, address, mm, vma);
 /*
 * If for any reason at all we couldn't handle the fault,


@@ -17,6 +17,7 @@
 #include <linux/suspend.h>
 #include <linux/dma-direct.h>
 #include <linux/execmem.h>
+#include <linux/vmalloc.h>
 #include <asm/swiotlb.h>
 #include <asm/machdep.h>


@@ -102,7 +102,7 @@ struct page *p4d_page(p4d_t p4d)
 {
 if (p4d_leaf(p4d)) {
 if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
-VM_WARN_ON(!p4d_huge(p4d));
+VM_WARN_ON(!p4d_leaf(p4d));
 return pte_page(p4d_pte(p4d));
 }
 return virt_to_page(p4d_pgtable(p4d));
@@ -113,7 +113,7 @@ struct page *pud_page(pud_t pud)
 {
 if (pud_leaf(pud)) {
 if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
-VM_WARN_ON(!pud_huge(pud));
+VM_WARN_ON(!pud_leaf(pud));
 return pte_page(pud_pte(pud));
 }
 return virt_to_page(pud_pgtable(pud));
@@ -132,7 +132,7 @@ struct page *pmd_page(pmd_t pmd)
 * enabled so these checks can't be used.
 */
 if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
-VM_WARN_ON(!(pmd_leaf(pmd) || pmd_huge(pmd)));
+VM_WARN_ON(!pmd_leaf(pmd));
 return pte_page(pmd_pte(pmd));
 }
 return virt_to_page(pmd_page_vaddr(pmd));


@@ -695,7 +695,7 @@ static const struct dma_map_ops ps3_sb_dma_ops = {
 .unmap_page = ps3_unmap_page,
 .mmap = dma_common_mmap,
 .get_sgtable = dma_common_get_sgtable,
-.alloc_pages = dma_common_alloc_pages,
+.alloc_pages_op = dma_common_alloc_pages,
 .free_pages = dma_common_free_pages,
 };
@@ -709,7 +709,7 @@ static const struct dma_map_ops ps3_ioc0_dma_ops = {
 .unmap_page = ps3_unmap_page,
 .mmap = dma_common_mmap,
 .get_sgtable = dma_common_get_sgtable,
-.alloc_pages = dma_common_alloc_pages,
+.alloc_pages_op = dma_common_alloc_pages,
 .free_pages = dma_common_free_pages,
 };


@@ -611,7 +611,7 @@ static const struct dma_map_ops vio_dma_mapping_ops = {
 .get_required_mask = dma_iommu_get_required_mask,
 .mmap = dma_common_mmap,
 .get_sgtable = dma_common_get_sgtable,
-.alloc_pages = dma_common_alloc_pages,
+.alloc_pages_op = dma_common_alloc_pages,
 .free_pages = dma_common_free_pages,
 };


@@ -132,7 +132,7 @@ config RISCV
 select HAVE_FUNCTION_GRAPH_RETVAL if HAVE_FUNCTION_GRAPH_TRACER
 select HAVE_FUNCTION_TRACER if !XIP_KERNEL && !PREEMPTION
 select HAVE_EBPF_JIT if MMU
-select HAVE_FAST_GUP if MMU
+select HAVE_GUP_FAST if MMU
 select HAVE_FUNCTION_ARG_ACCESS_API
 select HAVE_FUNCTION_ERROR_INJECTION
 select HAVE_GCC_PLUGINS


@@ -5,11 +5,11 @@
 #include <asm/cacheflush.h>
 #include <asm/page.h>
-static inline void arch_clear_hugepage_flags(struct page *page)
+static inline void arch_clear_hugetlb_flags(struct folio *folio)
 {
-clear_bit(PG_dcache_clean, &page->flags);
+clear_bit(PG_dcache_clean, &folio->flags);
 }
-#define arch_clear_hugepage_flags arch_clear_hugepage_flags
+#define arch_clear_hugetlb_flags arch_clear_hugetlb_flags
 #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
 bool arch_hugetlb_migration_supported(struct hstate *h);


@@ -651,6 +651,7 @@ static inline unsigned long pmd_pfn(pmd_t pmd)
 #define __pud_to_phys(pud) (__page_val_to_pfn(pud_val(pud)) << PAGE_SHIFT)
+#define pud_pfn pud_pfn
 static inline unsigned long pud_pfn(pud_t pud)
 {
 return ((__pud_to_phys(pud) & PUD_MASK) >> PAGE_SHIFT);


@@ -19,6 +19,7 @@
 #include <linux/libfdt.h>
 #include <linux/types.h>
 #include <linux/memblock.h>
+#include <linux/vmalloc.h>
 #include <asm/setup.h>
 int arch_kimage_file_post_load_cleanup(struct kimage *image)


@@ -6,6 +6,7 @@
 #include <linux/extable.h>
 #include <linux/slab.h>
 #include <linux/stop_machine.h>
+#include <linux/vmalloc.h>
 #include <asm/ptrace.h>
 #include <linux/uaccess.h>
 #include <asm/sections.h>


@@ -292,7 +292,10 @@ void handle_page_fault(struct pt_regs *regs)
 if (unlikely(access_error(cause, vma))) {
 vma_end_read(vma);
-goto lock_mmap;
+count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+tsk->thread.bad_cause = cause;
+bad_area_nosemaphore(regs, SEGV_ACCERR, addr);
+return;
 }
 fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK, regs);


@@ -399,16 +399,6 @@ static bool is_napot_size(unsigned long size)
 #endif /*CONFIG_RISCV_ISA_SVNAPOT*/
-int pud_huge(pud_t pud)
-{
-return pud_leaf(pud);
-}
-int pmd_huge(pmd_t pmd)
-{
-return pmd_leaf(pmd);
-}
 static bool __hugetlb_valid_size(unsigned long size)
 {
 if (size == HPAGE_SIZE)


@@ -177,7 +177,7 @@ config S390
 select HAVE_DYNAMIC_FTRACE_WITH_REGS
 select HAVE_EBPF_JIT if HAVE_MARCH_Z196_FEATURES
 select HAVE_EFFICIENT_UNALIGNED_ACCESS
-select HAVE_FAST_GUP
+select HAVE_GUP_FAST
 select HAVE_FENTRY
 select HAVE_FTRACE_MCOUNT_RECORD
 select HAVE_FUNCTION_ARG_ACCESS_API


@@ -39,11 +39,11 @@ static inline int prepare_hugepage_range(struct file *file,
 return 0;
 }
-static inline void arch_clear_hugepage_flags(struct page *page)
+static inline void arch_clear_hugetlb_flags(struct folio *folio)
 {
-clear_bit(PG_arch_1, &page->flags);
+clear_bit(PG_arch_1, &folio->flags);
 }
-#define arch_clear_hugepage_flags arch_clear_hugepage_flags
+#define arch_clear_hugetlb_flags arch_clear_hugetlb_flags
 static inline void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
 pte_t *ptep, unsigned long sz)


@@ -1421,6 +1421,7 @@ static inline unsigned long pud_deref(pud_t pud)
 return (unsigned long)__va(pud_val(pud) & origin_mask);
 }
+#define pud_pfn pud_pfn
 static inline unsigned long pud_pfn(pud_t pud)
 {
 return __pa(pud_deref(pud)) >> PAGE_SHIFT;
@@ -1784,8 +1785,10 @@ static inline pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma,
 static inline pmd_t pmdp_invalidate(struct vm_area_struct *vma,
 unsigned long addr, pmd_t *pmdp)
 {
-pmd_t pmd = __pmd(pmd_val(*pmdp) | _SEGMENT_ENTRY_INVALID);
+pmd_t pmd;
+VM_WARN_ON_ONCE(!pmd_present(*pmdp));
+pmd = __pmd(pmd_val(*pmdp) | _SEGMENT_ENTRY_INVALID);
 return pmdp_xchg_direct(vma->vm_mm, addr, pmdp, pmd);
 }


@@ -21,6 +21,7 @@
 #include <linux/seq_file.h>
 #include <linux/slab.h>
 #include <linux/sysfs.h>
+#include <linux/vmalloc.h>
 #include <crypto/sha2.h>
 #include <keys/user-type.h>
 #include <asm/debug.h>


@@ -20,6 +20,7 @@
 #include <linux/gfp.h>
 #include <linux/crash_dump.h>
 #include <linux/debug_locks.h>
+#include <linux/vmalloc.h>
 #include <asm/asm-extable.h>
 #include <asm/diag.h>
 #include <asm/ipl.h>


@@ -325,7 +325,8 @@ static void do_exception(struct pt_regs *regs, int access)
 goto lock_mmap;
 if (!(vma->vm_flags & access)) {
 vma_end_read(vma);
-goto lock_mmap;
+count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+return handle_fault_error_nolock(regs, SEGV_ACCERR);
 }
 fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
 if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))


@@ -233,16 +233,6 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
 return (pte_t *) pmdp;
 }
-int pmd_huge(pmd_t pmd)
-{
-return pmd_leaf(pmd);
-}
-int pud_huge(pud_t pud)
-{
-return pud_leaf(pud);
-}
 bool __init arch_hugetlb_valid_size(unsigned long size)
 {
 if (MACHINE_HAS_EDAT1 && size == PMD_SIZE)
@@ -258,14 +248,12 @@ static unsigned long hugetlb_get_unmapped_area_bottomup(struct file *file,
 unsigned long pgoff, unsigned long flags)
 {
 struct hstate *h = hstate_file(file);
-struct vm_unmapped_area_info info;
+struct vm_unmapped_area_info info = {};
-info.flags = 0;
 info.length = len;
 info.low_limit = current->mm->mmap_base;
 info.high_limit = TASK_SIZE;
 info.align_mask = PAGE_MASK & ~huge_page_mask(h);
-info.align_offset = 0;
 return vm_unmapped_area(&info);
 }
@@ -274,7 +262,7 @@ static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file,
 unsigned long pgoff, unsigned long flags)
 {
 struct hstate *h = hstate_file(file);
-struct vm_unmapped_area_info info;
+struct vm_unmapped_area_info info = {};
 unsigned long addr;
 info.flags = VM_UNMAPPED_AREA_TOPDOWN;
@@ -282,7 +270,6 @@ static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file,
 info.low_limit = PAGE_SIZE;
 info.high_limit = current->mm->mmap_base;
 info.align_mask = PAGE_MASK & ~huge_page_mask(h);
-info.align_offset = 0;
 addr = vm_unmapped_area(&info);
 /*
@@ -328,7 +315,7 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 goto check_asce_limit;
 }
-if (mm->get_unmapped_area == arch_get_unmapped_area)
+if (!test_bit(MMF_TOPDOWN, &mm->flags))
 addr = hugetlb_get_unmapped_area_bottomup(file, addr, len,
 pgoff, flags);
 else


@@ -86,7 +86,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
 {
 struct mm_struct *mm = current->mm;
 struct vm_area_struct *vma;
-struct vm_unmapped_area_info info;
+struct vm_unmapped_area_info info = {};
 if (len > TASK_SIZE - mmap_min_addr)
 return -ENOMEM;
@@ -102,7 +102,6 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
 goto check_asce_limit;
 }
-info.flags = 0;
 info.length = len;
 info.low_limit = mm->mmap_base;
 info.high_limit = TASK_SIZE;
@@ -122,7 +121,7 @@ unsigned long arch_get_unmapped_area_topdown(struct file *filp, unsigned long ad
 {
 struct vm_area_struct *vma;
 struct mm_struct *mm = current->mm;
-struct vm_unmapped_area_info info;
+struct vm_unmapped_area_info info = {};
 /* requested length too big for entire address space */
 if (len > TASK_SIZE - mmap_min_addr)
@@ -185,10 +184,10 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
 */
 if (mmap_is_legacy(rlim_stack)) {
 mm->mmap_base = mmap_base_legacy(random_factor);
-mm->get_unmapped_area = arch_get_unmapped_area;
+clear_bit(MMF_TOPDOWN, &mm->flags);
 } else {
 mm->mmap_base = mmap_base(random_factor, rlim_stack);
-mm->get_unmapped_area = arch_get_unmapped_area_topdown;
+set_bit(MMF_TOPDOWN, &mm->flags);
 }
 }


@@ -169,7 +169,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
 if (!(vma->vm_flags & VM_WRITE))
 goto out_unlock_mmap;
-ret = follow_pte(vma->vm_mm, mmio_addr, &ptep, &ptl);
+ret = follow_pte(vma, mmio_addr, &ptep, &ptl);
 if (ret)
 goto out_unlock_mmap;
@@ -308,7 +308,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
 if (!(vma->vm_flags & VM_WRITE))
 goto out_unlock_mmap;
-ret = follow_pte(vma->vm_mm, mmio_addr, &ptep, &ptl);
+ret = follow_pte(vma, mmio_addr, &ptep, &ptl);
 if (ret)
 goto out_unlock_mmap;


@@ -38,7 +38,7 @@ config SUPERH
 select HAVE_DEBUG_BUGVERBOSE
 select HAVE_DEBUG_KMEMLEAK
 select HAVE_DYNAMIC_FTRACE
-select HAVE_FAST_GUP if MMU
+select HAVE_GUP_FAST if MMU
 select HAVE_FUNCTION_GRAPH_TRACER
 select HAVE_FUNCTION_TRACER
 select HAVE_FTRACE_MCOUNT_RECORD


@@ -27,11 +27,11 @@ static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
 return *ptep;
 }
-static inline void arch_clear_hugepage_flags(struct page *page)
+static inline void arch_clear_hugetlb_flags(struct folio *folio)
 {
-clear_bit(PG_dcache_clean, &page->flags);
+clear_bit(PG_dcache_clean, &folio->flags);
 }
-#define arch_clear_hugepage_flags arch_clear_hugepage_flags
+#define arch_clear_hugetlb_flags arch_clear_hugetlb_flags
 #include <asm-generic/hugetlb.h>


@@ -241,13 +241,14 @@ static void sh4_flush_cache_page(void *args)
 if ((vma->vm_mm == current->active_mm))
 vaddr = NULL;
 else {
+struct folio *folio = page_folio(page);
 /*
 * Use kmap_coherent or kmap_atomic to do flushes for
 * another ASID than the current one.
 */
 map_coherent = (current_cpu_data.dcache.n_aliases &&
-test_bit(PG_dcache_clean, &page->flags) &&
-page_mapcount(page));
+test_bit(PG_dcache_clean, folio_flags(folio, 0)) &&
+page_mapped(page));
 if (map_coherent)
 vaddr = kmap_coherent(page, address);
 else

Some files were not shown because too many files have changed in this diff.