linux

mirror of https://github.com/torvalds/linux.git synced 2024-10-24 06:01:01 +00:00

Author	SHA1	Message	Date
Matthew Wilcox (Oracle)	9cfb816b1c	mm/fs: convert inode_attach_wb() to take a folio Patch series "Writeback folio conversions". Remove more calls to compound_head() by passing folios around instead of pages. This patch (of 2): The only caller of inode_attach_wb() which doesn't pass NULL already has a folio, so convert the whole call-chain to take folios. Link: https://lkml.kernel.org/r/20230116192507.2146150-1-willy@infradead.org Link: https://lkml.kernel.org/r/20230116192507.2146150-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:19 -08:00
Matthew Wilcox (Oracle)	14ddee4126	mm: use a folio in copy_present_pte() We still have to keep the page around because we need to know which page in the folio we're copying, but we can replace five implict calls to compound_head() with one. Link: https://lkml.kernel.org/r/20230116191813.2145215-6-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:19 -08:00
Matthew Wilcox (Oracle)	edf5047058	mm: use a folio in copy_pte_range() Allocate an order-0 folio instead of a page and pass it all the way down the call chain. Removes dozens of calls to compound_head(). Link: https://lkml.kernel.org/r/20230116191813.2145215-5-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:19 -08:00
Matthew Wilcox (Oracle)	28d41a4863	mm: convert wp_page_copy() to use folios Use new_folio instead of new_page throughout, because we allocated it and know it's an order-0 folio. Most old_page uses become old_folio, but use vmf->page where we need the precise page. Link: https://lkml.kernel.org/r/20230116191813.2145215-4-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:19 -08:00
Matthew Wilcox (Oracle)	cb3184deef	mm: convert do_anonymous_page() to use a folio Removes six calls to compound_head(); some inline and some external. Link: https://lkml.kernel.org/r/20230116191813.2145215-3-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:18 -08:00
Matthew Wilcox (Oracle)	6bc56a4d85	mm: add vma_alloc_zeroed_movable_folio() Replace alloc_zeroed_user_highpage_movable(). The main difference is returning a folio containing a single page instead of returning the page, but take the opportunity to rename the function to match other allocation functions a little better and rewrite the documentation to place more emphasis on the zeroing rather than the highmem aspect. Link: https://lkml.kernel.org/r/20230116191813.2145215-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:18 -08:00
Vishal Moola (Oracle)	c5792d9384	filemap: remove find_get_pages_range_tag() All callers to find_get_pages_range_tag(), find_get_pages_tag(), pagevec_lookup_range_tag(), and pagevec_lookup_tag() have been removed. Link: https://lkml.kernel.org/r/20230104211448.4804-24-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:18 -08:00
Vishal Moola (Oracle)	243c5ea4f7	nilfs2: convert nilfs_clear_dirty_pages() to use filemap_get_folios_tag() Convert function to use folios throughout. This is in preparation for the removal of find_get_pages_range_tag(). This change removes 2 calls to compound_head(). Link: https://lkml.kernel.org/r/20230104211448.4804-23-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:18 -08:00
Vishal Moola (Oracle)	d4a16d3133	nilfs2: convert nilfs_copy_dirty_pages() to use filemap_get_folios_tag() Convert function to use folios throughout. This is in preparation for the removal of find_get_pages_range_tag(). This change removes 8 calls to compound_head(). Link: https://lkml.kernel.org/r/20230104211448.4804-22-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:17 -08:00
Vishal Moola (Oracle)	41f3f3b537	nilfs2: convert nilfs_btree_lookup_dirty_buffers() to use filemap_get_folios_tag() Convert function to use folios throughout. This is in preparation for the removal of find_get_pages_range_tag(). This change removes 1 call to compound_head(). Link: https://lkml.kernel.org/r/20230104211448.4804-21-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:17 -08:00
Vishal Moola (Oracle)	a245865831	nilfs2: convert nilfs_lookup_dirty_node_buffers() to use filemap_get_folios_tag() Convert function to use folios throughout. This is in preparation for the removal of find_get_pages_range_tag(). This change removes 1 call to compound_head(). Link: https://lkml.kernel.org/r/20230104211448.4804-20-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:17 -08:00
Vishal Moola (Oracle)	5ee4b25cb7	nilfs2: convert nilfs_lookup_dirty_data_buffers() to use filemap_get_folios_tag() Convert function to use folios throughout. This is in preparation for the removal of find_get_pages_range_tag(). This change removes 4 calls to compound_head(). Link: https://lkml.kernel.org/r/20230104211448.4804-19-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:17 -08:00
Vishal Moola (Oracle)	87ed37e66d	gfs2: convert gfs2_write_cache_jdata() to use filemap_get_folios_tag() Convert function to use folios throughout. This is in preparation for the removal of find_get_pgaes_range_tag(). This change removes 8 calls to compound_head(). Also had to modify and rename gfs2_write_jdata_pagevec() to take in and utilize folio_batch rather than pagevec and use folios rather than pages. gfs2_write_jdata_batch() now supports large folios. Link: https://lkml.kernel.org/r/20230104211448.4804-18-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:17 -08:00
Vishal Moola (Oracle)	580e7a4926	f2fs: convert f2fs_sync_meta_pages() to use filemap_get_folios_tag() Convert function to use folios throughout. This is in preparation for the removal of find_get_pages_range_tag(). This change removes 5 calls to compound_head(). Initially the function was checking if the previous page index is truly the previous page i.e. 1 index behind the current page. To convert to folios and maintain this check we need to make the check folio->index != prev + folio_nr_pages(previous folio) since we don't know how many pages are in a folio. At index i == 0 the check is guaranteed to succeed, so to workaround indexing bounds we can simply ignore the check for that specific index. This makes the initial assignment of prev trivial, so I removed that as well. Also modify a comment in commit_checkpoint for consistency. Link: https://lkml.kernel.org/r/20230104211448.4804-17-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:16 -08:00
Vishal Moola (Oracle)	4f4a4f0feb	f2fs: convert last_fsync_dnode() to use filemap_get_folios_tag() Convert to use a folio_batch instead of pagevec. This is in preparation for the removal of find_get_pages_range_tag(). Link: https://lkml.kernel.org/r/20230104211448.4804-16-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:16 -08:00
Vishal Moola (Oracle)	1cd98ee747	f2fs: convert f2fs_write_cache_pages() to use filemap_get_folios_tag() Convert the function to use a folio_batch instead of pagevec. This is in preparation for the removal of find_get_pages_range_tag(). Also modified f2fs_all_cluster_page_ready to take in a folio_batch instead of pagevec. This does NOT support large folios. The function currently only utilizes folios of size 1 so this shouldn't cause any issues right now. This version of the patch limits the number of pages fetched to F2FS_ONSTACK_PAGES. If that ever happens, update the start index here since filemap_get_folios_tag() updates the index to be after the last found folio, not necessarily the last used page. Link: https://lkml.kernel.org/r/20230104211448.4804-15-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:16 -08:00
Vishal Moola (Oracle)	7525486aff	f2fs: convert f2fs_sync_node_pages() to use filemap_get_folios_tag() Convert function to use a folio_batch instead of pagevec. This is in preparation for the removal of find_get_pages_range_tag(). Link: https://lkml.kernel.org/r/20230104211448.4804-14-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:16 -08:00
Vishal Moola (Oracle)	a40a4ad118	f2fs: convert f2fs_flush_inline_data() to use filemap_get_folios_tag() Convert function to use a folio_batch instead of pagevec. This is in preparation for the removal of find_get_pages_tag(). Link: https://lkml.kernel.org/r/20230104211448.4804-13-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:15 -08:00
Vishal Moola (Oracle)	e6e46e1eb7	f2fs: convert f2fs_fsync_node_pages() to use filemap_get_folios_tag() Convert function to use a folio_batch instead of pagevec. This is in preparation for the removal of find_get_pages_range_tag(). Link: https://lkml.kernel.org/r/20230104211448.4804-12-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Chao Yu <chao@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:15 -08:00
Vishal Moola (Oracle)	50ead25374	ext4: convert mpage_prepare_extent_to_map() to use filemap_get_folios_tag() Convert the function to use folios throughout. This is in preparation for the removal of find_get_pages_range_tag(). Now supports large folios. This change removes 11 calls to compound_head(). Link: https://lkml.kernel.org/r/20230104211448.4804-11-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:15 -08:00
Vishal Moola (Oracle)	4cda80f3a7	cifs: convert wdata_alloc_and_fillpages() to use filemap_get_folios_tag() This is in preparation for the removal of find_get_pages_range_tag(). Now also supports the use of large folios. Since tofind might be larger than the max number of folios in a folio_batch (15), we loop through filling in wdata->pages pulling more batches until we either reach tofind pages or run out of folios. This function may not return all pages in the last found folio before tofind pages are reached. Link: https://lkml.kernel.org/r/20230104211448.4804-10-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Cc: Tom Talpey <tom@talpey.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:15 -08:00
Vishal Moola (Oracle)	590a2b5f0a	ceph: convert ceph_writepages_start() to use filemap_get_folios_tag() Convert function to use a folio_batch instead of pagevec. This is in preparation for the removal of find_get_pages_range_tag(). Also some minor renaming for consistency. Link: https://lkml.kernel.org/r/20230104211448.4804-9-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:15 -08:00
Vishal Moola (Oracle)	9f50fd2e92	btrfs: convert extent_write_cache_pages() to use filemap_get_folios_tag() Convert function to use folios throughout. This is in preparation for the removal of find_get_pages_range_tag(). Now also supports large folios. Link: https://lkml.kernel.org/r/20230104211448.4804-8-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:14 -08:00
Vishal Moola (Oracle)	51c5cd3baf	btrfs: convert btree_write_cache_pages() to use filemap_get_folio_tag() Convert function to use folios throughout. This is in preparation for the removal of find_get_pages_range_tag(). Link: https://lkml.kernel.org/r/20230104211448.4804-7-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:14 -08:00
Vishal Moola (Oracle)	acc8d8588c	afs: convert afs_writepages_region() to use filemap_get_folios_tag() Convert to use folios throughout. This function is in preparation to remove find_get_pages_range_tag(). Also modify this function to write the whole batch one at a time, rather than calling for a new set every single write. Link: https://lkml.kernel.org/r/20230104211448.4804-6-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Tested-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:14 -08:00
Vishal Moola (Oracle)	0fff435f06	page-writeback: convert write_cache_pages() to use filemap_get_folios_tag() Convert function to use folios throughout. This is in preparation for the removal of find_get_pages_range_tag(). This change removes 8 calls to compound_head(), and the function now supports large folios. Link: https://lkml.kernel.org/r/20230104211448.4804-5-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by: Matthew Wilcow (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:14 -08:00
Vishal Moola (Oracle)	6817ef514e	filemap: convert __filemap_fdatawait_range() to use filemap_get_folios_tag() Convert function to use folios. This is in preparation for the removal of find_get_pages_range_tag(). This change removes 2 calls to compound_head(). Link: https://lkml.kernel.org/r/20230104211448.4804-4-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by: Matthew Wilcow (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:13 -08:00
Vishal Moola (Oracle)	247f9e1fee	filemap: add filemap_get_folios_tag() This is the equivalent of find_get_pages_range_tag(), except for folios instead of pages. One noteable difference is filemap_get_folios_tag() does not take in a maximum pages argument. It instead tries to fill a folio batch and stops either once full (15 folios) or reaching the end of the search range. The new function supports large folios, the initial function did not since all callers don't use large folios. Link: https://lkml.kernel.org/r/20230104211448.4804-3-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by: Matthew Wilcow (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:13 -08:00
Vishal Moola (Oracle)	ee7a5906ff	pagemap: add filemap_grab_folio() Patch series "Convert to filemap_get_folios_tag()", v5. This patch series replaces find_get_pages_range_tag() with filemap_get_folios_tag(). This also allows the removal of multiple calls to compound_head() throughout. It also makes a good chunk of the straightforward conversions to folios, and takes the opportunity to introduce a function that grabs a folio from the pagecache. This patch (of 23): Add function filemap_grab_folio() to grab a folio from the page cache. This function is meant to serve as a folio replacement for grab_cache_page, and is used to facilitate the removal of find_get_pages_range_tag(). Link: https://lkml.kernel.org/r/20230104211448.4804-1-vishal.moola@gmail.com Link: https://lkml.kernel.org/r/20230104211448.4804-2-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:13 -08:00
David Stevens	2cf1338454	mm: fix khugepaged with shmem_enabled=advise Pass vm_flags as a parameter to shmem_is_huge, rather than reading the flags from the vm_area_struct in question. This allows the updated flags from hugepage_madvise to be passed to the check, which is necessary because madvise does not update the vm_area_struct's flags until after hugepage_madvise returns. This fixes an issue when shmem_enabled=madvise, where MADV_HUGEPAGE on shmem was not able to register the mm_struct with khugepaged. Prior to `cd89fb0650`, the mm_struct was registered by MADV_HUGEPAGE regardless of the value of shmem_enabled (which was only checked when scanning vmas). Link: https://lkml.kernel.org/r/20230113023011.1784015-1-stevensd@google.com Fixes: `cd89fb0650` ("mm,thp,shmem: make khugepaged obey tmpfs mount flags") Signed-off-by: David Stevens <stevensd@chromium.org> Cc: David Stevens <stevensd@chromium.org> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:13 -08:00
NeilBrown	2973d8229b	mm: discard __GFP_ATOMIC __GFP_ATOMIC serves little purpose. Its main effect is to set ALLOC_HARDER which adds a few little boosts to increase the chance of an allocation succeeding, one of which is to lower the water-mark at which it will succeed. It is always paired with __GFP_HIGH which sets ALLOC_HIGH which also adjusts this watermark. It is probable that other users of __GFP_HIGH should benefit from the other little bonuses that __GFP_ATOMIC gets. __GFP_ATOMIC also gives a warning if used with __GFP_DIRECT_RECLAIM. There is little point to this. We already get a might_sleep() warning if __GFP_DIRECT_RECLAIM is set. __GFP_ATOMIC allows the "watermark_boost" to be side-stepped. It is probable that testing ALLOC_HARDER is a better fit here. __GFP_ATOMIC is used by tegra-smmu.c to check if the allocation might sleep. This should test __GFP_DIRECT_RECLAIM instead. This patch: - removes __GFP_ATOMIC - allows __GFP_HIGH allocations to ignore watermark boosting as well as GFP_ATOMIC requests. - makes other adjustments as suggested by the above. The net result is not change to GFP_ATOMIC allocations. Other allocations that use __GFP_HIGH will benefit from a few different extra privileges. This affects: xen, dm, md, ntfs3 the vermillion frame buffer hibernation ksm swap all of which likely produce more benefit than cost if these selected allocation are more likely to succeed quickly. [mgorman: Minor adjustments to rework on top of a series] Link: https://lkml.kernel.org/r/163712397076.13692.4727608274002939094@noble.neil.brown.name Link: https://lkml.kernel.org/r/20230113111217.14134-7-mgorman@techsingularity.net Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:13 -08:00
Mel Gorman	1ebbb21811	mm/page_alloc: explicitly define how __GFP_HIGH non-blocking allocations accesses reserves GFP_ATOMIC allocations get flagged ALLOC_HARDER which is a vague description. In preparation for the removal of GFP_ATOMIC redefine __GFP_ATOMIC to simply mean non-blocking and renaming ALLOC_HARDER to ALLOC_NON_BLOCK accordingly. __GFP_HIGH is required for access to reserves but non-blocking is granted more access. For example, GFP_NOWAIT is non-blocking but has no special access to reserves. A __GFP_NOFAIL blocking allocation is granted access similar to __GFP_HIGH if the only alternative is an OOM kill. Link: https://lkml.kernel.org/r/20230113111217.14134-6-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: NeilBrown <neilb@suse.de> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:12 -08:00
Mel Gorman	ab35088543	mm/page_alloc: explicitly define what alloc flags deplete min reserves As there are more ALLOC_ flags that affect reserves, define what flags affect reserves and clarify the effect of each flag. Link: https://lkml.kernel.org/r/20230113111217.14134-5-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: NeilBrown <neilb@suse.de> Cc: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:12 -08:00
Mel Gorman	eb2e2b425c	mm/page_alloc: explicitly record high-order atomic allocations in alloc_flags A high-order ALLOC_HARDER allocation is assumed to be atomic. While that is accurate, it changes later in the series. In preparation, explicitly record high-order atomic allocations in gfp_to_alloc_flags(). Link: https://lkml.kernel.org/r/20230113111217.14134-4-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: NeilBrown <neilb@suse.de> Cc: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:12 -08:00
Mel Gorman	c988dcbecf	mm/page_alloc: treat RT tasks similar to __GFP_HIGH RT tasks are allowed to dip below the min reserve but ALLOC_HARDER is typically combined with ALLOC_MIN_RESERVE so RT tasks are a little unusual. While there is some justification for allowing RT tasks access to memory reserves, there is a strong chance that a RT task that is also under memory pressure is at risk of missing deadlines anyway. Relax how much reserves an RT task can access by treating it the same as __GFP_HIGH allocations. Note that in a future kernel release that the RT special casing will be removed. Hard realtime tasks should be locking down resources in advance and ensuring enough memory is available. Even a soft-realtime task like audio or video live decoding which cannot jitter should be allocating both memory and any disk space required up-front before the recording starts instead of relying on reserves. At best, reserve access will only delay the problem by a very short interval. Link: https://lkml.kernel.org/r/20230113111217.14134-3-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: NeilBrown <neilb@suse.de> Cc: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:12 -08:00
Mel Gorman	524c48072e	mm/page_alloc: rename ALLOC_HIGH to ALLOC_MIN_RESERVE Patch series "Discard __GFP_ATOMIC", v3. Neil's patch has been residing in mm-unstable as commit 2fafb4fe8f7a ("mm: discard __GFP_ATOMIC") for a long time and recently brought up again. Most recently, I was worried that __GFP_HIGH allocations could use high-order atomic reserves which is unintentional but there was no response so lets revisit -- this series reworks how min reserves are used, protects highorder reserves and then finishes with Neil's patch with very minor modifications so it fits on top. There was a review discussion on renaming __GFP_DIRECT_RECLAIM to __GFP_ALLOW_BLOCKING but I didn't think it was that big an issue and is orthogonal to the removal of __GFP_ATOMIC. There were some concerns about how the gfp flags affect the min reserves but it never reached a solid conclusion so I made my own attempt. The series tries to iron out some of the details on how reserves are used. ALLOC_HIGH becomes ALLOC_MIN_RESERVE and ALLOC_HARDER becomes ALLOC_NON_BLOCK and documents how the reserves are affected. For example, ALLOC_NON_BLOCK (no direct reclaim) on its own allows 25% of the min reserve. ALLOC_MIN_RESERVE (__GFP_HIGH) allows 50% and both combined allows deeper access again. ALLOC_OOM allows access to 75%. High-order atomic allocations are explicitly handled with the caveat that no __GFP_ATOMIC flag means that any high-order allocation that specifies GFP_HIGH and cannot enter direct reclaim will be treated as if it was GFP_ATOMIC. This patch (of 6): __GFP_HIGH aliases to ALLOC_HIGH but the name does not really hint what it means. As ALLOC_HIGH is internal to the allocator, rename it to ALLOC_MIN_RESERVE to document that the min reserves can be depleted. Link: https://lkml.kernel.org/r/20230113111217.14134-1-mgorman@techsingularity.net Link: https://lkml.kernel.org/r/20230113111217.14134-2-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: NeilBrown <neilb@suse.de> Cc: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:11 -08:00
Pasha Tatashin	6189eb82f0	mm/page_ext: do not allocate space for page_ext->flags if not needed There is 8 byte page_ext->flags field allocated per page whenever CONFIG_PAGE_EXTENSION is enabled. However, not every user of page_ext uses flags. Therefore, check whether flags is needed at least by one user and if so allocate space for it. For example when page_table_check is enabled, on a machine with 128G of memory before the fix: [ 2.244288] allocated 536870912 bytes of page_ext after the fix: [ 2.160154] allocated 268435456 bytes of page_ext Also, add a kernel-doc comment before page_ext_operations that describes the fields, and remove check if need() is set, as that is now a required field. [pasha.tatashin@soleen.com: address comments from Mike Rapoport] Link: https://lkml.kernel.org/r/20230117202103.1412449-1-pasha.tatashin@soleen.com Link: https://lkml.kernel.org/r/20230113154253.92480-1-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: David Rientjes <rientjes@google.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Charan Teja Kalla <quic_charante@quicinc.com> Cc: Li Zhe <lizhe.67@bytedance.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:11 -08:00
David Hildenbrand	950fe885a8	mm: remove __HAVE_ARCH_PTE_SWP_EXCLUSIVE __HAVE_ARCH_PTE_SWP_EXCLUSIVE is now supported by all architectures that support swp PTEs, so let's drop it. Link: https://lkml.kernel.org/r/20230113171026.582290-27-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:11 -08:00
David Hildenbrand	f5c3fe300c	xtensa/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 1. This bit should be safe to use for our usecase. Most importantly, we can still distinguish swap PTEs from PAGE_NONE PTEs (see pte_present()) and don't use one of the two reserved attribute masks (1101 and 1111). Attribute mask 1100 and 1110 now identify swap PTEs. While at it, remove SWP_TYPE_BITS (not really helpful as it's not used in the actual swap macros) and mask the type in __swp_entry(). Link: https://lkml.kernel.org/r/20230113171026.582290-26-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:11 -08:00
David Hildenbrand	93c0eac40d	x86/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also on 32bit Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE just like we already do on x86-64. After deciphering the PTE layout it becomes clear that there are still unused bits for 2-level and 3-level page tables that we should be able to use. Reusing a bit avoids stealing one bit from the swap offset. While at it, mask the type in __swp_entry(); use some helper definitions to make the macros easier to grasp. Link: https://lkml.kernel.org/r/20230113171026.582290-25-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:10 -08:00
David Hildenbrand	e2858d778e	um/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 10, which is yet unused for swap PTEs. The pte_mkuptodate() is a bit weird in __pte_to_swp_entry() for a swap PTE ... but it only messes with bit 1 and 2 and there is a comment in set_pte(), so leave these bits alone. While at it, mask the type in __swp_entry(). Link: https://lkml.kernel.org/r/20230113171026.582290-24-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Richard Weinberger <richard@nod.at> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:10 -08:00
David Hildenbrand	adf8e329ff	sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 64bit Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit was effectively unused. While at it, mask the type in __swp_entry(). Link: https://lkml.kernel.org/r/20230113171026.582290-23-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:10 -08:00
David Hildenbrand	e6b37d7f6f	sparc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by reusing the SRMMU_DIRTY bit as that seems to be safe to reuse inside a swap PTE. This avoids having to steal one bit from the swap offset. While at it, relocate the swap PTE layout documentation and use the same style now used for most other archs. Note that the old documentation was wrong: we use 20 bit for the offset and the reserved bits were 8 instead of 7 bits in the ascii art. Link: https://lkml.kernel.org/r/20230113171026.582290-22-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:10 -08:00
David Hildenbrand	cca10df102	sh/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using bit 6 in the PTE, reducing the swap type in the !CONFIG_X2TLB case to 5 bits. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. Interrestingly, the swap type in the !CONFIG_X2TLB case could currently overlap with the _PAGE_PRESENT bit, because there is a sneaky shift by 1 in __pte_to_swp_entry() and __swp_entry_to_pte(). Bit 0-7 in the architecture specific swap PTE would get shifted to bit 1-8 in the PTE. As generic MM uses 5 bits only, this didn't matter so far. While at it, mask the type in __swp_entry(). Link: https://lkml.kernel.org/r/20230113171026.582290-21-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:10 -08:00
David Hildenbrand	51a1007d41	riscv/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the offset. This reduces the maximum swap space per file: on 32bit to 16 GiB (was 32 GiB). Note that this bit does not conflict with swap PMDs and could also be used in swap PMD context later. While at it, mask the type in __swp_entry(). Link: https://lkml.kernel.org/r/20230113171026.582290-20-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:09 -08:00
David Hildenbrand	2bba2ffbe0	powerpc/nohash/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit and 64bit. On 64bit, let's use MSB 56 (LSB 7), located right next to the page type. On 32bit, let's use LSB 2 to avoid stealing one bit from the swap offset. There seems to be no real reason why these bits cannot be used for swap PTEs. The important part is that _PAGE_PRESENT and _PAGE_HASHPTE remain 0. While at it, mask the type in __swp_entry() and remove _PAGE_BIT_SWAP_TYPE from pte-e500.h: while it was used in 64bit code it was ignored in 32bit code. Link: https://lkml.kernel.org/r/20230113171026.582290-19-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:09 -08:00
David Hildenbrand	8897ebff37	powerpc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit book3s We already implemented support for 64bit book3s in commit `bff9beaa2e` ("powerpc/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE for book3s") Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also in 32bit by reusing yet unused LSB 2 / MSB 29. There seems to be no real reason why that bit cannot be used, and reusing it avoids having to steal one bit from the swap offset. While at it, mask the type in __swp_entry(). Link: https://lkml.kernel.org/r/20230113171026.582290-18-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:09 -08:00
David Hildenbrand	6d239fc78c	parisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using the yet-unused _PAGE_ACCESSED location in the swap PTE. Looking at pte_present() and pte_none() checks, there seems to be no actual reason why we cannot use it: we only have to make sure we're not using _PAGE_PRESENT. Reusing this bit avoids having to steal one bit from the swap offset. Link: https://lkml.kernel.org/r/20230113171026.582290-17-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:09 -08:00
David Hildenbrand	5ae3e74474	openrisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the type. Generic MM currently only uses 5 bits for the type (MAX_SWAPFILES_SHIFT), so the stolen bit is effectively unused. While at it, mask the type in __swp_entry(). Link: https://lkml.kernel.org/r/20230113171026.582290-16-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:08 -08:00
David Hildenbrand	4d1d955f7c	nios2/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by using the yet-unused bit 31. Link: https://lkml.kernel.org/r/20230113171026.582290-15-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-02-02 22:33:08 -08:00

... 3 4 5 6 7 ...

1154741 Commits