linux/mm
Mike Kravetz 5e9113731a mm/hugetlb: add cache of descriptors to resv_map for region_add
hugetlbfs is used today by applications that want a high degree of
control over huge page usage.  Often, large hugetlbfs files are used to
map a large number of huge pages into the application processes.  The
applications know when page ranges within these large files will no
longer be used, and ideally would like to release them back to the
subpool or global pools for other uses.  The fallocate() system call
provides an interface for preallocation and hole punching within files.
This patch set adds fallocate functionality to hugetlbfs.

fallocate hole punch will want to remove a specific range of pages.
When pages are removed, their associated entries in the region/reserve
map will also be removed.  This will break an assumption in the
region_chg/region_add calling sequence.  If a new region descriptor must
be allocated, it is done as part of the region_chg processing.  In this
way, region_add cannot fail because it does not need to attempt an
allocation.

To prepare for fallocate hole punch, create a "cache" of descriptors
that can be used by region_add if necessary.  region_chg will ensure
there are sufficient entries in the cache.  It will be necessary to
track the number of in-progress add operations to know that a
sufficient number of descriptors reside in the cache.  A new routine,
region_abort, is added to adjust this in-progress count when add
operations are aborted.  vma_abort_reservation is also added for
callers creating reservations with
vma_needs_reservation/vma_commit_reservation.
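
The bookkeeping described above can be illustrated with a small userspace
sketch.  This is not the kernel code: the struct layout, the *_sketch
names, the use of malloc() and the simplified return values are all
assumptions made for illustration.  It ignores locking and region
merging, and it always consumes one cached descriptor per add, whereas
the text above only says region_add uses the cache "if necessary".

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; not the kernel's definitions. */
struct region {
	long from, to;
	struct region *next;
};

struct resv_map_sketch {
	struct region *regions;   /* committed regions (no sorting or merging) */
	struct region *cache;     /* pre-allocated, unused descriptors */
	long cache_count;         /* number of descriptors in the cache */
	long adds_in_progress;    /* chg calls not yet followed by add/abort */
};

/*
 * Phase 1: announce an upcoming add and make sure the cache holds at
 * least one descriptor per in-progress add.  This is the only step
 * that allocates, so it is the only step that can fail.
 * Returns 0 on success, -1 if allocation failed.
 */
int region_chg_sketch(struct resv_map_sketch *map)
{
	map->adds_in_progress++;
	while (map->cache_count < map->adds_in_progress) {
		struct region *r = malloc(sizeof(*r));

		if (!r) {
			map->adds_in_progress--;
			return -1;
		}
		r->next = map->cache;
		map->cache = r;
		map->cache_count++;
	}
	return 0;
}

/*
 * Phase 2: commit the region.  No allocation happens here; the
 * descriptor comes from the cache that phase 1 topped up.
 */
void region_add_sketch(struct resv_map_sketch *map, long from, long to)
{
	struct region *r;

	assert(map->adds_in_progress > 0 && map->cache_count > 0);

	r = map->cache;                 /* pop a descriptor from the cache */
	map->cache = r->next;
	map->cache_count--;

	r->from = from;
	r->to = to;
	r->next = map->regions;         /* push onto the committed list */
	map->regions = r;

	map->adds_in_progress--;
}

/*
 * Abort after phase 1: drop the in-progress count.  The cached
 * descriptor stays in the cache for a later add.
 */
void region_abort_sketch(struct resv_map_sketch *map)
{
	assert(map->adds_in_progress > 0);
	map->adds_in_progress--;
}
```

In this sketch the invariant cache_count >= adds_in_progress holds after
every successful region_chg_sketch call, which is what lets the add step
proceed without allocating even if the map changed in between.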

[akpm@linux-foundation.org: fix typo in comment, use more cols]
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08 15:35:28 -07:00
kasan x86/kasan, mm: Introduce generic kasan_populate_zero_shadow() 2015-08-22 14:54:55 +02:00
backing-dev.c inode: rename i_wb_list to i_io_list 2015-08-17 23:38:10 -04:00
balloon_compaction.c mm/balloon_compaction: fix deflation when compaction is disabled 2014-10-29 16:33:15 -07:00
bootmem.c mm: page_alloc: pass PFN to __free_pages_bootmem 2015-06-30 19:44:55 -07:00
cleancache.c cleancache: remove limit on the number of cleancache enabled filesystems 2015-04-14 16:49:03 -07:00
cma_debug.c mm/cma_debug: correct size input to bitmap function 2015-07-17 16:39:54 -07:00
cma.c mm/memblock: add extra "flags" to memblock to allow selection of memory based on attribute 2015-06-24 17:49:44 -07:00
cma.h mm: cma: mark cma_bitmap_maxno() inline in header 2015-08-14 15:56:32 -07:00
compaction.c mm/compaction.c: fix "suitable_migration_target() unused" warning 2015-04-15 16:35:20 -07:00
debug-pagealloc.c mm/debug-pagealloc: make debug-pagealloc boottime configurable 2014-12-13 12:42:48 -08:00
debug.c tracing: Rename ftrace_event.h to trace_events.h 2015-05-13 14:05:12 -04:00
dmapool.c mm/dmapool: allow NULL `pool' pointer in dma_pool_destroy() 2015-09-08 15:35:28 -07:00
early_ioremap.c mm: create generic early_ioremap() support 2014-04-07 16:36:15 -07:00
fadvise.c writeback: implement and use inode_congested() 2015-06-02 08:33:35 -06:00
failslab.c
filemap.c fs: do not prefault sys_write() user buffer pages 2015-09-08 15:35:28 -07:00
frontswap.c frontswap: allow multiple backends 2015-06-24 17:49:45 -07:00
gup.c mm: make GUP handle pfn mapping unless FOLL_GET is requested 2015-09-04 16:54:41 -07:00
highmem.c mm/highmem: make kmap cache coloring aware 2014-08-06 18:01:22 -07:00
huge_memory.c mm: make set_recommended_min_free_kbytes() return void 2015-09-08 15:35:28 -07:00
hugetlb_cgroup.c mm: page_counter: pull "-1" handling out of page_counter_memparse() 2015-02-11 17:06:02 -08:00
hugetlb.c mm/hugetlb: add cache of descriptors to resv_map for region_add 2015-09-08 15:35:28 -07:00
hwpoison-inject.c mm/memory-failure: introduce get_hwpoison_page() for consistent refcount handling 2015-06-24 17:49:42 -07:00
init-mm.c
internal.h mm: defer flush of writable TLB entries 2015-09-04 16:54:41 -07:00
interval_tree.c mm: replace vma->sharead.linear with vma->shared 2015-02-10 14:30:31 -08:00
Kconfig mm/Kconfig: NEED_BOUNCE_POOL: clean-up condition 2015-07-23 20:59:41 +02:00
Kconfig.debug mm/debug_pagealloc: remove obsolete Kconfig options 2015-01-08 15:10:52 -08:00
kmemcheck.c mm/slab_common: move kmem_cache definition to internal header 2014-10-09 22:25:50 -04:00
kmemleak-test.c mm/kmemleak-test.c: use pr_fmt for logging 2014-06-06 16:08:18 -07:00
kmemleak.c mm: kmemleak_alloc_percpu() should follow the gfp from per_alloc() 2015-06-24 17:49:46 -07:00
ksm.c mm: remove rest of ACCESS_ONCE() usages 2015-04-15 16:35:18 -07:00
list_lru.c memcg: reparent list_lrus and free kmemcg_id on css offline 2015-02-12 18:54:10 -08:00
maccess.c lib: move strncpy_from_unsafe() into mm/maccess.c 2015-08-31 12:36:10 -07:00
madvise.c mm/madvise.c: make madvise_behaviour_valid() return bool 2015-09-04 16:54:41 -07:00
Makefile userfaultfd: mcopy_atomic|mfill_zeropage: UFFDIO_COPY|UFFDIO_ZEROPAGE preparation 2015-09-04 16:54:41 -07:00
memblock.c mm/memblock.c: WARN_ON when flags differs from overlap region 2015-09-08 15:35:28 -07:00
memcontrol.c memcg: move memcg_proto_active from sock.h 2015-09-08 15:35:28 -07:00
memory_hotplug.c memory-hotplug: add hot-added memory ranges to memblock before allocate node_data for a node. 2015-09-04 16:54:41 -07:00
memory-failure.c memcg: export struct mem_cgroup 2015-09-08 15:35:28 -07:00
memory.c mm, dax: use i_mmap_unlock_write() in do_cow_fault() 2015-09-08 15:35:28 -07:00
mempolicy.c userfaultfd: teach vma_merge to merge across vma->vm_userfaultfd_ctx 2015-09-04 16:54:41 -07:00
mempool.c mm/mempool: allow NULL `pool' pointer in mempool_destroy() 2015-09-08 15:35:28 -07:00
memtest.c memtest: remove unused header files 2015-09-08 15:35:28 -07:00
migrate.c mm: fix status code which move_pages() returns for zero page 2015-09-04 16:54:41 -07:00
mincore.c mincore: apply page table walker on do_mincore() 2015-02-11 17:06:06 -08:00
mlock.c userfaultfd: teach vma_merge to merge across vma->vm_userfaultfd_ctx 2015-09-04 16:54:41 -07:00
mm_init.c mm: meminit: remove mminit_verify_page_links 2015-06-30 19:44:56 -07:00
mmap.c mremap: fix the wrong !vma->vm_file check in copy_vma() 2015-09-08 15:35:28 -07:00
mmu_context.c sched/mm: call finish_arch_post_lock_switch in idle_task_exit and use_mm 2014-02-21 08:50:17 +01:00
mmu_notifier.c mmu_notifier: add the callback for mmu_notifier_invalidate_range() 2014-11-13 13:46:09 +11:00
mmzone.c mm: microoptimize zonelist operations 2015-02-11 17:06:02 -08:00
mprotect.c userfaultfd: teach vma_merge to merge across vma->vm_userfaultfd_ctx 2015-09-04 16:54:41 -07:00
mremap.c mremap: simplify the "overlap" check in mremap_to() 2015-09-04 16:54:41 -07:00
msync.c mm: remove rest usage of VM_NONLINEAR and pte_file() 2015-02-10 14:30:31 -08:00
nobootmem.c mm: page_alloc: pass PFN to __free_pages_bootmem 2015-06-30 19:44:55 -07:00
nommu.c Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2015-09-01 18:46:42 -07:00
oom_kill.c mm, oom: remove unnecessary variable 2015-09-08 15:35:28 -07:00
page_alloc.c mm: rename and move get/set_freepage_migratetype 2015-09-08 15:35:28 -07:00
page_counter.c mm: page_counter: pull "-1" handling out of page_counter_memparse() 2015-02-11 17:06:02 -08:00
page_ext.c mm/page_owner: keep track of page owners 2014-12-13 12:42:48 -08:00
page_io.c fs: use helper bio_add_page() instead of open coding on bi_io_vec 2015-08-13 12:32:00 -06:00
page_isolation.c mm, page_isolation: remove bogus tests for isolated pages 2015-09-08 15:35:28 -07:00
page_owner.c mm/page_owner: set correct gfp_mask on page_owner 2015-07-17 16:39:54 -07:00
page-writeback.c writeback: fix initial dirty limit 2015-08-07 04:39:42 +03:00
pagewalk.c mm/pagewalk.c: prevent positive return value of walk_page_test() from being passed to callers 2015-03-25 16:20:30 -07:00
percpu-km.c percpu: implmeent pcpu_nr_empty_pop_pages and chunk->nr_populated 2014-09-02 14:46:05 -04:00
percpu-vm.c percpu: move region iterations out of pcpu_[de]populate_chunk() 2014-09-02 14:46:02 -04:00
percpu.c percpu: clean up of schunk->map[] assignment in pcpu_setup_first_chunk 2015-07-21 11:31:00 -04:00
pgtable-generic.c mm: clarify that the function operates on hugepage pte 2015-06-24 17:49:44 -07:00
process_vm_access.c process_vm_access: switch to {compat_,}import_iovec() 2015-04-11 22:27:12 -04:00
quicklist.c
readahead.c writeback: implement and use inode_congested() 2015-06-02 08:33:35 -06:00
rmap.c mm: defer flush of writable TLB entries 2015-09-04 16:54:41 -07:00
shmem.c ipc: use private shmem or hugetlbfs inodes for shm segments. 2015-08-07 04:39:41 +03:00
slab_common.c memcg: export struct mem_cgroup 2015-09-08 15:35:28 -07:00
slab.c slab: infrastructure for bulk object allocation and freeing 2015-09-04 16:54:41 -07:00
slab.h mm/slab.h: fix argument order in cache_from_obj's error message 2015-09-04 16:54:41 -07:00
slob.c slab: infrastructure for bulk object allocation and freeing 2015-09-04 16:54:41 -07:00
slub.c mm/slub: don't wait for high-order page allocation 2015-09-04 16:54:41 -07:00
sparse-vmemmap.c mm/sparse: use memblock apis for early memory allocations 2014-01-21 16:19:47 -08:00
sparse.c mm: use macros from compiler.h instead of __attribute__((...)) 2014-04-07 16:35:54 -07:00
swap_cgroup.c mm: page_cgroup: rename file to mm/swap_cgroup.c 2014-12-10 17:41:09 -08:00
swap_state.c mm: remove rest of ACCESS_ONCE() usages 2015-04-15 16:35:18 -07:00
swap.c mm: drop bogus VM_BUG_ON_PAGE assert in put_page() codepath 2015-06-24 17:49:42 -07:00
swapfile.c mm: /proc/pid/smaps:: show proportional swap share of the mapping 2015-09-08 15:35:28 -07:00
truncate.c memcg: add per cgroup dirty page accounting 2015-06-02 08:33:33 -06:00
userfaultfd.c userfaultfd: avoid mmap_sem read recursion in mcopy_atomic 2015-09-04 16:54:41 -07:00
util.c mm: uninline and cleanup page-mapping related helpers 2015-04-15 16:35:19 -07:00
vmacache.c mm,vmacache: count number of system-wide flushes 2014-12-13 12:42:48 -08:00
vmalloc.c mm/vmalloc: get rid of dirty bitmap inside vmap_block structure 2015-04-15 16:35:18 -07:00
vmpressure.c mm/vmpressure.c: fix race in vmpressure_work_fn() 2014-12-02 17:32:07 -08:00
vmscan.c memcg: export struct mem_cgroup 2015-09-08 15:35:28 -07:00
vmstat.c vmstat: Reduce time interval to stat update on idle cpu 2015-02-11 17:06:07 -08:00
workingset.c list_lru: add helpers to isolate items 2015-02-12 18:54:10 -08:00
zbud.c zpool: remove zpool_evict() 2015-06-25 17:00:37 -07:00
zpool.c zpool: remove zpool_evict() 2015-06-25 17:00:37 -07:00
zsmalloc.c zpool: remove zpool_evict() 2015-06-25 17:00:37 -07:00
zswap.c zswap: runtime enable/disable 2015-06-25 17:00:37 -07:00