linux/mm
Mel Gorman 62c230bc17 mm: add support for a filesystem to activate swap files and use direct_IO for writing swap pages
Currently swapfiles are managed entirely by the core VM by using ->bmap to
allocate space and write to the blocks directly.  This effectively ensures
that the underlying blocks are allocated and avoids the need for the swap
subsystem to locate what physical blocks store offsets within a file.

If the swap subsystem is to use the filesystem information to locate the
blocks, it is critical that information such as block groups, block
bitmaps and the block descriptor table that map the swap file were
resident in memory.  This patch adds address_space_operations that the VM
can call when activating or deactivating swap backed by a file.

  int swap_activate(struct file *);
  int swap_deactivate(struct file *);

The ->swap_activate() method is used to communicate to the file that the
VM relies on it, and the address_space should take adequate measures such
as reserving space in the underlying device, reserving memory for mempools
and pinning information such as the block descriptor table in memory.  The
->swap_deactivate() method is called on sys_swapoff() if ->swap_activate()
returned success.

After a successful swapfile ->swap_activate, the swapfile is marked
SWP_FILE and swapper_space.a_ops will proxy to
sis->swap_file->f_mappings->a_ops using ->direct_io to write swapcache
pages and ->readpage to read.

It is perfectly possible that direct_IO be used to read the swap pages but
it is an unnecessary complication.  Similarly, it is possible that
->writepage be used instead of direct_io to write the pages but filesystem
developers have stated that calling writepage from the VM is undesirable
for a variety of reasons and using direct_IO opens up the possibility of
writing back batches of swap pages in the future.

[a.p.zijlstra@chello.nl: Original patch]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Eric Paris <eparis@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Neil Brown <neilb@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-31 18:42:47 -07:00
..
backing-dev.c mm: prepare for removal of obsolete /proc/sys/vm/nr_pdflush_threads 2012-07-31 18:42:40 -07:00
bootmem.c bootmem: make ___alloc_bootmem_node_nopanic() really nopanic 2012-07-17 16:21:29 -07:00
bounce.c bounce: allow use of bounce pool via config option 2012-07-18 16:40:35 -04:00
cleancache.c ->encode_fh() API change 2012-05-29 23:28:33 -04:00
compaction.c mm: have order > 0 compaction start off where it left 2012-07-31 18:42:43 -07:00
debug-pagealloc.c
dmapool.c
fadvise.c mm, fadvise: don't return -EINVAL when filesystem cannot implement fadvise() 2012-07-31 18:42:42 -07:00
failslab.c
filemap_xip.c fs: introduce inode operation ->update_time 2012-06-01 12:07:25 -04:00
filemap.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-06-01 10:34:35 -07:00
fremap.c
frontswap.c mm/frontswap: cleanup doc and comment error 2012-07-23 11:16:20 -04:00
highmem.c
huge_memory.c mm/memcg: apply add/del_page to lruvec 2012-05-29 16:22:28 -07:00
hugetlb_cgroup.c hugetlb/cgroup: remove exclude and wakeup rmdir calls from migrate 2012-07-31 18:42:41 -07:00
hugetlb.c hugetlb/cgroup: assign the page hugetlb cgroup when we move the page to active list. 2012-07-31 18:42:41 -07:00
hwpoison-inject.c memcg: rename config variables 2012-07-31 18:42:43 -07:00
init-mm.c
internal.h netvm: allow skb allocation to use PFMEMALLOC reserves 2012-07-31 18:42:46 -07:00
Kconfig mm: factor out memory isolate functions 2012-07-31 18:42:45 -07:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c
ksm.c ksm: cleanup: introduce find_mergeable_vma() 2012-03-21 17:54:59 -07:00
maccess.c
madvise.c mm: Hold a file reference in madvise_remove 2012-07-06 10:34:38 -07:00
Makefile mm: factor out memory isolate functions 2012-07-31 18:42:45 -07:00
memblock.c mm/memblock.c:memblock_double_array(): cosmetic cleanups 2012-07-31 18:42:41 -07:00
memcontrol.c mm, memcg: move all oom handling to memcontrol.c 2012-07-31 18:42:45 -07:00
memory_hotplug.c mm/hotplug: free zone->pageset when a zone becomes empty 2012-07-31 18:42:44 -07:00
memory-failure.c memcg: rename config variables 2012-07-31 18:42:43 -07:00
memory.c mm/memory.c:print_vma_addr(): call up_read(&mm->mmap_sem) directly 2012-07-31 18:42:43 -07:00
mempolicy.c Merge branch 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux 2012-07-30 11:32:24 -07:00
mempool.c
migrate.c hugetlb/cgroup: migrate hugetlb cgroup info from oldpage to new page during migration 2012-07-31 18:42:41 -07:00
mincore.c mm: thp: fix pmd_bad() triggering in code paths holding mmap_sem read mode 2012-03-21 17:54:54 -07:00
mlock.c vm: avoid using find_vma_prev() unnecessarily 2012-03-06 18:23:36 -08:00
mm_init.c
mmap.c mm: account the total_vm in the vm_stat_account() 2012-07-31 18:42:39 -07:00
mmu_context.c mm, counters: remove task argument to sync_mm_rss() and __sync_task_rss_stat() 2012-03-21 17:54:59 -07:00
mmu_notifier.c
mmzone.c memcg: rename config variables 2012-07-31 18:42:43 -07:00
mprotect.c Merge branch 'akpm' (Andrew's patch-bomb) 2012-03-22 09:04:48 -07:00
mremap.c mm: account the total_vm in the vm_stat_account() 2012-07-31 18:42:39 -07:00
msync.c
nobootmem.c memblock: free allocated memblock_reserved_regions later 2012-07-11 16:04:50 -07:00
nommu.c nommu: fix compilation of nommu.c 2012-06-04 17:17:31 -04:00
oom_kill.c mm, memcg: move all oom handling to memcontrol.c 2012-07-31 18:42:45 -07:00
page_alloc.c mm: throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage 2012-07-31 18:42:46 -07:00
page_cgroup.c memcg: rename config variables 2012-07-31 18:42:43 -07:00
page_io.c mm: add support for a filesystem to activate swap files and use direct_IO for writing swap pages 2012-07-31 18:42:47 -07:00
page_isolation.c memory-hotplug: fix kswapd looping forever problem 2012-07-31 18:42:45 -07:00
page-writeback.c writeback: Fix some comment errors 2012-06-09 19:54:47 +08:00
pagewalk.c mm: fix kernel-doc warnings 2012-06-20 14:39:36 -07:00
percpu-km.c
percpu-vm.c mm: fix kernel-doc warnings 2012-06-20 14:39:36 -07:00
percpu.c kmemleak: Fix the kmemleak tracking of the percpu areas with !SMP 2012-05-09 10:13:29 -07:00
pgtable-generic.c arch/tile: allow building Linux with transparent huge pages enabled 2012-05-25 12:48:21 -04:00
prio_tree.c
process_vm_access.c aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector() 2012-05-31 17:49:32 -07:00
quicklist.c
readahead.c mm: move readahead syscall to mm/readahead.c 2012-05-29 16:22:23 -07:00
rmap.c mm: remove swap token code 2012-05-29 16:22:19 -07:00
shmem.c don't pass nameidata to ->create() 2012-07-14 16:34:47 +04:00
slab_common.c mm: Fix build warning in kmem_cache_create() 2012-07-30 13:15:40 +03:00
slab.c mm: micro-optimise slab to avoid a function call 2012-07-31 18:42:46 -07:00
slab.h mm, sl[aou]b: Use a common mutex definition 2012-07-09 12:13:41 +03:00
slob.c slob: Fix early boot kernel crash 2012-07-12 10:13:22 +03:00
slub.c mm: slub: optimise the SLUB fast path to avoid pfmemalloc checks 2012-07-31 18:42:45 -07:00
sparse-vmemmap.c
sparse.c mm: setup pageblock_order before it's used by sparsemem 2012-07-31 18:42:43 -07:00
swap_state.c mm: add support for a filesystem to activate swap files and use direct_IO for writing swap pages 2012-07-31 18:42:47 -07:00
swap.c mm: add get_kernel_page[s] for pinning of kernel addresses for I/O 2012-07-31 18:42:47 -07:00
swapfile.c mm: add support for a filesystem to activate swap files and use direct_IO for writing swap pages 2012-07-31 18:42:47 -07:00
truncate.c mm/fs: remove truncate_range 2012-05-29 16:22:23 -07:00
util.c new helper: vm_mmap_pgoff() 2012-06-01 10:37:18 -04:00
vmalloc.c mm: make vb_alloc() more foolproof 2012-07-31 18:42:39 -07:00
vmscan.c mm: account for the number of times direct reclaimers get throttled 2012-07-31 18:42:46 -07:00
vmstat.c mm: account for the number of times direct reclaimers get throttled 2012-07-31 18:42:46 -07:00