linux/mm
David Rientjes 028fec414d mempolicy: support optional mode flags
With the evolution of mempolicies, it is necessary to support mempolicy mode
flags that specify how the policy shall behave in certain circumstances.  The
most immediate need for mode flag support is to suppress remapping the
nodemask of a policy at the time of rebind.

Both the mempolicy mode and flags are passed by the user in the 'int policy'
formal of either the set_mempolicy() or mbind() syscall.  A new constant,
MPOL_MODE_FLAGS, represents the union of legal optional flags that may be
passed as part of this int.  Mempolicies that include illegal flags as part of
their policy are rejected as invalid.

An additional member to struct mempolicy is added to support the mode flags:

	struct mempolicy {
		...
		unsigned short policy;
		unsigned short flags;
	};

The splitting of the 'int' actual passed by the user is done in
sys_set_mempolicy() and sys_mbind() for their respective syscalls.  This is
done by intersecting the actual with MPOL_MODE_FLAGS, rejecting the syscall if
there are additional flags, and storing the result in the new 'flags' member of
struct mempolicy.  The intersection of the actual with ~MPOL_MODE_FLAGS is
stored in the 'policy' member of the struct and all current users of
pol->policy remain unchanged.
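
The split described above can be sketched in plain C roughly as follows.  This
is an illustrative userspace sketch, not the kernel code: every ex_/EX_ name
and every numeric value (including the flag bit) is a hypothetical stand-in,
and rejecting leftover bits through a range check on the mode is only one
possible way to implement the rejection.

```c
#include <errno.h>

/* Hypothetical modes and flag bit, for illustration only. */
enum {
	EX_MPOL_DEFAULT,
	EX_MPOL_PREFERRED,
	EX_MPOL_BIND,
	EX_MPOL_INTERLEAVE,
	EX_MPOL_MAX
};
#define EX_FLAG_A	(1 << 15)	/* assumed optional mode flag */
#define EX_MODE_FLAGS	(EX_FLAG_A)	/* union of all legal optional flags */

struct ex_policy {
	unsigned short policy;	/* mode only */
	unsigned short flags;	/* optional mode flags only */
};

/* Split the user's int into mode and flags; reject anything left over. */
static int ex_split_policy(int arg, struct ex_policy *out)
{
	unsigned short flags = arg & EX_MODE_FLAGS;
	int mode = arg & ~EX_MODE_FLAGS;

	/* Unknown flag bits survive the mask and push mode out of range. */
	if (mode < 0 || mode >= EX_MPOL_MAX)
		return -EINVAL;

	out->policy = (unsigned short)mode;
	out->flags = flags;
	return 0;
}

/* Recombine mode and flags, as a get_mempolicy()-style query would. */
static int ex_join_policy(const struct ex_policy *pol)
{
	return pol->policy | pol->flags;
}
```

With these stand-ins, ex_split_policy(EX_MPOL_BIND | EX_FLAG_A, &p) stores
EX_MPOL_BIND in p.policy and EX_FLAG_A in p.flags, while an argument carrying
a bit outside EX_MODE_FLAGS (for example 1 << 14) is rejected with -EINVAL.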

The union of the policy mode and optional mode flags is passed back to the
user in get_mempolicy().

This combination of mode and flags within the same actual does not break
userspace code that relies on get_mempolicy(&policy, ...) and either

	switch (policy) {
	case MPOL_BIND:
		...
	case MPOL_INTERLEAVE:
		...
	};

statements or

	if (policy == MPOL_INTERLEAVE) {
		...
	}

statements.  These statements stop working only if the application itself
starts passing optional mode flags to set_mempolicy() or mbind().  An
application that does start using optional mode flags will need to mask the
optional flags off the policy in switch and conditional statements that only
test mode.
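
A userspace application that has opted in to mode flags might mask them off
along these lines.  This is a minimal sketch under assumptions: the ex_/EX_
names and values are hypothetical stand-ins for the real definitions, and
'policy' stands for the int returned through get_mempolicy().

```c
/* Hypothetical modes and flag bit, for illustration only. */
enum {
	EX_MPOL_DEFAULT,
	EX_MPOL_PREFERRED,
	EX_MPOL_BIND,
	EX_MPOL_INTERLEAVE
};
#define EX_FLAG_A	(1 << 15)	/* assumed optional mode flag */
#define EX_MODE_FLAGS	(EX_FLAG_A)	/* union of all legal optional flags */

/* Strip the optional flags so that only the mode remains. */
static int ex_mode_only(int policy)
{
	return policy & ~EX_MODE_FLAGS;
}

/* A mode test that keeps working when flags are set in 'policy'. */
static int ex_is_interleave(int policy)
{
	return ex_mode_only(policy) == EX_MPOL_INTERLEAVE;
}
```

A direct comparison such as policy == EX_MPOL_INTERLEAVE is false once
EX_FLAG_A is set in policy, whereas ex_is_interleave(policy) still reports the
mode correctly.  A switch statement can be kept intact the same way by
switching on ex_mode_only(policy).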

An additional member is also added to struct shmem_sb_info to store the
optional mode flags.

[hugh@veritas.com: shmem mpol: fix build warning]
Cc: Paul Jackson <pj@sgi.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-28 08:58:19 -07:00
..
allocpercpu.c cpumask: Cleanup more uses of CPU_MASK and NODE_MASK 2008-04-19 19:44:58 +02:00
backing-dev.c mm/backing-dev.c: fix percpu_counter_destroy call bug in bdi_init 2007-12-05 09:21:18 -08:00
bootmem.c mm: allow reserve_bootmem() cross nodes 2008-04-26 22:51:08 +02:00
bounce.c block: Initial support for data-less (or empty) barrier support 2007-10-16 11:03:56 +02:00
dmapool.c pool: Improve memory usage for devices which can't cross boundaries 2007-12-04 10:39:58 -05:00
fadvise.c check ADVICE of fadvise64_64 even if get_xip_page is given 2008-02-05 09:44:19 -08:00
filemap_xip.c Use pgoff_t instead of unsigned long 2008-02-08 09:22:32 -08:00
filemap.c mm: fix various kernel-doc comments 2008-03-19 18:53:35 -07:00
fremap.c mm: fix various kernel-doc comments 2008-03-19 18:53:35 -07:00
highmem.c mm: highmem kernel-doc additions 2008-03-19 18:53:35 -07:00
hugetlb.c hugetlb: decrease hugetlb_lock cycling in gather_surplus_huge_pages 2008-04-28 08:58:19 -07:00
internal.h Solve section mismatch for free_area_init_core. 2008-02-23 17:13:24 -08:00
Kconfig sh: Bump number of quicklists for SH-5. 2008-01-28 13:18:55 +09:00
maccess.c kgdb: fix optional arch functions and probe_kernel_* 2008-04-17 20:05:39 +02:00
madvise.c speed up madvise_need_mmap_write() usage 2007-07-16 09:05:36 -07:00
Makefile uaccess: add probe_kernel_write() 2008-04-17 20:05:36 +02:00
memcontrol.c memcg: fix node_state handling 2008-04-08 18:25:53 -07:00
memory_hotplug.c hotplug-memory: make online_page() common 2008-04-28 08:58:17 -07:00
memory.c mm: remove nopage 2008-04-28 08:58:18 -07:00
mempolicy.c mempolicy: support optional mode flags 2008-04-28 08:58:19 -07:00
mempool.c spelling fixes: mm/ 2007-10-20 01:27:18 +02:00
migrate.c memcg: fix VM_BUG_ON from page migration 2008-03-04 16:35:14 -08:00
mincore.c mm: remove nopage 2008-04-28 08:58:18 -07:00
mlock.c do not limit locked memory when RLIMIT_MEMLOCK is RLIM_INFINITY 2007-07-16 09:05:37 -07:00
mmap.c mmap_region: cleanup the final vma_merge() related code 2008-04-28 08:58:18 -07:00
mmzone.c mm: filter based on a nodemask as well as a gfp_mask 2008-04-28 08:58:19 -07:00
mprotect.c fix mprotect vma_wants_writenotify prot 2007-10-23 08:32:06 -07:00
mremap.c sparse pointer use of zero as null 2007-10-18 14:37:31 -07:00
msync.c Detach sched.h from mm.h 2007-05-21 09:18:19 -07:00
nommu.c nommu: add new vmalloc_user() and remap_vmalloc_range() interfaces. 2008-02-05 09:44:21 -08:00
oom_kill.c mm: have zonelist contains structs with both a zone pointer and zone_idx 2008-04-28 08:58:18 -07:00
page_alloc.c mm: filter based on a nodemask as well as a gfp_mask 2008-04-28 08:58:19 -07:00
page_io.c mm: fix PageUptodate data race 2008-02-05 09:44:19 -08:00
page_isolation.c memory hotremove: unset migrate type "ISOLATE" after removal 2007-11-14 18:45:38 -08:00
page-writeback.c writeback: speed up writeback of big dirty files 2008-02-05 09:44:19 -08:00
pagewalk.c mm: fix possible off-by-one in walk_pte_range() 2008-04-28 08:58:16 -07:00
pdflush.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/juhl/trivial 2008-04-21 16:36:46 -07:00
prio_tree.c spelling fixes: mm/ 2007-10-20 01:27:18 +02:00
quicklist.c quicklists: Only consider memory that can be used with GFP_KERNEL 2008-01-14 08:52:22 -08:00
readahead.c mm/readahead: fix kernel-doc notation 2008-03-19 18:53:37 -07:00
rmap.c mm: remove nopage 2008-04-28 08:58:18 -07:00
shmem_acl.c [PATCH] Fix typos in mm/shmem_acl.c 2006-10-11 11:14:23 -07:00
shmem.c mempolicy: support optional mode flags 2008-04-28 08:58:19 -07:00
slab.c mm: move cache_line_size() to <linux/cache.h> 2008-04-28 08:58:19 -07:00
slob.c slob: reduce external fragmentation by using three free lists 2008-02-05 09:44:19 -08:00
slub.c mm: move cache_line_size() to <linux/cache.h> 2008-04-28 08:58:19 -07:00
sparse-vmemmap.c NULL noise: fs/*, mm/*, kernel/* 2008-03-30 14:18:41 -07:00
sparse.c hotplug memory remove: generic __remove_pages() support 2008-04-28 08:58:17 -07:00
swap_state.c mm: fix various kernel-doc comments 2008-03-19 18:53:35 -07:00
swap.c mm: fix various kernel-doc comments 2008-03-19 18:53:35 -07:00
swapfile.c mm: try both endianess when checking for endianess 2008-04-28 08:58:19 -07:00
thrash.c Bug in mm/thrash.c function grab_swap_token() 2007-05-11 08:29:32 -07:00
tiny-shmem.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2008-03-25 08:57:47 -07:00
truncate.c fix invalidate_inode_pages2_range() to not clear ret 2008-04-28 08:58:18 -07:00
util.c fix mm/util.c:krealloc() 2007-11-14 18:45:41 -08:00
vmalloc.c mm: fix various kernel-doc comments 2008-03-19 18:53:35 -07:00
vmscan.c mm: have zonelist contains structs with both a zone pointer and zone_idx 2008-04-28 08:58:18 -07:00
vmstat.c mm: remember what the preferred zone is for zone_statistics 2008-04-28 08:58:18 -07:00