linux/arch
Dave Hansen e8c24d3a23 x86/pkeys: Allocation/free syscalls
This patch adds two new system calls:

	int pkey_alloc(unsigned long flags, unsigned long init_access_rights);
	int pkey_free(int pkey);

These implement an "allocator" for the protection keys
themselves, which can be thought of as analogous to the allocator
that the kernel has for file descriptors.  The kernel tracks
which numbers are in use, and only allows operations on keys that
are valid.  A key which was not obtained by pkey_alloc() may not,
for instance, be passed to pkey_mprotect().
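The file-descriptor analogy can be made concrete with a small
userspace model.  This is only an illustrative sketch, not the
kernel's code: the names (struct pkey_map, model_pkey_alloc() and
friends) are hypothetical, and the sketch assumes 16 keys with
key 0 permanently in use as the default key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NR_PKEYS 16	/* assumption: 16 keys, as on x86 */

/* One bit per key; a set bit means "allocated". */
struct pkey_map {
	uint16_t allocated;
};

/* Start with only key 0 (the assumed default key) in use. */
static void pkey_map_init(struct pkey_map *m)
{
	assert(m != NULL);
	m->allocated = 0x1;
}

/* Is 'pkey' a number this model has handed out (or key 0)? */
static int model_pkey_is_allocated(const struct pkey_map *m, int pkey)
{
	assert(m != NULL);
	if (pkey < 0 || pkey >= MODEL_NR_PKEYS)
		return 0;
	return (m->allocated >> pkey) & 1;
}

/* Hand out the lowest free key, or -1 when none are left. */
static int model_pkey_alloc(struct pkey_map *m)
{
	assert(m != NULL);
	for (int k = 0; k < MODEL_NR_PKEYS; k++) {
		if (!model_pkey_is_allocated(m, k)) {
			m->allocated |= (uint16_t)(1u << k);
			return k;
		}
	}
	return -1;
}

/* Free a key; refuses key 0 and keys that were never allocated. */
static int model_pkey_free(struct pkey_map *m, int pkey)
{
	assert(m != NULL);
	if (pkey <= 0 || !model_pkey_is_allocated(m, pkey))
		return -1;
	m->allocated &= (uint16_t)~(1u << pkey);
	return 0;
}
```

An operation such as pkey_mprotect() would, in this model, check
model_pkey_is_allocated() first and reject any key that fails it.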

These system calls are also very important given the kernel's use
of pkeys to implement execute-only support.  They help ensure
that userspace can never assume that it has control of a key
unless it first asks the kernel.  The kernel does not promise to
preserve PKRU (rights register) contents except for allocated
pkeys.

The 'init_access_rights' argument to pkey_alloc() specifies the
rights that will be established for the returned pkey.  For
instance:

	pkey = pkey_alloc(flags, PKEY_DENY_WRITE);

will allocate 'pkey' and also set the bits in PKRU[1] such that
writing to memory tagged with 'pkey' is already denied.
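As a rough sketch of what "sets the bits in PKRU" means, the
helpers below compute a new PKRU value for a given key.  They
assume the x86 layout of two bits per key (access-disable in the
low bit, write-disable in the high bit).  The SKETCH_* constants
and function names are hypothetical and defined locally; the
sketch only does the arithmetic and never touches the real
register.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed rights values, defined locally for this sketch. */
#define SKETCH_DENY_ACCESS	0x1u
#define SKETCH_DENY_WRITE	0x2u
#define SKETCH_RIGHTS_MASK	(SKETCH_DENY_ACCESS | SKETCH_DENY_WRITE)
#define SKETCH_NR_PKEYS		16
#define SKETCH_BITS_PER_PKEY	2

/* Return 'pkru' with the two bits for 'pkey' replaced by 'rights'. */
static uint32_t pkru_with_rights(uint32_t pkru, int pkey, uint32_t rights)
{
	assert(pkey >= 0 && pkey < SKETCH_NR_PKEYS);
	assert((rights & ~SKETCH_RIGHTS_MASK) == 0);

	unsigned int shift = (unsigned int)pkey * SKETCH_BITS_PER_PKEY;

	pkru &= ~(SKETCH_RIGHTS_MASK << shift);	/* clear old rights */
	pkru |= rights << shift;		/* install new rights */
	return pkru;
}

/* Extract the two rights bits that 'pkru' holds for 'pkey'. */
static uint32_t pkru_rights_of(uint32_t pkru, int pkey)
{
	assert(pkey >= 0 && pkey < SKETCH_NR_PKEYS);

	unsigned int shift = (unsigned int)pkey * SKETCH_BITS_PER_PKEY;

	return (pkru >> shift) & SKETCH_RIGHTS_MASK;
}
```

Under these assumptions, denying writes for key 1 in an otherwise
all-zero register sets bit 3, so
pkru_with_rights(0, 1, SKETCH_DENY_WRITE) evaluates to 0x8.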

The kernel does not prevent pkey_free() from successfully freeing
in-use pkeys (those still assigned to a memory range by
pkey_mprotect()).  It would be expensive to implement the checks
for this, so we instead say, "Just don't do it" since sane
software will never do it anyway.

Any piece of userspace calling pkey_alloc() needs to be prepared
for it to fail.  Why?  pkey_alloc() returns the same error code
(ENOSPC) both when no free pkeys remain and when pkeys are
unsupported.  They can be unsupported for a whole host of reasons,
so apps must be prepared for this.  Also, libraries or LD_PRELOADs
might steal keys before an application gets access to them.
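A caller that tolerates failure might look roughly like the
sketch below.  It assumes the installed kernel headers define
SYS_pkey_alloc and SYS_pkey_free, and it goes through syscall()
rather than a libc wrapper.  The helper names try_pkey_alloc()
and release_pkey() are hypothetical.  Any error is treated the
same way: the caller gets -1 and is expected to carry on without
a protection key.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for syscall() */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Try to allocate a pkey with the given initial rights.
 * Returns the key (>= 0) on success.  On failure returns -1 and
 * stores the errno value in *err_out; the reason (no keys left,
 * no hardware or kernel support, ...) is deliberately not
 * distinguished, since the caller must cope either way.
 */
static int try_pkey_alloc(unsigned long init_access_rights, int *err_out)
{
	assert(err_out != NULL);
	*err_out = 0;

	/* The 'flags' argument is passed as 0 in this sketch. */
	long ret = syscall(SYS_pkey_alloc, 0UL, init_access_rights);

	if (ret < 0) {
		*err_out = errno;
		return -1;
	}
	return (int)ret;
}

/* Give a key back; a no-op for the -1 "no key" value. */
static void release_pkey(int pkey)
{
	if (pkey >= 0)
		syscall(SYS_pkey_free, pkey);
}
```

An application would call try_pkey_alloc() once at startup and,
on -1, fall back to plain mprotect()-style protection instead of
treating the failure as fatal.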

This allocation mechanism could instead be implemented in
userspace.  But even then, we would still need additional
user/kernel interfaces to tell userspace which keys are being
used by the kernel internally (such as for execute-only
mappings).  Having the kernel provide this facility completely
removes the need for these additional interfaces, or for having
an implementation of this in userspace at all.

Note that we have to make changes to all of the architectures
that do not use mman-common.h because we use the new
PKEY_DENY_ACCESS/WRITE macros in arch-independent code.

1. PKRU is the Protection Key Rights User register.  It is a
   usermode-accessible register that controls, for each
   individual pkey, whether writes and/or all accesses are
   allowed or denied.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: linux-arch@vger.kernel.org
Cc: Dave Hansen <dave@sr71.net>
Cc: arnd@arndb.de
Cc: linux-api@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: luto@kernel.org
Cc: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/20160729163015.444FE75F@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-09 13:02:27 +02:00
..
alpha x86/pkeys: Allocation/free syscalls 2016-09-09 13:02:27 +02:00
arc ARC: export __udivdi3 for modules 2016-08-19 14:09:33 -07:00
arm Staging/IIO driver fixes for 4.8-rc5 2016-09-03 11:33:33 -07:00
arm64 - arm64 fix: debug exception unmasking on the CPU resume path 2016-09-03 12:31:37 -07:00
avr32 dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
blackfin net: smc91x: fix SMC accesses 2016-08-28 23:44:55 -04:00
c6x dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
cris dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
frv RTC for 4.8 2016-08-05 09:48:22 -04:00
h8300 h8300: Add missing include file to asm/io.h 2016-08-13 08:53:56 -07:00
hexagon dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
ia64 Implements HARDENED_USERCOPY verification of copy_to_user/copy_from_user 2016-08-08 14:48:14 -07:00
m32r mm: do not pass mm_struct into handle_mm_fault 2016-07-26 16:19:19 -07:00
m68k m68knommu: fix user a5 register being overwritten 2016-08-08 12:38:47 +10:00
metag metag: Drop show_mem() from mem_init() 2016-08-09 13:41:30 +01:00
microblaze dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
mips x86/pkeys: Allocation/free syscalls 2016-09-09 13:02:27 +02:00
mn10300 RTC for 4.8 2016-08-05 09:48:22 -04:00
nios2 dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
openrisc dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
parisc x86/pkeys: Allocation/free syscalls 2016-09-09 13:02:27 +02:00
powerpc powerpc: signals: Discard transaction state from signal frames 2016-08-29 12:48:40 +10:00
s390 mm/usercopy: get rid of CONFIG_DEBUG_STRICT_USER_COPY_CHECKS 2016-08-30 10:10:21 -07:00
score treewide: replace obsolete _refok by __ref 2016-08-02 17:31:41 -04:00
sh These changes improve device tree support (including builtin DTB), add 2016-08-06 09:00:05 -04:00
sparc Implements HARDENED_USERCOPY verification of copy_to_user/copy_from_user 2016-08-08 14:48:14 -07:00
tile mm/usercopy: get rid of CONFIG_DEBUG_STRICT_USER_COPY_CHECKS 2016-08-30 10:10:21 -07:00
um um: Don't discard .text.exit section 2016-08-23 23:16:16 +02:00
unicore32 unicore32: mm: Add missing parameter to arch_vma_access_permitted 2016-08-13 08:53:18 -07:00
x86 x86/pkeys: Allocation/free syscalls 2016-09-09 13:02:27 +02:00
xtensa x86/pkeys: Allocation/free syscalls 2016-09-09 13:02:27 +02:00
.gitignore
Kconfig Implements HARDENED_USERCOPY verification of copy_to_user/copy_from_user 2016-08-08 14:48:14 -07:00