Documentation/protection-keys: add AArch64 to documentation

As POE support was recently added, update the documentation.

Also note that kernel threads have a default protection key register value.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20241001133618.1547996-3-joey.gouly@arm.com
[will: Adjusted wording based on feedback from Kevin]
Signed-off-by: Will Deacon <will@kernel.org>

Joey Gouly 2024-10-01 14:36:18 +01:00 committed by Will Deacon
parent e3e8527133
commit f56d8d2389

@@ -12,7 +12,10 @@ Pkeys Userspace (PKU) is a feature which can be found on:
  * Intel server CPUs, Skylake and later
  * Intel client CPUs, Tiger Lake (11th Gen Core) and later
  * Future AMD CPUs
+ * arm64 CPUs implementing the Permission Overlay Extension (FEAT_S1POE)

+x86_64
+======
 Pkeys work by dedicating 4 previously Reserved bits in each page table entry to
 a "protection key", giving 16 possible keys.
@@ -28,6 +31,22 @@ register. The feature is only available in 64-bit mode, even though there is
 theoretically space in the PAE PTEs. These permissions are enforced on data
 access only and have no effect on instruction fetches.
+
+arm64
+=====
+
+Pkeys use 3 bits in each page table entry, to encode a "protection key index",
+giving 8 possible keys.
+
+Protections for each key are defined with a per-CPU user-writable system
+register (POR_EL0). This is a 64-bit register encoding read, write and execute
+overlay permissions for each protection key index.
+
+Being a CPU register, POR_EL0 is inherently thread-local, potentially giving
+each thread a different set of protections from every other thread.
+
+Unlike x86_64, the protection key permissions also apply to instruction
+fetches.

 Syscalls
 ========
@@ -38,11 +57,10 @@ There are 3 system calls which directly interact with pkeys::
         int pkey_mprotect(unsigned long start, size_t len,
                           unsigned long prot, int pkey);

-Before a pkey can be used, it must first be allocated with
-pkey_alloc(). An application calls the WRPKRU instruction
-directly in order to change access permissions to memory covered
-with a key. In this example WRPKRU is wrapped by a C function
-called pkey_set().
+Before a pkey can be used, it must first be allocated with pkey_alloc(). An
+application writes to the architecture specific CPU register directly in order
+to change access permissions to memory covered with a key. In this example
+this is wrapped by a C function called pkey_set().
 ::

         int real_prot = PROT_READ|PROT_WRITE;
@@ -64,9 +82,9 @@ is no longer in use::
         munmap(ptr, PAGE_SIZE);
         pkey_free(pkey);

-.. note:: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
-          An example implementation can be found in
-          tools/testing/selftests/x86/protection_keys.c.
+.. note:: pkey_set() is a wrapper around writing to the CPU register.
+          Example implementations can be found in
+          tools/testing/selftests/mm/pkey-{arm64,powerpc,x86}.h

 Behavior
 ========
@@ -96,3 +114,7 @@ with a read()::
 The kernel will send a SIGSEGV in both cases, but si_code will be set
 to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when
 the plain mprotect() permissions are violated.
+
+Note that kernel accesses from a kthread (such as io_uring) will use a default
+value for the protection key register and so will not be consistent with
+userspace's value of the register or mprotect().
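
As an illustration of the arm64 text added by this patch, the sketch below shows how a per-key overlay permission might be packed into a 64-bit POR_EL0-style value. The patch only says that the register encodes read, write and execute overlay permissions per key index. The 4-bit field width, the bit assignments (bit 0 read, bit 1 execute, bit 2 write) and the helper names por_set_perm() and por_get_perm() are assumptions made for this example. The code does arithmetic on a plain integer and never touches the real register.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption (not stated in the patch): each key index owns a 4-bit field. */
#define POR_BITS_PER_PKEY 4
#define POR_PERM_MASK     0xfULL

/* Assumption: overlay bits are read = bit 0, execute = bit 1, write = bit 2. */
#define POE_NONE 0x0ULL
#define POE_R    0x1ULL
#define POE_X    0x2ULL
#define POE_W    0x4ULL

/* From the patch: 3 PTE bits give 8 possible keys. */
#define NR_PKEYS 8

/* Hypothetical helper: return 'por' with the field for 'pkey' set to 'perm'. */
static inline uint64_t por_set_perm(uint64_t por, unsigned int pkey,
                                    uint64_t perm)
{
    assert(pkey < NR_PKEYS);
    assert((perm & ~POR_PERM_MASK) == 0);

    unsigned int shift = pkey * POR_BITS_PER_PKEY;

    por &= ~(POR_PERM_MASK << shift);   /* clear the old permissions */
    por |= perm << shift;               /* install the new ones */
    return por;
}

/* Hypothetical helper: extract the permission field for 'pkey'. */
static inline uint64_t por_get_perm(uint64_t por, unsigned int pkey)
{
    assert(pkey < NR_PKEYS);
    return (por >> (pkey * POR_BITS_PER_PKEY)) & POR_PERM_MASK;
}
```

Under these assumptions, por_set_perm(0, 1, POE_R | POE_W) returns 0x50. A real pkey_set() would read the current register value, apply a change like this, and write the result back to POR_EL0; the selftest headers named in the patch are the place to check the actual encoding.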