Merge branches 'for-next/sme', 'for-next/stacktrace', 'for-next/fault-in-subpage', 'for-next/misc', 'for-next/ftrace' and 'for-next/crashkernel', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf:
  perf/arm-cmn: Decode CAL devices properly in debugfs
  perf/arm-cmn: Fix filter_sel lookup
  perf/marvell_cn10k: Fix tad_pmu_event_init() to check pmu type first
  drivers/perf: hisi: Add Support for CPA PMU
  drivers/perf: hisi: Associate PMUs in SICL with CPUs online
  drivers/perf: arm_spe: Expose saturating counter to 16-bit
  perf/arm-cmn: Add CMN-700 support
  perf/arm-cmn: Refactor occupancy filter selector
  perf/arm-cmn: Add CMN-650 support
  dt-bindings: perf: arm-cmn: Add CMN-650 and CMN-700
  perf: check return value of armpmu_request_irq()
  perf: RISC-V: Remove non-kernel-doc ** comments

* for-next/sme: (30 commits)
  : Scalable Matrix Extensions support.
  arm64/sve: Move sve_free() into SVE code section
  arm64/sve: Make kernel FPU protection RT friendly
  arm64/sve: Delay freeing memory in fpsimd_flush_thread()
  arm64/sme: More sensibly define the size for the ZA register set
  arm64/sme: Fix NULL check after kzalloc
  arm64/sme: Add ID_AA64SMFR0_EL1 to __read_sysreg_by_encoding()
  arm64/sme: Provide Kconfig for SME
  KVM: arm64: Handle SME host state when running guests
  KVM: arm64: Trap SME usage in guest
  KVM: arm64: Hide SME system registers from guests
  arm64/sme: Save and restore streaming mode over EFI runtime calls
  arm64/sme: Disable streaming mode and ZA when flushing CPU state
  arm64/sme: Add ptrace support for ZA
  arm64/sme: Implement ptrace support for streaming mode SVE registers
  arm64/sme: Implement ZA signal handling
  arm64/sme: Implement streaming SVE signal handling
  arm64/sme: Disable ZA and streaming mode when handling signals
  arm64/sme: Implement traps and syscall handling for SME
  arm64/sme: Implement ZA context switching
  arm64/sme: Implement streaming SVE context switching
  ...

* for-next/stacktrace:
  : Stacktrace cleanups.
  arm64: stacktrace: align with common naming
  arm64: stacktrace: rename stackframe to unwind_state
  arm64: stacktrace: rename unwinder functions
  arm64: stacktrace: make struct stackframe private to stacktrace.c
  arm64: stacktrace: delete PCS comment
  arm64: stacktrace: remove NULL task check from unwind_frame()

* for-next/fault-in-subpage:
  : btrfs search_ioctl() live-lock fix using fault_in_subpage_writeable().
  btrfs: Avoid live-lock in search_ioctl() on hardware with sub-page faults
  arm64: Add support for user sub-page fault probing
  mm: Add fault_in_subpage_writeable() to probe at sub-page granularity

* for-next/misc:
  : Miscellaneous patches.
  arm64: Kconfig.platforms: Add comments
  arm64: Kconfig: Fix indentation and add comments
  arm64: mm: avoid writable executable mappings in kexec/hibernate code
  arm64: lds: move special code sections out of kernel exec segment
  arm64/hugetlb: Implement arm64 specific huge_ptep_get()
  arm64/hugetlb: Use ptep_get() to get the pte value of a huge page
  arm64: mm: Make arch_faults_on_old_pte() check for migratability
  arm64: mte: Clean up user tag accessors
  arm64/hugetlb: Drop TLB flush from get_clear_flush()
  arm64: Declare non global symbols as static
  arm64: mm: Cleanup useless parameters in zone_sizes_init()
  arm64: fix types in copy_highpage()
  arm64: Set ARCH_NR_GPIO to 2048 for ARCH_APPLE
  arm64: cputype: Avoid overflow using MIDR_IMPLEMENTOR_MASK
  arm64: document the boot requirements for MTE
  arm64/mm: Compute PTRS_PER_[PMD|PUD] independently of PTRS_PER_PTE

* for-next/ftrace:
  : ftrace cleanups.
  arm64/ftrace: Make function graph use ftrace directly
  ftrace: cleanup ftrace_graph_caller enable and disable

* for-next/crashkernel:
  : Support for crashkernel reservations above ZONE_DMA.
  arm64: kdump: Do not allocate crash low memory if not needed
  docs: kdump: Update the crashkernel description for arm64
  of: Support more than one crash kernel regions for kexec -s
  of: fdt: Add memory for devices by DT property "linux,usable-memory-range"
  arm64: kdump: Reimplement crashkernel=X
  arm64: Use insert_resource() to simplify code
  kdump: return -ENOENT if required cmdline option does not exist
commit 201729d53a
@ -808,7 +808,7 @@
|
||||
Documentation/admin-guide/kdump/kdump.rst for an example.
|
||||
|
||||
crashkernel=size[KMG],high
|
||||
[KNL, X86-64] range could be above 4G. Allow kernel
|
||||
[KNL, X86-64, ARM64] range could be above 4G. Allow kernel
|
||||
to allocate physical memory region from top, so could
|
||||
be above 4G if system have more than 4G ram installed.
|
||||
Otherwise memory region will be allocated below 4G, if
|
||||
@ -821,14 +821,20 @@
|
||||
that require some amount of low memory, e.g. swiotlb
|
||||
requires at least 64M+32K low memory, also enough extra
|
||||
low memory is needed to make sure DMA buffers for 32-bit
|
||||
devices won't run out. Kernel would try to allocate at
|
||||
devices won't run out. Kernel would try to allocate
|
||||
at least 256M below 4G automatically.
|
||||
This one let user to specify own low range under 4G
|
||||
This one lets the user specify own low range under 4G
|
||||
for second kernel instead.
|
||||
0: to disable low allocation.
|
||||
It will be ignored when crashkernel=X,high is not used
|
||||
or memory reserved is below 4G.
|
||||
|
||||
[KNL, ARM64] range in low memory.
|
||||
This one lets the user specify a low range in the
|
||||
DMA zone for the crash dump kernel.
|
||||
It will be ignored when crashkernel=X,high is not used
|
||||
or memory reserved is located in the DMA zones.
|
||||
|
||||
cryptomgr.notests
|
||||
[KNL] Disable crypto self-tests
|
||||
|
||||
|
@ -350,6 +350,16 @@ Before jumping into the kernel, the following conditions must be met:
|
||||
|
||||
- SMCR_EL2.FA64 (bit 31) must be initialised to 0b1.
|
||||
|
||||
For CPUs with the Memory Tagging Extension feature (FEAT_MTE2):
|
||||
|
||||
- If EL3 is present:
|
||||
|
||||
- SCR_EL3.ATA (bit 26) must be initialised to 0b1.
|
||||
|
||||
- If the kernel is entered at EL1 and EL2 is present:
|
||||
|
||||
- HCR_EL2.ATA (bit 56) must be initialised to 0b1.
|
||||
|
||||
The requirements described above for CPU mode, caches, MMUs, architected
|
||||
timers, coherency and system registers apply to all CPUs. All CPUs must
|
||||
enter the kernel in the same exception level. Where the values documented
|
||||
|
@ -264,6 +264,39 @@ HWCAP2_MTE3
|
||||
Functionality implied by ID_AA64PFR1_EL1.MTE == 0b0011, as described
|
||||
by Documentation/arm64/memory-tagging-extension.rst.
|
||||
|
||||
HWCAP2_SME
|
||||
|
||||
Functionality implied by ID_AA64PFR1_EL1.SME == 0b0001, as described
|
||||
by Documentation/arm64/sme.rst.
|
||||
|
||||
HWCAP2_SME_I16I64
|
||||
|
||||
Functionality implied by ID_AA64SMFR0_EL1.I16I64 == 0b1111.
|
||||
|
||||
HWCAP2_SME_F64F64
|
||||
|
||||
Functionality implied by ID_AA64SMFR0_EL1.F64F64 == 0b1.
|
||||
|
||||
HWCAP2_SME_I8I32
|
||||
|
||||
Functionality implied by ID_AA64SMFR0_EL1.I8I32 == 0b1111.
|
||||
|
||||
HWCAP2_SME_F16F32
|
||||
|
||||
Functionality implied by ID_AA64SMFR0_EL1.F16F32 == 0b1.
|
||||
|
||||
HWCAP2_SME_B16F32
|
||||
|
||||
Functionality implied by ID_AA64SMFR0_EL1.B16F32 == 0b1.
|
||||
|
||||
HWCAP2_SME_F32F32
|
||||
|
||||
Functionality implied by ID_AA64SMFR0_EL1.F32F32 == 0b1.
|
||||
|
||||
HWCAP2_SME_FA64
|
||||
|
||||
Functionality implied by ID_AA64SMFR0_EL1.FA64 == 0b1.
|
||||
|
||||
4. Unused AT_HWCAP bits
|
||||
-----------------------
|
||||
|
||||
|
@ -21,6 +21,7 @@ ARM64 Architecture
|
||||
perf
|
||||
pointer-authentication
|
||||
silicon-errata
|
||||
sme
|
||||
sve
|
||||
tagged-address-abi
|
||||
tagged-pointers
|
||||
|
428 Documentation/arm64/sme.rst Normal file
@ -0,0 +1,428 @@
|
||||
===================================================
|
||||
Scalable Matrix Extension support for AArch64 Linux
|
||||
===================================================
|
||||
|
||||
This document outlines briefly the interface provided to userspace by Linux in
|
||||
order to support use of the ARM Scalable Matrix Extension (SME).
|
||||
|
||||
This is an outline of the most important features and issues only and not
|
||||
intended to be exhaustive. It should be read in conjunction with the SVE
|
||||
documentation in sve.rst which provides details on the Streaming SVE mode
|
||||
included in SME.
|
||||
|
||||
This document does not aim to describe the SME architecture or programmer's
|
||||
model. To aid understanding, a minimal description of relevant programmer's
|
||||
model features for SME is included in Appendix A.
|
||||
|
||||
|
||||
1. General
|
||||
-----------
|
||||
|
||||
* PSTATE.SM, PSTATE.ZA, the streaming mode vector length, the ZA
|
||||
register state and TPIDR2_EL0 are tracked per thread.
|
||||
|
||||
* The presence of SME is reported to userspace via HWCAP2_SME in the aux vector
|
||||
AT_HWCAP2 entry. Presence of this flag implies the presence of the SME
|
||||
instructions and registers, and the Linux-specific system interfaces
|
||||
described in this document. SME is reported in /proc/cpuinfo as "sme".
|
||||
|
||||
* Support for the execution of SME instructions in userspace can also be
|
||||
detected by reading the CPU ID register ID_AA64PFR1_EL1 using an MRS
|
||||
instruction, and checking that the value of the SME field is nonzero. [3]
|
||||
|
||||
It does not guarantee the presence of the system interfaces described in the
|
||||
following sections: software that needs to verify that those interfaces are
|
||||
present must check for HWCAP2_SME instead.
|
||||
|
||||
* There are a number of optional SME features, presence of these is reported
|
||||
through AT_HWCAP2 through:
|
||||
|
||||
HWCAP2_SME_I16I64
|
||||
HWCAP2_SME_F64F64
|
||||
HWCAP2_SME_I8I32
|
||||
HWCAP2_SME_F16F32
|
||||
HWCAP2_SME_B16F32
|
||||
HWCAP2_SME_F32F32
|
||||
HWCAP2_SME_FA64
|
||||
|
||||
This list may be extended over time as the SME architecture evolves.
|
||||
|
||||
These extensions are also reported via the CPU ID register ID_AA64SMFR0_EL1,
|
||||
which userspace can read using an MRS instruction. See elf_hwcaps.txt and
|
||||
cpu-feature-registers.txt for details.
|
||||
|
||||
* Debuggers should restrict themselves to interacting with the target via the
|
||||
NT_ARM_SVE, NT_ARM_SSVE and NT_ARM_ZA regsets. The recommended way
|
||||
of detecting support for these regsets is to connect to a target process
|
||||
first and then attempt a
|
||||
|
||||
ptrace(PTRACE_GETREGSET, pid, NT_ARM_<regset>, &iov).
|
||||
|
||||
* Whenever ZA register values are exchanged in memory between userspace and
|
||||
the kernel, the register value is encoded in memory as a series of horizontal
|
||||
vectors from 0 to VL/8-1 stored in the same endianness invariant format as is
|
||||
used for SVE vectors.
|
||||
|
||||
* On thread creation TPIDR2_EL0 is preserved unless CLONE_SETTLS is specified,
|
||||
in which case it is set to 0.
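
As an illustration of the detection interfaces described above, a minimal
userspace sketch (for illustration only; it assumes the uapi headers from a
kernel that includes this series, which provide the HWCAP2_SME* constants)::

    #include <stdio.h>
    #include <sys/auxv.h>
    #include <asm/hwcap.h>          /* HWCAP2_SME, HWCAP2_SME_FA64 (this series) */

    int main(void)
    {
            unsigned long hwcap2 = getauxval(AT_HWCAP2);

            if (!(hwcap2 & HWCAP2_SME)) {
                    puts("SME not supported");
                    return 1;
            }

            /* Optional features are reported through the same aux vector entry */
            printf("SME present, FA64 %s\n",
                   (hwcap2 & HWCAP2_SME_FA64) ? "yes" : "no");
            return 0;
    }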
|
||||
|
||||
2. Vector lengths
|
||||
------------------
|
||||
|
||||
SME defines a second vector length similar to the SVE vector length which
controls the size of the streaming mode SVE vectors and the ZA matrix array.
|
||||
The ZA matrix is square with each side having as many bytes as a streaming
|
||||
mode SVE vector.
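
To make the sizing concrete, an illustrative sketch (not part of the ABI text;
it assumes the PR_SME_GET_VL/PR_SME_VL_LEN_MASK prctl() interface from
section 6 and the ZA_SIG_REGS_SIZE()/sve_vq_from_vl() helpers from [1])
computing the ZA footprint for the current streaming vector length::

    #include <stdio.h>
    #include <sys/prctl.h>          /* PR_SME_GET_VL etc., see section 6 */
    #include <asm/sigcontext.h>     /* sve_vq_from_vl(), ZA_SIG_REGS_SIZE() */

    int main(void)
    {
            int svl = prctl(PR_SME_GET_VL) & PR_SME_VL_LEN_MASK;   /* bytes */

            /* ZA is square: SVL x SVL bytes, i.e. svl vectors of svl bytes */
            printf("SVL %d bytes, ZA %d bytes (%zu via ZA_SIG_REGS_SIZE)\n",
                   svl, svl * svl,
                   (size_t)ZA_SIG_REGS_SIZE(sve_vq_from_vl(svl)));
            return 0;
    }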
|
||||
|
||||
|
||||
3. Sharing of streaming and non-streaming mode SVE state
|
||||
---------------------------------------------------------
|
||||
|
||||
It is implementation defined which if any parts of the SVE state are shared
|
||||
between streaming and non-streaming modes. When switching between modes
|
||||
via software interfaces such as ptrace, if no register content is provided as
part of switching, no state will be assumed to be shared and everything will
|
||||
be zeroed.
|
||||
|
||||
|
||||
4. System call behaviour
|
||||
-------------------------
|
||||
|
||||
* On syscall PSTATE.ZA is preserved, if PSTATE.ZA==1 then the contents of the
|
||||
ZA matrix are preserved.
|
||||
|
||||
* On syscall PSTATE.SM will be cleared and the SVE registers will be handled
|
||||
as per the standard SVE ABI.
|
||||
|
||||
* Neither the SVE registers nor ZA are used to pass arguments to or receive
|
||||
results from any syscall.
|
||||
|
||||
* On process creation (eg, clone()) the newly created process will have
|
||||
PSTATE.SM cleared.
|
||||
|
||||
* All other SME state of a thread, including the currently configured vector
|
||||
length, the state of the PR_SME_VL_INHERIT flag, and the deferred vector
|
||||
length (if any), is preserved across all syscalls, subject to the specific
|
||||
exceptions for execve() described in section 6.
|
||||
|
||||
|
||||
5. Signal handling
|
||||
-------------------
|
||||
|
||||
* Signal handlers are invoked with streaming mode and ZA disabled.
|
||||
|
||||
* A new signal frame record za_context encodes the ZA register contents on
|
||||
signal delivery. [1]
|
||||
|
||||
* The signal frame record for ZA always contains basic metadata, in particular
|
||||
the thread's vector length (in za_context.vl).
|
||||
|
||||
* The ZA matrix may or may not be included in the record, depending on
|
||||
the value of PSTATE.ZA. The registers are present if and only if:
|
||||
za_context.head.size >= ZA_SIG_CONTEXT_SIZE(sve_vq_from_vl(za_context.vl))
|
||||
in which case PSTATE.ZA == 1.
|
||||
|
||||
* If matrix data is present, the remainder of the record has a vl-dependent
|
||||
size and layout. Macros ZA_SIG_* are defined [1] to facilitate access to
|
||||
them.
|
||||
|
||||
* The matrix is stored as a series of horizontal vectors in the same format as
|
||||
is used for SVE vectors.
|
||||
|
||||
* If the ZA context is too big to fit in sigcontext.__reserved[], then extra
|
||||
space is allocated on the stack, an extra_context record is written in
|
||||
__reserved[] referencing this space. za_context is then written in the
|
||||
extra space. Refer to [1] for further details about this mechanism.
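
For illustration, a hedged sketch of locating za_context in a delivered signal
frame and testing whether ZA data is present (ZA_MAGIC is assumed to be the
record identifier defined in [1]; extra_context handling is omitted for
brevity)::

    #include <stdbool.h>
    #include <ucontext.h>
    #include <asm/sigcontext.h>     /* struct za_context, ZA_SIG_CONTEXT_SIZE() */

    /* Intended to be called on a SA_SIGINFO handler's ucontext_t argument */
    static bool za_live_in_frame(ucontext_t *uc)
    {
            struct _aarch64_ctx *head = (void *)uc->uc_mcontext.__reserved;

            /* Records are terminated by a header with magic == 0 */
            for (; head->magic; head = (void *)((char *)head + head->size)) {
                    if (head->magic == ZA_MAGIC) {  /* ZA_MAGIC assumed from [1] */
                            struct za_context *za = (void *)head;

                            return za->head.size >=
                                   ZA_SIG_CONTEXT_SIZE(sve_vq_from_vl(za->vl));
                    }
            }
            return false;
    }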
|
||||
|
||||
|
||||
5. Signal return
|
||||
-----------------
|
||||
|
||||
When returning from a signal handler:
|
||||
|
||||
* If there is no za_context record in the signal frame, or if the record is
|
||||
present but contains no register data as described in the previous section,
|
||||
then ZA is disabled.
|
||||
|
||||
* If za_context is present in the signal frame and contains matrix data then
|
||||
PSTATE.ZA is set to 1 and ZA is populated with the specified data.
|
||||
|
||||
* The vector length cannot be changed via signal return. If za_context.vl in
|
||||
the signal frame does not match the current vector length, the signal return
|
||||
attempt is treated as illegal, resulting in a forced SIGSEGV.
|
||||
|
||||
|
||||
6. prctl extensions
|
||||
--------------------
|
||||
|
||||
Some new prctl() calls are added to allow programs to manage the SME vector
|
||||
length:
|
||||
|
||||
prctl(PR_SME_SET_VL, unsigned long arg)
|
||||
|
||||
Sets the vector length of the calling thread and related flags, where
|
||||
arg == vl | flags. Other threads of the calling process are unaffected.
|
||||
|
||||
vl is the desired vector length, where sve_vl_valid(vl) must be true.
|
||||
|
||||
flags:
|
||||
|
||||
PR_SME_VL_INHERIT
|
||||
|
||||
Inherit the current vector length across execve(). Otherwise, the
|
||||
vector length is reset to the system default at execve(). (See
|
||||
Section 9.)
|
||||
|
||||
PR_SME_SET_VL_ONEXEC
|
||||
|
||||
Defer the requested vector length change until the next execve()
|
||||
performed by this thread.
|
||||
|
||||
The effect is equivalent to implicit execution of the following
|
||||
call immediately after the next execve() (if any) by the thread:
|
||||
|
||||
prctl(PR_SME_SET_VL, arg & ~PR_SME_SET_VL_ONEXEC)
|
||||
|
||||
This allows launching of a new program with a different vector
|
||||
length, while avoiding runtime side effects in the caller.
|
||||
|
||||
Without PR_SME_SET_VL_ONEXEC, the requested change takes effect
|
||||
immediately.
|
||||
|
||||
|
||||
Return value: a nonnegative value on success, or a negative value on error:
|
||||
EINVAL: SME not supported, invalid vector length requested, or
|
||||
invalid flags.
|
||||
|
||||
|
||||
On success:
|
||||
|
||||
* Either the calling thread's vector length or the deferred vector length
|
||||
to be applied at the next execve() by the thread (dependent on whether
|
||||
PR_SME_SET_VL_ONEXEC is present in arg), is set to the largest value
|
||||
supported by the system that is less than or equal to vl. If vl ==
|
||||
SVE_VL_MAX, the value set will be the largest value supported by the
|
||||
system.
|
||||
|
||||
* Any previously outstanding deferred vector length change in the calling
|
||||
thread is cancelled.
|
||||
|
||||
* The returned value describes the resulting configuration, encoded as for
|
||||
PR_SME_GET_VL. The vector length reported in this value is the new
|
||||
current vector length for this thread if PR_SME_SET_VL_ONEXEC was not
|
||||
present in arg; otherwise, the reported vector length is the deferred
|
||||
vector length that will be applied at the next execve() by the calling
|
||||
thread.
|
||||
|
||||
* Changing the vector length causes all of ZA, P0..P15, FFR and all bits of
|
||||
Z0..Z31 except for Z0 bits [127:0] .. Z31 bits [127:0] to become
|
||||
unspecified, including both streaming and non-streaming SVE state.
|
||||
Calling PR_SME_SET_VL with vl equal to the thread's current vector
|
||||
length, or calling PR_SME_SET_VL with the PR_SVE_SET_VL_ONEXEC flag,
|
||||
does not constitute a change to the vector length for this purpose.
|
||||
|
||||
* Changing the vector length causes PSTATE.ZA and PSTATE.SM to be cleared.
|
||||
Calling PR_SME_SET_VL with vl equal to the thread's current vector
|
||||
length, or calling PR_SME_SET_VL with the PR_SVE_SET_VL_ONEXEC flag,
|
||||
does not constitute a change to the vector length for this purpose.
|
||||
|
||||
|
||||
prctl(PR_SME_GET_VL)
|
||||
|
||||
Gets the vector length of the calling thread.
|
||||
|
||||
The following flag may be OR-ed into the result:
|
||||
|
||||
PR_SME_VL_INHERIT
|
||||
|
||||
Vector length will be inherited across execve().
|
||||
|
||||
There is no way to determine whether there is an outstanding deferred
|
||||
vector length change (which would only normally be the case between a
|
||||
fork() or vfork() and the corresponding execve() in typical use).
|
||||
|
||||
To extract the vector length from the result, bitwise and it with
|
||||
PR_SME_VL_LEN_MASK.
|
||||
|
||||
Return value: a nonnegative value on success, or a negative value on error:
|
||||
EINVAL: SME not supported.
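
Putting the two calls together, an illustrative sketch (error handling
abbreviated; the PR_SME_* constants are assumed to come from linux/prctl.h as
updated by this series)::

    #include <stdio.h>
    #include <sys/prctl.h>

    int main(void)
    {
            int vl;

            /* Request a 32-byte streaming VL and keep it across execve() */
            if (prctl(PR_SME_SET_VL, 32 | PR_SME_VL_INHERIT) < 0) {
                    perror("PR_SME_SET_VL");
                    return 1;
            }

            /* The kernel may have picked a smaller supported length */
            vl = prctl(PR_SME_GET_VL) & PR_SME_VL_LEN_MASK;
            printf("streaming VL is now %d bytes\n", vl);
            return 0;
    }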
|
||||
|
||||
|
||||
7. ptrace extensions
|
||||
---------------------
|
||||
|
||||
* A new regset NT_ARM_SSVE is defined for access to streaming mode SVE
|
||||
state via PTRACE_GETREGSET and PTRACE_SETREGSET, this is documented in
|
||||
sve.rst.
|
||||
|
||||
* A new regset NT_ARM_ZA is defined for ZA state for access to ZA state via
|
||||
PTRACE_GETREGSET and PTRACE_SETREGSET.
|
||||
|
||||
Refer to [2] for definitions.
|
||||
|
||||
The regset data starts with struct user_za_header, containing:
|
||||
|
||||
size
|
||||
|
||||
Size of the complete regset, in bytes.
|
||||
This depends on vl and possibly on other things in the future.
|
||||
|
||||
If a call to PTRACE_GETREGSET requests less data than the value of
|
||||
size, the caller can allocate a larger buffer and retry in order to
|
||||
read the complete regset.
|
||||
|
||||
max_size
|
||||
|
||||
Maximum size in bytes that the regset can grow to for the target
|
||||
thread. The regset won't grow bigger than this even if the target
|
||||
thread changes its vector length etc.
|
||||
|
||||
vl
|
||||
|
||||
Target thread's current streaming vector length, in bytes.
|
||||
|
||||
max_vl
|
||||
|
||||
Maximum possible streaming vector length for the target thread.
|
||||
|
||||
flags
|
||||
|
||||
Zero or more of the following flags, which have the same
|
||||
meaning and behaviour as the corresponding PR_SET_VL_* flags:
|
||||
|
||||
SME_PT_VL_INHERIT
|
||||
|
||||
SME_PT_VL_ONEXEC (SETREGSET only).
|
||||
|
||||
* The effects of changing the vector length and/or flags are equivalent to
|
||||
those documented for PR_SME_SET_VL.
|
||||
|
||||
The caller must make a further GETREGSET call if it needs to know what VL is
|
||||
actually set by SETREGSET, unless it is known in advance that the requested
|
||||
VL is supported.
|
||||
|
||||
* The size and layout of the payload depends on the header fields. The
|
||||
SME_PT_ZA_*() macros are provided to facilitate access to the data.
|
||||
|
||||
* In either case, for SETREGSET it is permissible to omit the payload, in which
|
||||
case the vector length and flags are changed and PSTATE.ZA is set to 0
|
||||
(along with any consequences of those changes). If a payload is provided
|
||||
then PSTATE.ZA will be set to 1.
|
||||
|
||||
* For SETREGSET, if the requested VL is not supported, the effect will be the
|
||||
same as if the payload were omitted, except that an EIO error is reported.
|
||||
No attempt is made to translate the payload data to the correct layout
|
||||
for the vector length actually set. It is up to the caller to translate the
|
||||
payload layout for the actual VL and retry.
|
||||
|
||||
* The effect of writing a partial, incomplete payload is unspecified.
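
As an illustration of the header-then-retry pattern described for the size
field above, a sketch reading the complete NT_ARM_ZA regset of a stopped
tracee (struct user_za_header per [2]; NT_ARM_ZA is assumed to be exposed via
linux/elf.h; error handling trimmed)::

    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/ptrace.h>
    #include <sys/uio.h>
    #include <linux/elf.h>          /* NT_ARM_ZA */
    #include <asm/ptrace.h>         /* struct user_za_header, see [2] */

    /* Returns a malloc()ed buffer holding the full regset, or NULL on error */
    static void *read_za(pid_t pid, size_t *len)
    {
            struct user_za_header hdr;
            struct iovec iov = { .iov_base = &hdr, .iov_len = sizeof(hdr) };
            void *buf;

            /* Read just the header first to learn the full regset size */
            if (ptrace(PTRACE_GETREGSET, pid, NT_ARM_ZA, &iov))
                    return NULL;

            buf = malloc(hdr.size);
            if (!buf)
                    return NULL;
            iov.iov_base = buf;
            iov.iov_len = hdr.size;

            /* Retry with a buffer large enough for header plus any ZA payload */
            if (ptrace(PTRACE_GETREGSET, pid, NT_ARM_ZA, &iov)) {
                    free(buf);
                    return NULL;
            }

            *len = hdr.size;
            return buf;
    }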
|
||||
|
||||
|
||||
8. ELF coredump extensions
|
||||
---------------------------
|
||||
|
||||
* NT_ARM_SSVE notes will be added to each coredump for
|
||||
each thread of the dumped process. The contents will be equivalent to the
|
||||
data that would have been read if a PTRACE_GETREGSET of the corresponding
|
||||
type were executed for each thread when the coredump was generated.
|
||||
|
||||
* A NT_ARM_ZA note will be added to each coredump for each thread of the
|
||||
dumped process. The contents will be equivalent to the data that would have
|
||||
been read if a PTRACE_GETREGSET of NT_ARM_ZA were executed for each thread
|
||||
when the coredump was generated.
|
||||
|
||||
|
||||
9. System runtime configuration
|
||||
--------------------------------
|
||||
|
||||
* To mitigate the ABI impact of expansion of the signal frame, a policy
|
||||
mechanism is provided for administrators, distro maintainers and developers
|
||||
to set the default vector length for userspace processes:
|
||||
|
||||
/proc/sys/abi/sme_default_vector_length
|
||||
|
||||
Writing the text representation of an integer to this file sets the system
|
||||
default vector length to the specified value, unless the value is greater
|
||||
than the maximum vector length supported by the system in which case the
|
||||
default vector length is set to that maximum.
|
||||
|
||||
The result can be determined by reopening the file and reading its
|
||||
contents.
|
||||
|
||||
At boot, the default vector length is initially set to 32 or the maximum
|
||||
supported vector length, whichever is smaller and supported. This
|
||||
determines the initial vector length of the init process (PID 1).
|
||||
|
||||
Reading this file returns the current system default vector length.
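
For illustration, reading the current default back is a plain file read,
e.g.::

    #include <stdio.h>

    int main(void)
    {
            FILE *f = fopen("/proc/sys/abi/sme_default_vector_length", "r");
            int vl;

            if (!f || fscanf(f, "%d", &vl) != 1) {
                    perror("sme_default_vector_length");
                    return 1;
            }
            printf("system default streaming VL: %d bytes\n", vl);
            return 0;
    }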
|
||||
|
||||
* At every execve() call, the new vector length of the new process is set to
|
||||
the system default vector length, unless
|
||||
|
||||
* PR_SME_VL_INHERIT (or equivalently SME_PT_VL_INHERIT) is set for the
|
||||
calling thread, or
|
||||
|
||||
* a deferred vector length change is pending, established via the
|
||||
PR_SME_SET_VL_ONEXEC flag (or SME_PT_VL_ONEXEC).
|
||||
|
||||
* Modifying the system default vector length does not affect the vector length
|
||||
of any existing process or thread that does not make an execve() call.
|
||||
|
||||
|
||||
Appendix A. SME programmer's model (informative)
|
||||
=================================================
|
||||
|
||||
This section provides a minimal description of the additions made by SME to the
|
||||
ARMv8-A programmer's model that are relevant to this document.
|
||||
|
||||
Note: This section is for information only and not intended to be complete or
|
||||
to replace any architectural specification.
|
||||
|
||||
A.1. Registers
|
||||
---------------
|
||||
|
||||
In A64 state, SME adds the following:
|
||||
|
||||
* A new mode, streaming mode, in which a subset of the normal FPSIMD and SVE
|
||||
features are available. When supported EL0 software may enter and leave
|
||||
streaming mode at any time.
|
||||
|
||||
For best system performance it is strongly encouraged for software to enable
|
||||
streaming mode only when it is actively being used.
|
||||
|
||||
* A new vector length controlling the size of ZA and the Z registers when in
|
||||
streaming mode, separately to the vector length used for SVE when not in
|
||||
streaming mode. There is no requirement that either the currently selected
|
||||
vector length or the set of vector lengths supported for the two modes in
|
||||
a given system have any relationship. The streaming mode vector length
|
||||
is referred to as SVL.
|
||||
|
||||
* A new ZA matrix register. This is a square matrix of SVLxSVL bits. Most
|
||||
operations on ZA require that streaming mode be enabled but ZA can be
|
||||
enabled without streaming mode in order to load, save and retain data.
|
||||
|
||||
For best system performance it is strongly encouraged for software to enable
|
||||
ZA only when it is actively being used.
|
||||
|
||||
* Two new 1 bit fields in PSTATE which may be controlled via the SMSTART and
|
||||
SMSTOP instructions or by access to the SVCR system register:
|
||||
|
||||
* PSTATE.ZA, if this is 1 then the ZA matrix is accessible and has valid
|
||||
data while if it is 0 then ZA can not be accessed. When PSTATE.ZA is
|
||||
changed from 0 to 1 all bits in ZA are cleared.
|
||||
|
||||
* PSTATE.SM, if this is 1 then the PE is in streaming mode. When the value
|
||||
of PSTATE.SM is changed then it is implementation defined if the subset
|
||||
of the floating point register bits valid in both modes may be retained.
|
||||
Any other bits will be cleared.
|
||||
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
[1] arch/arm64/include/uapi/asm/sigcontext.h
|
||||
AArch64 Linux signal ABI definitions
|
||||
|
||||
[2] arch/arm64/include/uapi/asm/ptrace.h
|
||||
AArch64 Linux ptrace ABI definitions
|
||||
|
||||
[3] Documentation/arm64/cpu-feature-registers.rst
|
@ -7,7 +7,9 @@ Author: Dave Martin <Dave.Martin@arm.com>
|
||||
Date: 4 August 2017
|
||||
|
||||
This document outlines briefly the interface provided to userspace by Linux in
|
||||
order to support use of the ARM Scalable Vector Extension (SVE).
|
||||
order to support use of the ARM Scalable Vector Extension (SVE), including
|
||||
interactions with Streaming SVE mode added by the Scalable Matrix Extension
|
||||
(SME).
|
||||
|
||||
This is an outline of the most important features and issues only and not
|
||||
intended to be exhaustive.
|
||||
@ -23,6 +25,10 @@ model features for SVE is included in Appendix A.
|
||||
* SVE registers Z0..Z31, P0..P15 and FFR and the current vector length VL, are
|
||||
tracked per-thread.
|
||||
|
||||
* In streaming mode FFR is not accessible unless HWCAP2_SME_FA64 is present
|
||||
in the system. When it is not supported and these interfaces are used to
access streaming mode, FFR is read and written as zero.
|
||||
|
||||
* The presence of SVE is reported to userspace via HWCAP_SVE in the aux vector
|
||||
AT_HWCAP entry. Presence of this flag implies the presence of the SVE
|
||||
instructions and registers, and the Linux-specific system interfaces
|
||||
@ -53,10 +59,19 @@ model features for SVE is included in Appendix A.
|
||||
which userspace can read using an MRS instruction. See elf_hwcaps.txt and
|
||||
cpu-feature-registers.txt for details.
|
||||
|
||||
* On hardware that supports the SME extensions, HWCAP2_SME will also be
|
||||
reported in the AT_HWCAP2 aux vector entry. Among other things SME adds
|
||||
streaming mode which provides a subset of the SVE feature set using a
|
||||
separate SME vector length and the same Z/V registers. See sme.rst
|
||||
for more details.
|
||||
|
||||
* Debuggers should restrict themselves to interacting with the target via the
|
||||
NT_ARM_SVE regset. The recommended way of detecting support for this regset
|
||||
is to connect to a target process first and then attempt a
|
||||
ptrace(PTRACE_GETREGSET, pid, NT_ARM_SVE, &iov).
|
||||
ptrace(PTRACE_GETREGSET, pid, NT_ARM_SVE, &iov). Note that when SME is
|
||||
present and streaming SVE mode is in use the FPSIMD subset of registers
|
||||
will be read via NT_ARM_SVE and NT_ARM_SVE writes will exit streaming mode
|
||||
in the target.
|
||||
|
||||
* Whenever SVE scalable register values (Zn, Pn, FFR) are exchanged in memory
|
||||
between userspace and the kernel, the register value is encoded in memory in
|
||||
@ -126,6 +141,11 @@ the SVE instruction set architecture.
|
||||
are only present in fpsimd_context. For convenience, the content of V0..V31
|
||||
is duplicated between sve_context and fpsimd_context.
|
||||
|
||||
* The record contains a flag field which includes a flag SVE_SIG_FLAG_SM which
|
||||
if set indicates that the thread is in streaming mode and the vector length
|
||||
and register data (if present) describe the streaming SVE data and vector
|
||||
length.
|
||||
|
||||
* The signal frame record for SVE always contains basic metadata, in particular
|
||||
the thread's vector length (in sve_context.vl).
|
||||
|
||||
@ -170,6 +190,11 @@ When returning from a signal handler:
|
||||
the signal frame does not match the current vector length, the signal return
|
||||
attempt is treated as illegal, resulting in a forced SIGSEGV.
|
||||
|
||||
* It is permitted to enter or leave streaming mode by setting or clearing
|
||||
the SVE_SIG_FLAG_SM flag but applications should take care to ensure that
|
||||
when doing so sve_context.vl and any register data are appropriate for the
|
||||
vector length in the new mode.
|
||||
|
||||
|
||||
6. prctl extensions
|
||||
--------------------
|
||||
@ -265,8 +290,14 @@ prctl(PR_SVE_GET_VL)
|
||||
7. ptrace extensions
|
||||
---------------------
|
||||
|
||||
* A new regset NT_ARM_SVE is defined for use with PTRACE_GETREGSET and
|
||||
PTRACE_SETREGSET.
|
||||
* New regsets NT_ARM_SVE and NT_ARM_SSVE are defined for use with
|
||||
PTRACE_GETREGSET and PTRACE_SETREGSET. NT_ARM_SSVE describes the
|
||||
streaming mode SVE registers and NT_ARM_SVE describes the
|
||||
non-streaming mode SVE registers.
|
||||
|
||||
In this description a register set is referred to as being "live" when
|
||||
the target is in the appropriate streaming or non-streaming mode and is
|
||||
using data beyond the subset shared with the FPSIMD Vn registers.
|
||||
|
||||
Refer to [2] for definitions.
|
||||
|
||||
@ -297,7 +328,7 @@ The regset data starts with struct user_sve_header, containing:
|
||||
|
||||
flags
|
||||
|
||||
either
|
||||
at most one of
|
||||
|
||||
SVE_PT_REGS_FPSIMD
|
||||
|
||||
@ -331,6 +362,10 @@ The regset data starts with struct user_sve_header, containing:
|
||||
|
||||
SVE_PT_VL_ONEXEC (SETREGSET only).
|
||||
|
||||
If neither FPSIMD nor SVE flags are provided then no register
|
||||
payload is available; this is only possible when SME is implemented.
|
||||
|
||||
|
||||
* The effects of changing the vector length and/or flags are equivalent to
|
||||
those documented for PR_SVE_SET_VL.
|
||||
|
||||
@ -346,6 +381,13 @@ The regset data starts with struct user_sve_header, containing:
|
||||
case only the vector length and flags are changed (along with any
|
||||
consequences of those changes).
|
||||
|
||||
* In systems supporting SME, when in streaming mode a GETREGSET for
NT_ARM_SVE will return only the user_sve_header with no register data.
Similarly, a GETREGSET for NT_ARM_SSVE will not return any register data
when not in streaming mode.
|
||||
|
||||
* A GETREGSET for NT_ARM_SSVE will never return SVE_PT_REGS_FPSIMD.
|
||||
|
||||
* For SETREGSET, if an SVE_PT_REGS_SVE payload is present and the
|
||||
requested VL is not supported, the effect will be the same as if the
|
||||
payload were omitted, except that an EIO error is reported. No
|
||||
@ -355,17 +397,25 @@ The regset data starts with struct user_sve_header, containing:
|
||||
unspecified. It is up to the caller to translate the payload layout
|
||||
for the actual VL and retry.
|
||||
|
||||
* Where SME is implemented it is not possible to GETREGSET the register
|
||||
state for normal SVE when in streaming mode, nor the streaming mode
|
||||
register state when in normal mode, regardless of the implementation defined
|
||||
behaviour of the hardware for sharing data between the two modes.
|
||||
|
||||
* Any SETREGSET of NT_ARM_SVE will exit streaming mode if the target was in
|
||||
streaming mode and any SETREGSET of NT_ARM_SSVE will enter streaming mode
|
||||
if the target was not in streaming mode.
|
||||
|
||||
* The effect of writing a partial, incomplete payload is unspecified.
|
||||
|
||||
|
||||
8. ELF coredump extensions
|
||||
---------------------------
|
||||
|
||||
* A NT_ARM_SVE note will be added to each coredump for each thread of the
|
||||
dumped process. The contents will be equivalent to the data that would have
|
||||
been read if a PTRACE_GETREGSET of NT_ARM_SVE were executed for each thread
|
||||
when the coredump was generated.
|
||||
|
||||
* NT_ARM_SVE and NT_ARM_SSVE notes will be added to each coredump for
|
||||
each thread of the dumped process. The contents will be equivalent to the
|
||||
data that would have been read if a PTRACE_GETREGSET of the corresponding
|
||||
type were executed for each thread when the coredump was generated.
|
||||
|
||||
9. System runtime configuration
|
||||
--------------------------------
|
||||
|
@ -24,6 +24,13 @@ config KEXEC_ELF
|
||||
config HAVE_IMA_KEXEC
|
||||
bool
|
||||
|
||||
config ARCH_HAS_SUBPAGE_FAULTS
|
||||
bool
|
||||
help
|
||||
Select if the architecture can check permissions at sub-page
|
||||
granularity (e.g. arm64 MTE). The probe_user_*() functions
|
||||
must be implemented.
|
||||
|
||||
config HOTPLUG_SMT
|
||||
bool
|
||||
|
||||
|
@ -253,31 +253,31 @@ config ARM64_CONT_PMD_SHIFT
|
||||
default 4
|
||||
|
||||
config ARCH_MMAP_RND_BITS_MIN
|
||||
default 14 if ARM64_64K_PAGES
|
||||
default 16 if ARM64_16K_PAGES
|
||||
default 18
|
||||
default 14 if ARM64_64K_PAGES
|
||||
default 16 if ARM64_16K_PAGES
|
||||
default 18
|
||||
|
||||
# max bits determined by the following formula:
|
||||
# VA_BITS - PAGE_SHIFT - 3
|
||||
config ARCH_MMAP_RND_BITS_MAX
|
||||
default 19 if ARM64_VA_BITS=36
|
||||
default 24 if ARM64_VA_BITS=39
|
||||
default 27 if ARM64_VA_BITS=42
|
||||
default 30 if ARM64_VA_BITS=47
|
||||
default 29 if ARM64_VA_BITS=48 && ARM64_64K_PAGES
|
||||
default 31 if ARM64_VA_BITS=48 && ARM64_16K_PAGES
|
||||
default 33 if ARM64_VA_BITS=48
|
||||
default 14 if ARM64_64K_PAGES
|
||||
default 16 if ARM64_16K_PAGES
|
||||
default 18
|
||||
default 19 if ARM64_VA_BITS=36
|
||||
default 24 if ARM64_VA_BITS=39
|
||||
default 27 if ARM64_VA_BITS=42
|
||||
default 30 if ARM64_VA_BITS=47
|
||||
default 29 if ARM64_VA_BITS=48 && ARM64_64K_PAGES
|
||||
default 31 if ARM64_VA_BITS=48 && ARM64_16K_PAGES
|
||||
default 33 if ARM64_VA_BITS=48
|
||||
default 14 if ARM64_64K_PAGES
|
||||
default 16 if ARM64_16K_PAGES
|
||||
default 18
|
||||
|
||||
config ARCH_MMAP_RND_COMPAT_BITS_MIN
|
||||
default 7 if ARM64_64K_PAGES
|
||||
default 9 if ARM64_16K_PAGES
|
||||
default 11
|
||||
default 7 if ARM64_64K_PAGES
|
||||
default 9 if ARM64_16K_PAGES
|
||||
default 11
|
||||
|
||||
config ARCH_MMAP_RND_COMPAT_BITS_MAX
|
||||
default 16
|
||||
default 16
|
||||
|
||||
config NO_IOPORT_MAP
|
||||
def_bool y if !PCI
|
||||
@ -304,7 +304,7 @@ config GENERIC_HWEIGHT
|
||||
def_bool y
|
||||
|
||||
config GENERIC_CSUM
|
||||
def_bool y
|
||||
def_bool y
|
||||
|
||||
config GENERIC_CALIBRATE_DELAY
|
||||
def_bool y
|
||||
@ -1037,8 +1037,7 @@ config SOCIONEXT_SYNQUACER_PREITS
|
||||
|
||||
If unsure, say Y.
|
||||
|
||||
endmenu
|
||||
|
||||
endmenu # "ARM errata workarounds via the alternatives framework"
|
||||
|
||||
choice
|
||||
prompt "Page size"
|
||||
@ -1566,9 +1565,9 @@ config SETEND_EMULATION
|
||||
be unexpected results in the applications.
|
||||
|
||||
If unsure, say Y
|
||||
endif
|
||||
endif # ARMV8_DEPRECATED
|
||||
|
||||
endif
|
||||
endif # COMPAT
|
||||
|
||||
menu "ARMv8.1 architectural features"
|
||||
|
||||
@ -1593,15 +1592,15 @@ config ARM64_PAN
|
||||
bool "Enable support for Privileged Access Never (PAN)"
|
||||
default y
|
||||
help
|
||||
Privileged Access Never (PAN; part of the ARMv8.1 Extensions)
|
||||
prevents the kernel or hypervisor from accessing user-space (EL0)
|
||||
memory directly.
|
||||
Privileged Access Never (PAN; part of the ARMv8.1 Extensions)
|
||||
prevents the kernel or hypervisor from accessing user-space (EL0)
|
||||
memory directly.
|
||||
|
||||
Choosing this option will cause any unprotected (not using
|
||||
copy_to_user et al) memory access to fail with a permission fault.
|
||||
Choosing this option will cause any unprotected (not using
|
||||
copy_to_user et al) memory access to fail with a permission fault.
|
||||
|
||||
The feature is detected at runtime, and will remain as a 'nop'
|
||||
instruction if the cpu does not implement the feature.
|
||||
The feature is detected at runtime, and will remain as a 'nop'
|
||||
instruction if the cpu does not implement the feature.
|
||||
|
||||
config AS_HAS_LDAPR
|
||||
def_bool $(as-instr,.arch_extension rcpc)
|
||||
@ -1629,15 +1628,15 @@ config ARM64_USE_LSE_ATOMICS
|
||||
built with binutils >= 2.25 in order for the new instructions
|
||||
to be used.
|
||||
|
||||
endmenu
|
||||
endmenu # "ARMv8.1 architectural features"
|
||||
|
||||
menu "ARMv8.2 architectural features"
|
||||
|
||||
config AS_HAS_ARMV8_2
|
||||
def_bool $(cc-option,-Wa$(comma)-march=armv8.2-a)
|
||||
def_bool $(cc-option,-Wa$(comma)-march=armv8.2-a)
|
||||
|
||||
config AS_HAS_SHA3
|
||||
def_bool $(as-instr,.arch armv8.2-a+sha3)
|
||||
def_bool $(as-instr,.arch armv8.2-a+sha3)
|
||||
|
||||
config ARM64_PMEM
|
||||
bool "Enable support for persistent memory"
|
||||
@ -1681,7 +1680,7 @@ config ARM64_CNP
|
||||
at runtime, and does not affect PEs that do not implement
|
||||
this feature.
|
||||
|
||||
endmenu
|
||||
endmenu # "ARMv8.2 architectural features"
|
||||
|
||||
menu "ARMv8.3 architectural features"
|
||||
|
||||
@ -1744,7 +1743,7 @@ config AS_HAS_PAC
|
||||
config AS_HAS_CFI_NEGATE_RA_STATE
|
||||
def_bool $(as-instr,.cfi_startproc\n.cfi_negate_ra_state\n.cfi_endproc\n)
|
||||
|
||||
endmenu
|
||||
endmenu # "ARMv8.3 architectural features"
|
||||
|
||||
menu "ARMv8.4 architectural features"
|
||||
|
||||
@ -1785,7 +1784,7 @@ config ARM64_TLB_RANGE
|
||||
The feature introduces new assembly instructions, and they are
supported when binutils >= 2.30.
|
||||
|
||||
endmenu
|
||||
endmenu # "ARMv8.4 architectural features"
|
||||
|
||||
menu "ARMv8.5 architectural features"
|
||||
|
||||
@ -1871,6 +1870,7 @@ config ARM64_MTE
|
||||
depends on AS_HAS_LSE_ATOMICS
|
||||
# Required for tag checking in the uaccess routines
|
||||
depends on ARM64_PAN
|
||||
select ARCH_HAS_SUBPAGE_FAULTS
|
||||
select ARCH_USES_HIGH_VMA_FLAGS
|
||||
help
|
||||
Memory Tagging (part of the ARMv8.5 Extensions) provides
|
||||
@ -1892,7 +1892,7 @@ config ARM64_MTE
|
||||
|
||||
Documentation/arm64/memory-tagging-extension.rst.
|
||||
|
||||
endmenu
|
||||
endmenu # "ARMv8.5 architectural features"
|
||||
|
||||
menu "ARMv8.7 architectural features"
|
||||
|
||||
@ -1901,12 +1901,12 @@ config ARM64_EPAN
|
||||
default y
|
||||
depends on ARM64_PAN
|
||||
help
|
||||
Enhanced Privileged Access Never (EPAN) allows Privileged
|
||||
Access Never to be used with Execute-only mappings.
|
||||
Enhanced Privileged Access Never (EPAN) allows Privileged
|
||||
Access Never to be used with Execute-only mappings.
|
||||
|
||||
The feature is detected at runtime, and will remain disabled
|
||||
if the cpu does not implement the feature.
|
||||
endmenu
|
||||
The feature is detected at runtime, and will remain disabled
|
||||
if the cpu does not implement the feature.
|
||||
endmenu # "ARMv8.7 architectural features"
|
||||
|
||||
config ARM64_SVE
|
||||
bool "ARM Scalable Vector Extension support"
|
||||
@ -1939,6 +1939,17 @@ config ARM64_SVE
|
||||
booting the kernel. If unsure and you are not observing these
|
||||
symptoms, you should assume that it is safe to say Y.
|
||||
|
||||
config ARM64_SME
|
||||
bool "ARM Scalable Matrix Extension support"
|
||||
default y
|
||||
depends on ARM64_SVE
|
||||
help
|
||||
The Scalable Matrix Extension (SME) is an extension to the AArch64
|
||||
execution state which utilises a substantial subset of the SVE
|
||||
instruction set, together with the addition of new architectural
|
||||
register state capable of holding two dimensional matrix tiles to
|
||||
enable various matrix operations.
|
||||
|
||||
config ARM64_MODULE_PLTS
|
||||
bool "Use PLTs to allow module memory to spill over into vmalloc area"
|
||||
depends on MODULES
|
||||
@ -1982,7 +1993,7 @@ config ARM64_DEBUG_PRIORITY_MASKING
|
||||
the validity of ICC_PMR_EL1 when calling concerned functions.
|
||||
|
||||
If unsure, say N
|
||||
endif
|
||||
endif # ARM64_PSEUDO_NMI
|
||||
|
||||
config RELOCATABLE
|
||||
bool "Build a relocatable kernel image" if EXPERT
|
||||
@ -2041,7 +2052,19 @@ config STACKPROTECTOR_PER_TASK
|
||||
def_bool y
|
||||
depends on STACKPROTECTOR && CC_HAVE_STACKPROTECTOR_SYSREG
|
||||
|
||||
endmenu
|
||||
# The GPIO number here must be sorted by descending number. In case of
|
||||
# a multiplatform kernel, we just want the highest value required by the
|
||||
# selected platforms.
|
||||
config ARCH_NR_GPIO
|
||||
int
|
||||
default 2048 if ARCH_APPLE
|
||||
default 0
|
||||
help
|
||||
Maximum number of GPIOs in the system.
|
||||
|
||||
If unsure, leave the default value.
|
||||
|
||||
endmenu # "Kernel Features"
|
||||
|
||||
menu "Boot options"
|
||||
|
||||
@ -2105,7 +2128,7 @@ config EFI
|
||||
help
|
||||
This option provides support for runtime services provided
|
||||
by UEFI firmware (such as non-volatile variables, realtime
|
||||
clock, and platform reset). A UEFI stub is also provided to
|
||||
clock, and platform reset). A UEFI stub is also provided to
|
||||
allow the kernel to be booted as an EFI application. This
|
||||
is only useful on systems that have UEFI firmware.
|
||||
|
||||
@ -2120,7 +2143,7 @@ config DMI
|
||||
However, even with this option, the resultant kernel should
|
||||
continue to boot on existing non-UEFI platforms.
|
||||
|
||||
endmenu
|
||||
endmenu # "Boot options"
|
||||
|
||||
config SYSVIPC_COMPAT
|
||||
def_bool y
|
||||
@ -2141,7 +2164,7 @@ config ARCH_HIBERNATION_HEADER
|
||||
config ARCH_SUSPEND_POSSIBLE
|
||||
def_bool y
|
||||
|
||||
endmenu
|
||||
endmenu # "Power management options"
|
||||
|
||||
menu "CPU Power Management"
|
||||
|
||||
@ -2149,7 +2172,7 @@ source "drivers/cpuidle/Kconfig"
|
||||
|
||||
source "drivers/cpufreq/Kconfig"
|
||||
|
||||
endmenu
|
||||
endmenu # "CPU Power Management"
|
||||
|
||||
source "drivers/acpi/Kconfig"
|
||||
|
||||
@ -2157,4 +2180,4 @@ source "arch/arm64/kvm/Kconfig"
|
||||
|
||||
if CRYPTO
|
||||
source "arch/arm64/crypto/Kconfig"
|
||||
endif
|
||||
endif # CRYPTO
|
||||
|
@ -325,4 +325,4 @@ config ARCH_ZYNQMP
|
||||
help
|
||||
This enables support for Xilinx ZynqMP Family
|
||||
|
||||
endmenu
|
||||
endmenu # "Platform selection"
|
||||
|
@ -58,11 +58,15 @@ struct cpuinfo_arm64 {
|
||||
u64 reg_id_aa64pfr0;
|
||||
u64 reg_id_aa64pfr1;
|
||||
u64 reg_id_aa64zfr0;
|
||||
u64 reg_id_aa64smfr0;
|
||||
|
||||
struct cpuinfo_32bit aarch32;
|
||||
|
||||
/* pseudo-ZCR for recording maximum ZCR_EL1 LEN value: */
|
||||
u64 reg_zcr;
|
||||
|
||||
/* pseudo-SMCR for recording maximum SMCR_EL1 LEN value: */
|
||||
u64 reg_smcr;
|
||||
};
|
||||
|
||||
DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data);
|
||||
|
@ -622,6 +622,13 @@ static inline bool id_aa64pfr0_sve(u64 pfr0)
|
||||
return val > 0;
|
||||
}
|
||||
|
||||
static inline bool id_aa64pfr1_sme(u64 pfr1)
|
||||
{
|
||||
u32 val = cpuid_feature_extract_unsigned_field(pfr1, ID_AA64PFR1_SME_SHIFT);
|
||||
|
||||
return val > 0;
|
||||
}
|
||||
|
||||
static inline bool id_aa64pfr1_mte(u64 pfr1)
|
||||
{
|
||||
u32 val = cpuid_feature_extract_unsigned_field(pfr1, ID_AA64PFR1_MTE_SHIFT);
|
||||
@ -759,6 +766,23 @@ static __always_inline bool system_supports_sve(void)
|
||||
cpus_have_const_cap(ARM64_SVE);
|
||||
}
|
||||
|
||||
static __always_inline bool system_supports_sme(void)
|
||||
{
|
||||
return IS_ENABLED(CONFIG_ARM64_SME) &&
|
||||
cpus_have_const_cap(ARM64_SME);
|
||||
}
|
||||
|
||||
static __always_inline bool system_supports_fa64(void)
|
||||
{
|
||||
return IS_ENABLED(CONFIG_ARM64_SME) &&
|
||||
cpus_have_const_cap(ARM64_SME_FA64);
|
||||
}
|
||||
|
||||
static __always_inline bool system_supports_tpidr2(void)
|
||||
{
|
||||
return system_supports_sme();
|
||||
}
|
||||
|
||||
static __always_inline bool system_supports_cnp(void)
|
||||
{
|
||||
return IS_ENABLED(CONFIG_ARM64_CNP) &&
|
||||
|
@ -36,7 +36,7 @@
|
||||
#define MIDR_VARIANT(midr) \
|
||||
(((midr) & MIDR_VARIANT_MASK) >> MIDR_VARIANT_SHIFT)
|
||||
#define MIDR_IMPLEMENTOR_SHIFT 24
|
||||
#define MIDR_IMPLEMENTOR_MASK (0xff << MIDR_IMPLEMENTOR_SHIFT)
|
||||
#define MIDR_IMPLEMENTOR_MASK (0xffU << MIDR_IMPLEMENTOR_SHIFT)
|
||||
#define MIDR_IMPLEMENTOR(midr) \
|
||||
(((midr) & MIDR_IMPLEMENTOR_MASK) >> MIDR_IMPLEMENTOR_SHIFT)
|
||||
|
||||
|
@ -143,6 +143,50 @@
|
||||
.Lskip_sve_\@:
|
||||
.endm
|
||||
|
||||
/* SME register access and priority mapping */
|
||||
.macro __init_el2_nvhe_sme
|
||||
mrs x1, id_aa64pfr1_el1
|
||||
ubfx x1, x1, #ID_AA64PFR1_SME_SHIFT, #4
|
||||
cbz x1, .Lskip_sme_\@
|
||||
|
||||
bic x0, x0, #CPTR_EL2_TSM // Also disable SME traps
|
||||
msr cptr_el2, x0 // Disable copro. traps to EL2
|
||||
isb
|
||||
|
||||
mrs x1, sctlr_el2
|
||||
orr x1, x1, #SCTLR_ELx_ENTP2 // Disable TPIDR2 traps
|
||||
msr sctlr_el2, x1
|
||||
isb
|
||||
|
||||
mov x1, #0 // SMCR controls
|
||||
|
||||
mrs_s x2, SYS_ID_AA64SMFR0_EL1
|
||||
ubfx x2, x2, #ID_AA64SMFR0_FA64_SHIFT, #1 // Full FP in SM?
|
||||
cbz x2, .Lskip_sme_fa64_\@
|
||||
|
||||
orr x1, x1, SMCR_ELx_FA64_MASK
|
||||
.Lskip_sme_fa64_\@:
|
||||
|
||||
orr x1, x1, #SMCR_ELx_LEN_MASK // Enable full SME vector
|
||||
msr_s SYS_SMCR_EL2, x1 // length for EL1.
|
||||
|
||||
mrs_s x1, SYS_SMIDR_EL1 // Priority mapping supported?
|
||||
ubfx x1, x1, #SYS_SMIDR_EL1_SMPS_SHIFT, #1
|
||||
cbz x1, .Lskip_sme_\@
|
||||
|
||||
msr_s SYS_SMPRIMAP_EL2, xzr // Make all priorities equal
|
||||
|
||||
mrs x1, id_aa64mmfr1_el1 // HCRX_EL2 present?
|
||||
ubfx x1, x1, #ID_AA64MMFR1_HCX_SHIFT, #4
|
||||
cbz x1, .Lskip_sme_\@
|
||||
|
||||
mrs_s x1, SYS_HCRX_EL2
|
||||
orr x1, x1, #HCRX_EL2_SMPME_MASK // Enable priority mapping
|
||||
msr_s SYS_HCRX_EL2, x1
|
||||
|
||||
.Lskip_sme_\@:
|
||||
.endm
|
||||
|
||||
/* Disable any fine grained traps */
|
||||
.macro __init_el2_fgt
|
||||
mrs x1, id_aa64mmfr0_el1
|
||||
@ -153,15 +197,26 @@
|
||||
mrs x1, id_aa64dfr0_el1
|
||||
ubfx x1, x1, #ID_AA64DFR0_PMSVER_SHIFT, #4
|
||||
cmp x1, #3
|
||||
b.lt .Lset_fgt_\@
|
||||
b.lt .Lset_debug_fgt_\@
|
||||
/* Disable PMSNEVFR_EL1 read and write traps */
|
||||
orr x0, x0, #(1 << 62)
|
||||
|
||||
.Lset_fgt_\@:
|
||||
.Lset_debug_fgt_\@:
|
||||
msr_s SYS_HDFGRTR_EL2, x0
|
||||
msr_s SYS_HDFGWTR_EL2, x0
|
||||
msr_s SYS_HFGRTR_EL2, xzr
|
||||
msr_s SYS_HFGWTR_EL2, xzr
|
||||
|
||||
mov x0, xzr
|
||||
mrs x1, id_aa64pfr1_el1
|
||||
ubfx x1, x1, #ID_AA64PFR1_SME_SHIFT, #4
|
||||
cbz x1, .Lset_fgt_\@
|
||||
|
||||
/* Disable nVHE traps of TPIDR2 and SMPRI */
|
||||
orr x0, x0, #HFGxTR_EL2_nSMPRI_EL1_MASK
|
||||
orr x0, x0, #HFGxTR_EL2_nTPIDR2_EL0_MASK
|
||||
|
||||
.Lset_fgt_\@:
|
||||
msr_s SYS_HFGRTR_EL2, x0
|
||||
msr_s SYS_HFGWTR_EL2, x0
|
||||
msr_s SYS_HFGITR_EL2, xzr
|
||||
|
||||
mrs x1, id_aa64pfr0_el1 // AMU traps UNDEF without AMU
|
||||
@ -196,6 +251,7 @@
|
||||
__init_el2_nvhe_idregs
|
||||
__init_el2_nvhe_cptr
|
||||
__init_el2_nvhe_sve
|
||||
__init_el2_nvhe_sme
|
||||
__init_el2_fgt
|
||||
__init_el2_nvhe_prepare_eret
|
||||
.endm
|
||||
|
@ -37,7 +37,8 @@
|
||||
#define ESR_ELx_EC_ERET (0x1a) /* EL2 only */
|
||||
/* Unallocated EC: 0x1B */
|
||||
#define ESR_ELx_EC_FPAC (0x1C) /* EL1 and above */
|
||||
/* Unallocated EC: 0x1D - 0x1E */
|
||||
#define ESR_ELx_EC_SME (0x1D)
|
||||
/* Unallocated EC: 0x1E */
|
||||
#define ESR_ELx_EC_IMP_DEF (0x1f) /* EL3 only */
|
||||
#define ESR_ELx_EC_IABT_LOW (0x20)
|
||||
#define ESR_ELx_EC_IABT_CUR (0x21)
|
||||
@ -75,6 +76,7 @@
|
||||
#define ESR_ELx_IL_SHIFT (25)
|
||||
#define ESR_ELx_IL (UL(1) << ESR_ELx_IL_SHIFT)
|
||||
#define ESR_ELx_ISS_MASK (ESR_ELx_IL - 1)
|
||||
#define ESR_ELx_ISS(esr) ((esr) & ESR_ELx_ISS_MASK)
|
||||
|
||||
/* ISS field definitions shared by different classes */
|
||||
#define ESR_ELx_WNR_SHIFT (6)
|
||||
@ -327,6 +329,15 @@
|
||||
#define ESR_ELx_CP15_32_ISS_SYS_CNTFRQ (ESR_ELx_CP15_32_ISS_SYS_VAL(0, 0, 14, 0) |\
|
||||
ESR_ELx_CP15_32_ISS_DIR_READ)
|
||||
|
||||
/*
|
||||
* ISS values for SME traps
|
||||
*/
|
||||
|
||||
#define ESR_ELx_SME_ISS_SME_DISABLED 0
|
||||
#define ESR_ELx_SME_ISS_ILL 1
|
||||
#define ESR_ELx_SME_ISS_SM_DISABLED 2
|
||||
#define ESR_ELx_SME_ISS_ZA_DISABLED 3
|
||||
|
||||
#ifndef __ASSEMBLY__
|
||||
#include <asm/types.h>
|
||||
|
||||
|
@ -64,6 +64,7 @@ void do_debug_exception(unsigned long addr_if_watchpoint, unsigned int esr,
|
||||
struct pt_regs *regs);
|
||||
void do_fpsimd_acc(unsigned int esr, struct pt_regs *regs);
|
||||
void do_sve_acc(unsigned int esr, struct pt_regs *regs);
|
||||
void do_sme_acc(unsigned int esr, struct pt_regs *regs);
|
||||
void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs);
|
||||
void do_sysinstr(unsigned int esr, struct pt_regs *regs);
|
||||
void do_sp_pc_abort(unsigned long addr, unsigned int esr, struct pt_regs *regs);
|
||||
|
@ -32,6 +32,18 @@
|
||||
#define VFP_STATE_SIZE ((32 * 8) + 4)
|
||||
#endif
|
||||
|
||||
/*
|
||||
* When we defined the maximum SVE vector length we defined the ABI so
|
||||
* that the maximum vector length included all the reserved for future
|
||||
* expansion bits in ZCR rather than those just currently defined by
|
||||
* the architecture. While SME follows a similar pattern the fact that
|
||||
* it includes a square matrix means that any allocations that attempt
|
||||
* to cover the maximum potential vector length (such as happen with
|
||||
* the regset used for ptrace) end up being extremely large. Define
|
||||
* the much lower actual limit for use in such situations.
|
||||
*/
|
||||
#define SME_VQ_MAX 16
|
||||
|
||||
struct task_struct;
|
||||
|
||||
extern void fpsimd_save_state(struct user_fpsimd_state *state);
|
||||
@ -46,11 +58,23 @@ extern void fpsimd_restore_current_state(void);
|
||||
extern void fpsimd_update_current_state(struct user_fpsimd_state const *state);
|
||||
|
||||
extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state,
|
||||
void *sve_state, unsigned int sve_vl);
|
||||
void *sve_state, unsigned int sve_vl,
|
||||
void *za_state, unsigned int sme_vl,
|
||||
u64 *svcr);
|
||||
|
||||
extern void fpsimd_flush_task_state(struct task_struct *target);
|
||||
extern void fpsimd_save_and_flush_cpu_state(void);
|
||||
|
||||
static inline bool thread_sm_enabled(struct thread_struct *thread)
|
||||
{
|
||||
return system_supports_sme() && (thread->svcr & SYS_SVCR_EL0_SM_MASK);
|
||||
}
|
||||
|
||||
static inline bool thread_za_enabled(struct thread_struct *thread)
|
||||
{
|
||||
return system_supports_sme() && (thread->svcr & SYS_SVCR_EL0_ZA_MASK);
|
||||
}
|
||||
|
||||
/* Maximum VL that SVE/SME VL-agnostic software can transparently support */
|
||||
#define VL_ARCH_MAX 0x100
|
||||
|
||||
@ -62,7 +86,14 @@ static inline size_t sve_ffr_offset(int vl)
|
||||
|
||||
static inline void *sve_pffr(struct thread_struct *thread)
|
||||
{
|
||||
return (char *)thread->sve_state + sve_ffr_offset(thread_get_sve_vl(thread));
|
||||
unsigned int vl;
|
||||
|
||||
if (system_supports_sme() && thread_sm_enabled(thread))
|
||||
vl = thread_get_sme_vl(thread);
|
||||
else
|
||||
vl = thread_get_sve_vl(thread);
|
||||
|
||||
return (char *)thread->sve_state + sve_ffr_offset(vl);
|
||||
}
|
||||
|
||||
extern void sve_save_state(void *state, u32 *pfpsr, int save_ffr);
|
||||
@ -71,11 +102,17 @@ extern void sve_load_state(void const *state, u32 const *pfpsr,
|
||||
extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
|
||||
extern unsigned int sve_get_vl(void);
|
||||
extern void sve_set_vq(unsigned long vq_minus_1);
|
||||
extern void sme_set_vq(unsigned long vq_minus_1);
|
||||
extern void za_save_state(void *state);
|
||||
extern void za_load_state(void const *state);
|
||||
|
||||
struct arm64_cpu_capabilities;
|
||||
extern void sve_kernel_enable(const struct arm64_cpu_capabilities *__unused);
|
||||
extern void sme_kernel_enable(const struct arm64_cpu_capabilities *__unused);
|
||||
extern void fa64_kernel_enable(const struct arm64_cpu_capabilities *__unused);
|
||||
|
||||
extern u64 read_zcr_features(void);
|
||||
extern u64 read_smcr_features(void);
|
||||
|
||||
/*
|
||||
* Helpers to translate bit indices in sve_vq_map to VQ values (and
|
||||
@ -119,6 +156,7 @@ struct vl_info {
|
||||
extern void sve_alloc(struct task_struct *task);
|
||||
extern void fpsimd_release_task(struct task_struct *task);
|
||||
extern void fpsimd_sync_to_sve(struct task_struct *task);
|
||||
extern void fpsimd_force_sync_to_sve(struct task_struct *task);
|
||||
extern void sve_sync_to_fpsimd(struct task_struct *task);
|
||||
extern void sve_sync_from_fpsimd_zeropad(struct task_struct *task);
|
||||
|
||||
@ -170,6 +208,12 @@ static inline void write_vl(enum vec_type type, u64 val)
|
||||
tmp = read_sysreg_s(SYS_ZCR_EL1) & ~ZCR_ELx_LEN_MASK;
|
||||
write_sysreg_s(tmp | val, SYS_ZCR_EL1);
|
||||
break;
|
||||
#endif
|
||||
#ifdef CONFIG_ARM64_SME
|
||||
case ARM64_VEC_SME:
|
||||
tmp = read_sysreg_s(SYS_SMCR_EL1) & ~SMCR_ELx_LEN_MASK;
|
||||
write_sysreg_s(tmp | val, SYS_SMCR_EL1);
|
||||
break;
|
||||
#endif
|
||||
default:
|
||||
WARN_ON_ONCE(1);
|
||||
@ -208,6 +252,8 @@ static inline bool sve_vq_available(unsigned int vq)
|
||||
return vq_available(ARM64_VEC_SVE, vq);
|
||||
}
|
||||
|
||||
size_t sve_state_size(struct task_struct const *task);
|
||||
|
||||
#else /* ! CONFIG_ARM64_SVE */
|
||||
|
||||
static inline void sve_alloc(struct task_struct *task) { }
|
||||
@ -247,8 +293,93 @@ static inline void vec_update_vq_map(enum vec_type t) { }
|
||||
static inline int vec_verify_vq_map(enum vec_type t) { return 0; }
|
||||
static inline void sve_setup(void) { }
|
||||
|
||||
static inline size_t sve_state_size(struct task_struct const *task)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
#endif /* ! CONFIG_ARM64_SVE */
|
||||
|
||||
#ifdef CONFIG_ARM64_SME
|
||||
|
||||
static inline void sme_user_disable(void)
|
||||
{
|
||||
sysreg_clear_set(cpacr_el1, CPACR_EL1_SMEN_EL0EN, 0);
|
||||
}
|
||||
|
||||
static inline void sme_user_enable(void)
|
||||
{
|
||||
sysreg_clear_set(cpacr_el1, 0, CPACR_EL1_SMEN_EL0EN);
|
||||
}
|
||||
|
||||
static inline void sme_smstart_sm(void)
|
||||
{
|
||||
asm volatile(__msr_s(SYS_SVCR_SMSTART_SM_EL0, "xzr"));
|
||||
}
|
||||
|
||||
static inline void sme_smstop_sm(void)
|
||||
{
|
||||
asm volatile(__msr_s(SYS_SVCR_SMSTOP_SM_EL0, "xzr"));
|
||||
}
|
||||
|
||||
static inline void sme_smstop(void)
|
||||
{
|
||||
asm volatile(__msr_s(SYS_SVCR_SMSTOP_SMZA_EL0, "xzr"));
|
||||
}
|
||||
|
||||
extern void __init sme_setup(void);
|
||||
|
||||
static inline int sme_max_vl(void)
|
||||
{
|
||||
return vec_max_vl(ARM64_VEC_SME);
|
||||
}
|
||||
|
||||
static inline int sme_max_virtualisable_vl(void)
|
||||
{
|
||||
return vec_max_virtualisable_vl(ARM64_VEC_SME);
|
||||
}
|
||||
|
||||
extern void sme_alloc(struct task_struct *task);
|
||||
extern unsigned int sme_get_vl(void);
|
||||
extern int sme_set_current_vl(unsigned long arg);
|
||||
extern int sme_get_current_vl(void);
|
||||
|
||||
/*
|
||||
* Return how many bytes of memory are required to store the full SME
|
||||
* specific state (currently just ZA) for task, given task's currently
|
||||
* configured vector length.
|
||||
*/
|
||||
static inline size_t za_state_size(struct task_struct const *task)
|
||||
{
|
||||
unsigned int vl = task_get_sme_vl(task);
|
||||
|
||||
return ZA_SIG_REGS_SIZE(sve_vq_from_vl(vl));
|
||||
}
|
||||
|
||||
#else
|
||||
|
||||
static inline void sme_user_disable(void) { BUILD_BUG(); }
|
||||
static inline void sme_user_enable(void) { BUILD_BUG(); }
|
||||
|
||||
static inline void sme_smstart_sm(void) { }
|
||||
static inline void sme_smstop_sm(void) { }
|
||||
static inline void sme_smstop(void) { }
|
||||
|
||||
static inline void sme_alloc(struct task_struct *task) { }
|
||||
static inline void sme_setup(void) { }
|
||||
static inline unsigned int sme_get_vl(void) { return 0; }
|
||||
static inline int sme_max_vl(void) { return 0; }
|
||||
static inline int sme_max_virtualisable_vl(void) { return 0; }
|
||||
static inline int sme_set_current_vl(unsigned long arg) { return -EINVAL; }
|
||||
static inline int sme_get_current_vl(void) { return -EINVAL; }
|
||||
|
||||
static inline size_t za_state_size(struct task_struct const *task)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
#endif /* ! CONFIG_ARM64_SME */
|
||||
|
||||
/* For use by EFI runtime services calls only */
|
||||
extern void __efi_fpsimd_begin(void);
|
||||
extern void __efi_fpsimd_end(void);
|
||||
|
@ -93,6 +93,12 @@
|
||||
.endif
|
||||
.endm
|
||||
|
||||
.macro _sme_check_wv v
|
||||
.if (\v) < 12 || (\v) > 15
|
||||
.error "Bad vector select register \v."
|
||||
.endif
|
||||
.endm
|
||||
|
||||
/* SVE instruction encodings for non-SVE-capable assemblers */
|
||||
/* (pre binutils 2.28, all kernel capable clang versions support SVE) */
|
||||
|
||||
@ -174,6 +180,54 @@
|
||||
| (\np)
|
||||
.endm
|
||||
|
||||
/* SME instruction encodings for non-SME-capable assemblers */
|
||||
/* (pre binutils 2.38/LLVM 13) */
|
||||
|
||||
/* RDSVL X\nx, #\imm */
|
||||
.macro _sme_rdsvl nx, imm
|
||||
_check_general_reg \nx
|
||||
_check_num (\imm), -0x20, 0x1f
|
||||
.inst 0x04bf5800 \
|
||||
| (\nx) \
|
||||
| (((\imm) & 0x3f) << 5)
|
||||
.endm
|
||||
|
||||
/*
|
||||
* STR (vector from ZA array):
|
||||
* STR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL]
|
||||
*/
|
||||
.macro _sme_str_zav nw, nxbase, offset=0
|
||||
_sme_check_wv \nw
|
||||
_check_general_reg \nxbase
|
||||
_check_num (\offset), -0x100, 0xff
|
||||
.inst 0xe1200000 \
|
||||
| (((\nw) & 3) << 13) \
|
||||
| ((\nxbase) << 5) \
|
||||
| ((\offset) & 7)
|
||||
.endm
|
||||
|
||||
/*
|
||||
* LDR (vector to ZA array):
|
||||
* LDR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL]
|
||||
*/
|
||||
.macro _sme_ldr_zav nw, nxbase, offset=0
|
||||
_sme_check_wv \nw
|
||||
_check_general_reg \nxbase
|
||||
_check_num (\offset), -0x100, 0xff
|
||||
.inst 0xe1000000 \
|
||||
| (((\nw) & 3) << 13) \
|
||||
| ((\nxbase) << 5) \
|
||||
| ((\offset) & 7)
|
||||
.endm
|
||||
|
||||
/*
|
||||
* Zero the entire ZA array
|
||||
* ZERO ZA
|
||||
*/
|
||||
.macro zero_za
|
||||
.inst 0xc00800ff
|
||||
.endm
|
||||
|
||||
.macro __for from:req, to:req
|
||||
.if (\from) == (\to)
|
||||
_for__body %\from
|
||||
@ -208,6 +262,17 @@
|
||||
921:
|
||||
.endm
|
||||
|
||||
/* Update SMCR_EL1.LEN with the new VQ */
|
||||
.macro sme_load_vq xvqminus1, xtmp, xtmp2
|
||||
mrs_s \xtmp, SYS_SMCR_EL1
|
||||
bic \xtmp2, \xtmp, SMCR_ELx_LEN_MASK
|
||||
orr \xtmp2, \xtmp2, \xvqminus1
|
||||
cmp \xtmp2, \xtmp
|
||||
b.eq 921f
|
||||
msr_s SYS_SMCR_EL1, \xtmp2 //self-synchronising
|
||||
921:
|
||||
.endm
|
||||
|
||||
/* Preserve the first 128-bits of Znz and zero the rest. */
|
||||
.macro _sve_flush_z nz
|
||||
_sve_check_zreg \nz
|
||||
@ -254,3 +319,25 @@
|
||||
ldr w\nxtmp, [\xpfpsr, #4]
|
||||
msr fpcr, x\nxtmp
|
||||
.endm
|
||||
|
||||
.macro sme_save_za nxbase, xvl, nw
|
||||
mov w\nw, #0
|
||||
|
||||
423:
|
||||
_sme_str_zav \nw, \nxbase
|
||||
add x\nxbase, x\nxbase, \xvl
|
||||
add x\nw, x\nw, #1
|
||||
cmp \xvl, x\nw
|
||||
bne 423b
|
||||
.endm
|
||||
|
||||
.macro sme_load_za nxbase, xvl, nw
|
||||
mov w\nw, #0
|
||||
|
||||
423:
|
||||
_sme_ldr_zav \nw, \nxbase
|
||||
add x\nxbase, x\nxbase, \xvl
|
||||
add x\nw, x\nw, #1
|
||||
cmp \xvl, x\nw
|
||||
bne 423b
|
||||
.endm
|
||||
|
@ -80,8 +80,15 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
|
||||
|
||||
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
|
||||
struct dyn_ftrace;
|
||||
struct ftrace_ops;
|
||||
struct ftrace_regs;
|
||||
|
||||
int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
|
||||
#define ftrace_init_nop ftrace_init_nop
|
||||
|
||||
void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
|
||||
struct ftrace_ops *op, struct ftrace_regs *fregs);
|
||||
#define ftrace_graph_func ftrace_graph_func
|
||||
#endif
|
||||
|
||||
#define ftrace_return_address(n) return_address(n)
|
||||
|
@ -44,6 +44,8 @@ extern void huge_ptep_clear_flush(struct vm_area_struct *vma,
|
||||
#define __HAVE_ARCH_HUGE_PTE_CLEAR
|
||||
extern void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
|
||||
pte_t *ptep, unsigned long sz);
|
||||
#define __HAVE_ARCH_HUGE_PTEP_GET
|
||||
extern pte_t huge_ptep_get(pte_t *ptep);
|
||||
extern void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr,
|
||||
pte_t *ptep, pte_t pte, unsigned long sz);
|
||||
#define set_huge_swap_pte_at set_huge_swap_pte_at
|
||||
|
@ -109,6 +109,14 @@
|
||||
#define KERNEL_HWCAP_AFP __khwcap2_feature(AFP)
|
||||
#define KERNEL_HWCAP_RPRES __khwcap2_feature(RPRES)
|
||||
#define KERNEL_HWCAP_MTE3 __khwcap2_feature(MTE3)
|
||||
#define KERNEL_HWCAP_SME __khwcap2_feature(SME)
|
||||
#define KERNEL_HWCAP_SME_I16I64 __khwcap2_feature(SME_I16I64)
|
||||
#define KERNEL_HWCAP_SME_F64F64 __khwcap2_feature(SME_F64F64)
|
||||
#define KERNEL_HWCAP_SME_I8I32 __khwcap2_feature(SME_I8I32)
|
||||
#define KERNEL_HWCAP_SME_F16F32 __khwcap2_feature(SME_F16F32)
|
||||
#define KERNEL_HWCAP_SME_B16F32 __khwcap2_feature(SME_B16F32)
|
||||
#define KERNEL_HWCAP_SME_F32F32 __khwcap2_feature(SME_F32F32)
|
||||
#define KERNEL_HWCAP_SME_FA64 __khwcap2_feature(SME_FA64)
|
||||
|
||||
/*
|
||||
* This yields a mask that user programs can use to figure out what
|
||||
|
@ -279,6 +279,7 @@
|
||||
#define CPTR_EL2_TCPAC (1U << 31)
|
||||
#define CPTR_EL2_TAM (1 << 30)
|
||||
#define CPTR_EL2_TTA (1 << 20)
|
||||
#define CPTR_EL2_TSM (1 << 12)
|
||||
#define CPTR_EL2_TFP (1 << CPTR_EL2_TFP_SHIFT)
|
||||
#define CPTR_EL2_TZ (1 << 8)
|
||||
#define CPTR_NVHE_EL2_RES1 0x000032ff /* known RES1 bits in CPTR_EL2 (nVHE) */
|
||||
|
@ -295,8 +295,11 @@ struct vcpu_reset_state {
|
||||
|
||||
struct kvm_vcpu_arch {
|
||||
struct kvm_cpu_context ctxt;
|
||||
|
||||
/* Guest floating point state */
|
||||
void *sve_state;
|
||||
unsigned int sve_max_vl;
|
||||
u64 svcr;
|
||||
|
||||
/* Stage 2 paging state used by the hardware on next switch */
|
||||
struct kvm_s2_mmu *hw_mmu;
|
||||
@ -451,6 +454,7 @@ struct kvm_vcpu_arch {
|
||||
#define KVM_ARM64_DEBUG_STATE_SAVE_TRBE (1 << 13) /* Save TRBE context if active */
|
||||
#define KVM_ARM64_FP_FOREIGN_FPSTATE (1 << 14)
|
||||
#define KVM_ARM64_ON_UNSUPPORTED_CPU (1 << 15) /* Physical CPU not in supported_cpus */
|
||||
#define KVM_ARM64_HOST_SME_ENABLED (1 << 16) /* SME enabled for EL0 */
|
||||
|
||||
#define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
|
||||
KVM_GUESTDBG_USE_SW_BP | \
|
||||
|
@ -47,6 +47,7 @@ long set_mte_ctrl(struct task_struct *task, unsigned long arg);
|
||||
long get_mte_ctrl(struct task_struct *task);
|
||||
int mte_ptrace_copy_tags(struct task_struct *child, long request,
|
||||
unsigned long addr, unsigned long data);
|
||||
size_t mte_probe_user_range(const char __user *uaddr, size_t size);
|
||||
|
||||
#else /* CONFIG_ARM64_MTE */
|
||||
|
||||
|
@ -49,7 +49,7 @@
|
||||
#define PMD_SHIFT ARM64_HW_PGTABLE_LEVEL_SHIFT(2)
|
||||
#define PMD_SIZE (_AC(1, UL) << PMD_SHIFT)
|
||||
#define PMD_MASK (~(PMD_SIZE-1))
|
||||
#define PTRS_PER_PMD PTRS_PER_PTE
|
||||
#define PTRS_PER_PMD (1 << (PAGE_SHIFT - 3))
|
||||
#endif
|
||||
|
||||
/*
|
||||
@ -59,7 +59,7 @@
|
||||
#define PUD_SHIFT ARM64_HW_PGTABLE_LEVEL_SHIFT(1)
|
||||
#define PUD_SIZE (_AC(1, UL) << PUD_SHIFT)
|
||||
#define PUD_MASK (~(PUD_SIZE-1))
|
||||
#define PTRS_PER_PUD PTRS_PER_PTE
|
||||
#define PTRS_PER_PUD (1 << (PAGE_SHIFT - 3))
|
||||
#endif
|
||||
|
||||
/*
|
||||
|
@ -1001,7 +1001,8 @@ static inline void update_mmu_cache(struct vm_area_struct *vma,
|
||||
*/
|
||||
static inline bool arch_faults_on_old_pte(void)
|
||||
{
|
||||
WARN_ON(preemptible());
|
||||
/* The register read below requires a stable CPU to make any sense */
|
||||
cant_migrate();
|
||||
|
||||
return !cpu_has_hw_af();
|
||||
}
|
||||
|
@ -118,6 +118,7 @@ struct debug_info {
|
||||
|
||||
enum vec_type {
|
||||
ARM64_VEC_SVE = 0,
|
||||
ARM64_VEC_SME,
|
||||
ARM64_VEC_MAX,
|
||||
};
|
||||
|
||||
@ -153,6 +154,7 @@ struct thread_struct {
|
||||
|
||||
unsigned int fpsimd_cpu;
|
||||
void *sve_state; /* SVE registers, if any */
|
||||
void *za_state; /* ZA register, if any */
|
||||
unsigned int vl[ARM64_VEC_MAX]; /* vector length */
|
||||
unsigned int vl_onexec[ARM64_VEC_MAX]; /* vl after next exec */
|
||||
unsigned long fault_address; /* fault info */
|
||||
@ -168,6 +170,8 @@ struct thread_struct {
|
||||
u64 mte_ctrl;
|
||||
#endif
|
||||
u64 sctlr_user;
|
||||
u64 svcr;
|
||||
u64 tpidr2_el0;
|
||||
};
|
||||
|
||||
static inline unsigned int thread_get_vl(struct thread_struct *thread,
|
||||
@ -181,6 +185,19 @@ static inline unsigned int thread_get_sve_vl(struct thread_struct *thread)
|
||||
return thread_get_vl(thread, ARM64_VEC_SVE);
|
||||
}
|
||||
|
||||
static inline unsigned int thread_get_sme_vl(struct thread_struct *thread)
|
||||
{
|
||||
return thread_get_vl(thread, ARM64_VEC_SME);
|
||||
}
|
||||
|
||||
static inline unsigned int thread_get_cur_vl(struct thread_struct *thread)
|
||||
{
|
||||
if (system_supports_sme() && (thread->svcr & SYS_SVCR_EL0_SM_MASK))
|
||||
return thread_get_sme_vl(thread);
|
||||
else
|
||||
return thread_get_sve_vl(thread);
|
||||
}
|
||||
|
||||
unsigned int task_get_vl(const struct task_struct *task, enum vec_type type);
|
||||
void task_set_vl(struct task_struct *task, enum vec_type type,
|
||||
unsigned long vl);
|
||||
@ -194,6 +211,11 @@ static inline unsigned int task_get_sve_vl(const struct task_struct *task)
|
||||
return task_get_vl(task, ARM64_VEC_SVE);
|
||||
}
|
||||
|
||||
static inline unsigned int task_get_sme_vl(const struct task_struct *task)
|
||||
{
|
||||
return task_get_vl(task, ARM64_VEC_SME);
|
||||
}
|
||||
|
||||
static inline void task_set_sve_vl(struct task_struct *task, unsigned long vl)
|
||||
{
|
||||
task_set_vl(task, ARM64_VEC_SVE, vl);
|
||||
@ -354,9 +376,11 @@ extern void __init minsigstksz_setup(void);
|
||||
*/
|
||||
#include <asm/fpsimd.h>
|
||||
|
||||
/* Userspace interface for PR_SVE_{SET,GET}_VL prctl()s: */
|
||||
/* Userspace interface for PR_S[MV]E_{SET,GET}_VL prctl()s: */
|
||||
#define SVE_SET_VL(arg) sve_set_current_vl(arg)
|
||||
#define SVE_GET_VL() sve_get_current_vl()
|
||||
#define SME_SET_VL(arg) sme_set_current_vl(arg)
|
||||
#define SME_GET_VL() sme_get_current_vl()
|
||||
|
||||
/* PR_PAC_RESET_KEYS prctl */
|
||||
#define PAC_RESET_KEYS(tsk, arg) ptrauth_prctl_reset_keys(tsk, arg)
|
||||
|
@ -31,38 +31,6 @@ struct stack_info {
|
||||
enum stack_type type;
|
||||
};
|
||||
|
||||
/*
|
||||
* A snapshot of a frame record or fp/lr register values, along with some
|
||||
* accounting information necessary for robust unwinding.
|
||||
*
|
||||
* @fp: The fp value in the frame record (or the real fp)
|
||||
* @pc: The lr value in the frame record (or the real lr)
|
||||
*
|
||||
* @stacks_done: Stacks which have been entirely unwound, for which it is no
|
||||
* longer valid to unwind to.
|
||||
*
|
||||
* @prev_fp: The fp that pointed to this frame record, or a synthetic value
|
||||
* of 0. This is used to ensure that within a stack, each
|
||||
* subsequent frame record is at an increasing address.
|
||||
* @prev_type: The type of stack this frame record was on, or a synthetic
|
||||
* value of STACK_TYPE_UNKNOWN. This is used to detect a
|
||||
* transition from one stack to another.
|
||||
*
|
||||
* @kr_cur: When KRETPROBES is selected, holds the kretprobe instance
|
||||
* associated with the most recently encountered replacement lr
|
||||
* value.
|
||||
*/
|
||||
struct stackframe {
|
||||
unsigned long fp;
|
||||
unsigned long pc;
|
||||
DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
|
||||
unsigned long prev_fp;
|
||||
enum stack_type prev_type;
|
||||
#ifdef CONFIG_KRETPROBES
|
||||
struct llist_node *kr_cur;
|
||||
#endif
|
||||
};
|
||||
|
||||
extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
|
||||
const char *loglvl);
|
||||
|
||||
|
@ -118,6 +118,10 @@
|
||||
* System registers, organised loosely by encoding but grouped together
|
||||
* where the architected name contains an index. e.g. ID_MMFR<n>_EL1.
|
||||
*/
|
||||
#define SYS_SVCR_SMSTOP_SM_EL0 sys_reg(0, 3, 4, 2, 3)
|
||||
#define SYS_SVCR_SMSTART_SM_EL0 sys_reg(0, 3, 4, 3, 3)
|
||||
#define SYS_SVCR_SMSTOP_SMZA_EL0 sys_reg(0, 3, 4, 6, 3)
|
||||
|
||||
#define SYS_OSDTRRX_EL1 sys_reg(2, 0, 0, 0, 2)
|
||||
#define SYS_MDCCINT_EL1 sys_reg(2, 0, 0, 2, 0)
|
||||
#define SYS_MDSCR_EL1 sys_reg(2, 0, 0, 2, 2)
|
||||
@ -181,6 +185,7 @@
|
||||
#define SYS_ID_AA64PFR0_EL1 sys_reg(3, 0, 0, 4, 0)
|
||||
#define SYS_ID_AA64PFR1_EL1 sys_reg(3, 0, 0, 4, 1)
|
||||
#define SYS_ID_AA64ZFR0_EL1 sys_reg(3, 0, 0, 4, 4)
|
||||
#define SYS_ID_AA64SMFR0_EL1 sys_reg(3, 0, 0, 4, 5)
|
||||
|
||||
#define SYS_ID_AA64DFR0_EL1 sys_reg(3, 0, 0, 5, 0)
|
||||
#define SYS_ID_AA64DFR1_EL1 sys_reg(3, 0, 0, 5, 1)
|
||||
@ -204,6 +209,8 @@
|
||||
|
||||
#define SYS_ZCR_EL1 sys_reg(3, 0, 1, 2, 0)
|
||||
#define SYS_TRFCR_EL1 sys_reg(3, 0, 1, 2, 1)
|
||||
#define SYS_SMPRI_EL1 sys_reg(3, 0, 1, 2, 4)
|
||||
#define SYS_SMCR_EL1 sys_reg(3, 0, 1, 2, 6)
|
||||
|
||||
#define SYS_TTBR0_EL1 sys_reg(3, 0, 2, 0, 0)
|
||||
#define SYS_TTBR1_EL1 sys_reg(3, 0, 2, 0, 1)
|
||||
@ -396,6 +403,8 @@
|
||||
#define TRBIDR_ALIGN_MASK GENMASK(3, 0)
|
||||
#define TRBIDR_ALIGN_SHIFT 0
|
||||
|
||||
#define SMPRI_EL1_PRIORITY_MASK 0xf
|
||||
|
||||
#define SYS_PMINTENSET_EL1 sys_reg(3, 0, 9, 14, 1)
|
||||
#define SYS_PMINTENCLR_EL1 sys_reg(3, 0, 9, 14, 2)
|
||||
|
||||
@ -451,8 +460,13 @@
|
||||
#define SYS_CCSIDR_EL1 sys_reg(3, 1, 0, 0, 0)
|
||||
#define SYS_CLIDR_EL1 sys_reg(3, 1, 0, 0, 1)
|
||||
#define SYS_GMID_EL1 sys_reg(3, 1, 0, 0, 4)
|
||||
#define SYS_SMIDR_EL1 sys_reg(3, 1, 0, 0, 6)
|
||||
#define SYS_AIDR_EL1 sys_reg(3, 1, 0, 0, 7)
|
||||
|
||||
#define SYS_SMIDR_EL1_IMPLEMENTER_SHIFT 24
|
||||
#define SYS_SMIDR_EL1_SMPS_SHIFT 15
|
||||
#define SYS_SMIDR_EL1_AFFINITY_SHIFT 0
|
||||
|
||||
#define SYS_CSSELR_EL1 sys_reg(3, 2, 0, 0, 0)
|
||||
|
||||
#define SYS_CTR_EL0 sys_reg(3, 3, 0, 0, 1)
|
||||
@ -461,6 +475,10 @@
|
||||
#define SYS_RNDR_EL0 sys_reg(3, 3, 2, 4, 0)
|
||||
#define SYS_RNDRRS_EL0 sys_reg(3, 3, 2, 4, 1)
|
||||
|
||||
#define SYS_SVCR_EL0 sys_reg(3, 3, 4, 2, 2)
|
||||
#define SYS_SVCR_EL0_ZA_MASK 2
|
||||
#define SYS_SVCR_EL0_SM_MASK 1
|
||||
|
||||
#define SYS_PMCR_EL0 sys_reg(3, 3, 9, 12, 0)
|
||||
#define SYS_PMCNTENSET_EL0 sys_reg(3, 3, 9, 12, 1)
|
||||
#define SYS_PMCNTENCLR_EL0 sys_reg(3, 3, 9, 12, 2)
|
||||
@ -477,6 +495,7 @@
|
||||
|
||||
#define SYS_TPIDR_EL0 sys_reg(3, 3, 13, 0, 2)
|
||||
#define SYS_TPIDRRO_EL0 sys_reg(3, 3, 13, 0, 3)
|
||||
#define SYS_TPIDR2_EL0 sys_reg(3, 3, 13, 0, 5)
|
||||
|
||||
#define SYS_SCXTNUM_EL0 sys_reg(3, 3, 13, 0, 7)
|
||||
|
||||
@ -546,6 +565,9 @@
|
||||
#define SYS_HFGITR_EL2 sys_reg(3, 4, 1, 1, 6)
|
||||
#define SYS_ZCR_EL2 sys_reg(3, 4, 1, 2, 0)
|
||||
#define SYS_TRFCR_EL2 sys_reg(3, 4, 1, 2, 1)
|
||||
#define SYS_HCRX_EL2 sys_reg(3, 4, 1, 2, 2)
|
||||
#define SYS_SMPRIMAP_EL2 sys_reg(3, 4, 1, 2, 5)
|
||||
#define SYS_SMCR_EL2 sys_reg(3, 4, 1, 2, 6)
|
||||
#define SYS_DACR32_EL2 sys_reg(3, 4, 3, 0, 0)
|
||||
#define SYS_HDFGRTR_EL2 sys_reg(3, 4, 3, 1, 4)
|
||||
#define SYS_HDFGWTR_EL2 sys_reg(3, 4, 3, 1, 5)
|
||||
@ -605,6 +627,7 @@
|
||||
#define SYS_SCTLR_EL12 sys_reg(3, 5, 1, 0, 0)
|
||||
#define SYS_CPACR_EL12 sys_reg(3, 5, 1, 0, 2)
|
||||
#define SYS_ZCR_EL12 sys_reg(3, 5, 1, 2, 0)
|
||||
#define SYS_SMCR_EL12 sys_reg(3, 5, 1, 2, 6)
|
||||
#define SYS_TTBR0_EL12 sys_reg(3, 5, 2, 0, 0)
|
||||
#define SYS_TTBR1_EL12 sys_reg(3, 5, 2, 0, 1)
|
||||
#define SYS_TCR_EL12 sys_reg(3, 5, 2, 0, 2)
|
||||
@ -628,6 +651,7 @@
|
||||
#define SYS_CNTV_CVAL_EL02 sys_reg(3, 5, 14, 3, 2)
|
||||
|
||||
/* Common SCTLR_ELx flags. */
|
||||
#define SCTLR_ELx_ENTP2 (BIT(60))
|
||||
#define SCTLR_ELx_DSSBS (BIT(44))
|
||||
#define SCTLR_ELx_ATA (BIT(43))
|
||||
|
||||
@ -836,6 +860,7 @@
|
||||
#define ID_AA64PFR0_ELx_32BIT_64BIT 0x2
|
||||
|
||||
/* id_aa64pfr1 */
|
||||
#define ID_AA64PFR1_SME_SHIFT 24
|
||||
#define ID_AA64PFR1_MPAMFRAC_SHIFT 16
|
||||
#define ID_AA64PFR1_RASFRAC_SHIFT 12
|
||||
#define ID_AA64PFR1_MTE_SHIFT 8
|
||||
@ -846,6 +871,7 @@
|
||||
#define ID_AA64PFR1_SSBS_PSTATE_ONLY 1
|
||||
#define ID_AA64PFR1_SSBS_PSTATE_INSNS 2
|
||||
#define ID_AA64PFR1_BT_BTI 0x1
|
||||
#define ID_AA64PFR1_SME 1
|
||||
|
||||
#define ID_AA64PFR1_MTE_NI 0x0
|
||||
#define ID_AA64PFR1_MTE_EL0 0x1
|
||||
@ -874,6 +900,23 @@
|
||||
#define ID_AA64ZFR0_AES_PMULL 0x2
|
||||
#define ID_AA64ZFR0_SVEVER_SVE2 0x1
|
||||
|
||||
/* id_aa64smfr0 */
|
||||
#define ID_AA64SMFR0_FA64_SHIFT 63
|
||||
#define ID_AA64SMFR0_I16I64_SHIFT 52
|
||||
#define ID_AA64SMFR0_F64F64_SHIFT 48
|
||||
#define ID_AA64SMFR0_I8I32_SHIFT 36
|
||||
#define ID_AA64SMFR0_F16F32_SHIFT 35
|
||||
#define ID_AA64SMFR0_B16F32_SHIFT 34
|
||||
#define ID_AA64SMFR0_F32F32_SHIFT 32
|
||||
|
||||
#define ID_AA64SMFR0_FA64 0x1
|
||||
#define ID_AA64SMFR0_I16I64 0x4
|
||||
#define ID_AA64SMFR0_F64F64 0x1
|
||||
#define ID_AA64SMFR0_I8I32 0x4
|
||||
#define ID_AA64SMFR0_F16F32 0x1
|
||||
#define ID_AA64SMFR0_B16F32 0x1
|
||||
#define ID_AA64SMFR0_F32F32 0x1
|
||||
|
||||
/* id_aa64mmfr0 */
|
||||
#define ID_AA64MMFR0_ECV_SHIFT 60
|
||||
#define ID_AA64MMFR0_FGT_SHIFT 56
|
||||
@ -926,6 +969,7 @@
|
||||
|
||||
/* id_aa64mmfr1 */
|
||||
#define ID_AA64MMFR1_ECBHB_SHIFT 60
|
||||
#define ID_AA64MMFR1_HCX_SHIFT 40
|
||||
#define ID_AA64MMFR1_AFP_SHIFT 44
|
||||
#define ID_AA64MMFR1_ETS_SHIFT 36
|
||||
#define ID_AA64MMFR1_TWED_SHIFT 32
|
||||
@ -1119,9 +1163,24 @@
|
||||
#define ZCR_ELx_LEN_SIZE 9
|
||||
#define ZCR_ELx_LEN_MASK 0x1ff
|
||||
|
||||
#define SMCR_ELx_FA64_SHIFT 31
|
||||
#define SMCR_ELx_FA64_MASK (1 << SMCR_ELx_FA64_SHIFT)
|
||||
|
||||
/*
|
||||
* The SMCR_ELx_LEN_* definitions intentionally include bits [8:4] which
|
||||
* are reserved by the SME architecture for future expansion of the LEN
|
||||
* field, with compatible semantics.
|
||||
*/
|
||||
#define SMCR_ELx_LEN_SHIFT 0
|
||||
#define SMCR_ELx_LEN_SIZE 9
|
||||
#define SMCR_ELx_LEN_MASK 0x1ff
|
||||
|
||||
#define CPACR_EL1_FPEN_EL1EN (BIT(20)) /* enable EL1 access */
|
||||
#define CPACR_EL1_FPEN_EL0EN (BIT(21)) /* enable EL0 access, if EL1EN set */
|
||||
|
||||
#define CPACR_EL1_SMEN_EL1EN (BIT(24)) /* enable EL1 access */
|
||||
#define CPACR_EL1_SMEN_EL0EN (BIT(25)) /* enable EL0 access, if EL1EN set */
|
||||
|
||||
#define CPACR_EL1_ZEN_EL1EN (BIT(16)) /* enable EL1 access */
|
||||
#define CPACR_EL1_ZEN_EL0EN (BIT(17)) /* enable EL0 access, if EL1EN set */
|
||||
|
||||
@ -1170,6 +1229,8 @@
|
||||
#define TRFCR_ELx_ExTRE BIT(1)
|
||||
#define TRFCR_ELx_E0TRE BIT(0)
|
||||
|
||||
/* HCRX_EL2 definitions */
|
||||
#define HCRX_EL2_SMPME_MASK (1 << 5)
|
||||
|
||||
/* GIC Hypervisor interface registers */
|
||||
/* ICH_MISR_EL2 bit definitions */
|
||||
@ -1233,6 +1294,12 @@
|
||||
#define ICH_VTR_TDS_SHIFT 19
|
||||
#define ICH_VTR_TDS_MASK (1 << ICH_VTR_TDS_SHIFT)
|
||||
|
||||
/* HFG[WR]TR_EL2 bit definitions */
|
||||
#define HFGxTR_EL2_nTPIDR2_EL0_SHIFT 55
|
||||
#define HFGxTR_EL2_nTPIDR2_EL0_MASK BIT_MASK(HFGxTR_EL2_nTPIDR2_EL0_SHIFT)
|
||||
#define HFGxTR_EL2_nSMPRI_EL1_SHIFT 54
|
||||
#define HFGxTR_EL2_nSMPRI_EL1_MASK BIT_MASK(HFGxTR_EL2_nSMPRI_EL1_SHIFT)
|
||||
|
||||
#define ARM64_FEATURE_FIELD_BITS 4
|
||||
|
||||
/* Create a mask for the feature bits of the specified feature. */
|
||||
|
@ -82,6 +82,8 @@ int arch_dup_task_struct(struct task_struct *dst,
|
||||
#define TIF_SVE_VL_INHERIT 24 /* Inherit SVE vl_onexec across exec */
|
||||
#define TIF_SSBD 25 /* Wants SSB mitigation */
|
||||
#define TIF_TAGGED_ADDR 26 /* Allow tagged user addresses */
|
||||
#define TIF_SME 27 /* SME in use */
|
||||
#define TIF_SME_VL_INHERIT 28 /* Inherit SME vl_onexec across exec */
|
||||
|
||||
#define _TIF_SIGPENDING (1 << TIF_SIGPENDING)
|
||||
#define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
|
||||
|
@ -460,4 +460,19 @@ static inline int __copy_from_user_flushcache(void *dst, const void __user *src,
|
||||
}
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_ARCH_HAS_SUBPAGE_FAULTS

/*
 * Return 0 on success, the number of bytes not probed otherwise.
 */
static inline size_t probe_subpage_writeable(const char __user *uaddr,
					     size_t size)
{
	if (!system_supports_mte())
		return 0;
	return mte_probe_user_range(uaddr, size);
}

#endif /* CONFIG_ARCH_HAS_SUBPAGE_FAULTS */
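
For context, a rough sketch of the caller pattern this helper enables, loosely modelled on the search_ioctl() live-lock fix in this series. The helper name copy_out_with_subpage_retry() is hypothetical and the error handling is simplified; only copy_to_user() and fault_in_subpage_writeable() are real interfaces here:

static int copy_out_with_subpage_retry(void __user *ubuf, const void *kbuf,
				       size_t len)
{
	/* Retry the copy as long as faulting the buffer in can still make progress. */
	while (copy_to_user(ubuf, kbuf, len)) {
		if (fault_in_subpage_writeable(ubuf, len))
			return -EFAULT;	/* sub-page (e.g. MTE tag) fault, give up */
	}
	return 0;
}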
|
||||
|
||||
#endif /* __ASM_UACCESS_H */
|
||||
|
@@ -79,5 +79,13 @@
#define HWCAP2_AFP		(1 << 20)
#define HWCAP2_RPRES		(1 << 21)
#define HWCAP2_MTE3		(1 << 22)
#define HWCAP2_SME		(1 << 23)
#define HWCAP2_SME_I16I64	(1 << 24)
#define HWCAP2_SME_F64F64	(1 << 25)
#define HWCAP2_SME_I8I32	(1 << 26)
#define HWCAP2_SME_F16F32	(1 << 27)
#define HWCAP2_SME_B16F32	(1 << 28)
#define HWCAP2_SME_F32F32	(1 << 29)
#define HWCAP2_SME_FA64		(1 << 30)

#endif /* _UAPI__ASM_HWCAP_H */
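
Userspace would typically probe these bits through the ELF auxiliary vector. A minimal sketch, not part of the patch; the fallback defines are only needed when building against older headers:

#include <stdio.h>
#include <sys/auxv.h>

#ifndef HWCAP2_SME
#define HWCAP2_SME	(1 << 23)
#define HWCAP2_SME_FA64	(1 << 30)
#endif

int main(void)
{
	unsigned long hwcap2 = getauxval(AT_HWCAP2);

	if (hwcap2 & HWCAP2_SME)
		printf("SME supported%s\n",
		       (hwcap2 & HWCAP2_SME_FA64) ? " (FA64)" : "");
	else
		printf("SME not supported\n");
	return 0;
}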
|
||||
|
@ -109,7 +109,7 @@ struct user_hwdebug_state {
|
||||
} dbg_regs[16];
|
||||
};
|
||||
|
||||
/* SVE/FP/SIMD state (NT_ARM_SVE) */
|
||||
/* SVE/FP/SIMD state (NT_ARM_SVE & NT_ARM_SSVE) */
|
||||
|
||||
struct user_sve_header {
|
||||
__u32 size; /* total meaningful regset content in bytes */
|
||||
@ -220,6 +220,7 @@ struct user_sve_header {
|
||||
(SVE_PT_SVE_PREG_OFFSET(vq, __SVE_NUM_PREGS) - \
|
||||
SVE_PT_SVE_PREGS_OFFSET(vq))
|
||||
|
||||
/* For streaming mode SVE (SSVE) FFR must be read and written as zero */
|
||||
#define SVE_PT_SVE_FFR_OFFSET(vq) \
|
||||
(SVE_PT_REGS_OFFSET + __SVE_FFR_OFFSET(vq))
|
||||
|
||||
@ -240,10 +241,12 @@ struct user_sve_header {
|
||||
- SVE_PT_SVE_OFFSET + (__SVE_VQ_BYTES - 1)) \
|
||||
/ __SVE_VQ_BYTES * __SVE_VQ_BYTES)
|
||||
|
||||
#define SVE_PT_SIZE(vq, flags) \
|
||||
(((flags) & SVE_PT_REGS_MASK) == SVE_PT_REGS_SVE ? \
|
||||
SVE_PT_SVE_OFFSET + SVE_PT_SVE_SIZE(vq, flags) \
|
||||
: SVE_PT_FPSIMD_OFFSET + SVE_PT_FPSIMD_SIZE(vq, flags))
|
||||
#define SVE_PT_SIZE(vq, flags) \
|
||||
(((flags) & SVE_PT_REGS_MASK) == SVE_PT_REGS_SVE ? \
|
||||
SVE_PT_SVE_OFFSET + SVE_PT_SVE_SIZE(vq, flags) \
|
||||
: ((((flags) & SVE_PT_REGS_MASK) == SVE_PT_REGS_FPSIMD ? \
|
||||
SVE_PT_FPSIMD_OFFSET + SVE_PT_FPSIMD_SIZE(vq, flags) \
|
||||
: SVE_PT_REGS_OFFSET)))
|
||||
|
||||
/* pointer authentication masks (NT_ARM_PAC_MASK) */
|
||||
|
||||
@ -265,6 +268,62 @@ struct user_pac_generic_keys {
|
||||
__uint128_t apgakey;
|
||||
};
|
||||
|
||||
/* ZA state (NT_ARM_ZA) */
|
||||
|
||||
struct user_za_header {
	__u32 size;	/* total meaningful regset content in bytes */
	__u32 max_size;	/* maximum possible size for this thread */
	__u16 vl;	/* current vector length */
	__u16 max_vl;	/* maximum possible vector length */
	__u16 flags;
	__u16 __reserved;
};
|
||||
|
||||
/*
|
||||
* Common ZA_PT_* flags:
|
||||
* These must be kept in sync with prctl interface in <linux/prctl.h>
|
||||
*/
|
||||
#define ZA_PT_VL_INHERIT ((1 << 17) /* PR_SME_VL_INHERIT */ >> 16)
|
||||
#define ZA_PT_VL_ONEXEC ((1 << 18) /* PR_SME_SET_VL_ONEXEC */ >> 16)
|
||||
|
||||
|
||||
/*
|
||||
* The remainder of the ZA state follows struct user_za_header. The
|
||||
* total size of the ZA state (including header) depends on the
|
||||
* metadata in the header: ZA_PT_SIZE(vq, flags) gives the total size
|
||||
* of the state in bytes, including the header.
|
||||
*
|
||||
* Refer to <asm/sigcontext.h> for details of how to pass the correct
|
||||
* "vq" argument to these macros.
|
||||
*/
|
||||
|
||||
/* Offset from the start of struct user_za_header to the register data */
#define ZA_PT_ZA_OFFSET						\
	((sizeof(struct user_za_header) + (__SVE_VQ_BYTES - 1))	\
		/ __SVE_VQ_BYTES * __SVE_VQ_BYTES)

/*
 * The payload starts at offset ZA_PT_ZA_OFFSET, and is of size
 * ZA_PT_ZA_SIZE(vq, flags).
 *
 * The ZA array is stored as a sequence of horizontal vectors ZAV of SVL/8
 * bytes each, starting from vector 0.
 *
 * Additional data might be appended in the future.
 *
 * The ZA matrix is represented in memory in an endianness-invariant layout
 * which differs from the layout used for the FPSIMD V-registers on big-endian
 * systems: see sigcontext.h for more explanation.
 */

#define ZA_PT_ZAV_OFFSET(vq, n) \
	(ZA_PT_ZA_OFFSET + ((vq * __SVE_VQ_BYTES) * n))

#define ZA_PT_ZA_SIZE(vq) ((vq * __SVE_VQ_BYTES) * (vq * __SVE_VQ_BYTES))

#define ZA_PT_SIZE(vq) \
	(ZA_PT_ZA_OFFSET + ZA_PT_ZA_SIZE(vq))
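
A minimal sketch of how a ptrace user (e.g. a debugger) might index the NT_ARM_ZA payload with these macros. The helper za_row() is hypothetical; it assumes <asm/ptrace.h> and <asm/sigcontext.h> from this series, a buffer already holding the fetched regset, and that ZA was live (header size > ZA_PT_ZA_OFFSET):

static const void *za_row(const void *regset_buf, unsigned int n)
{
	const struct user_za_header *hdr = regset_buf;
	unsigned int vq = sve_vq_from_vl(hdr->vl);	/* rows are vq * 16 bytes each */

	return (const char *)regset_buf + ZA_PT_ZAV_OFFSET(vq, n);
}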
|
||||
|
||||
#endif /* __ASSEMBLY__ */
|
||||
|
||||
#endif /* _UAPI__ASM_PTRACE_H */
|
||||
|
@ -132,6 +132,17 @@ struct extra_context {
|
||||
#define SVE_MAGIC 0x53564501
|
||||
|
||||
struct sve_context {
|
||||
struct _aarch64_ctx head;
|
||||
__u16 vl;
|
||||
__u16 flags;
|
||||
__u16 __reserved[2];
|
||||
};
|
||||
|
||||
#define SVE_SIG_FLAG_SM 0x1 /* Context describes streaming mode */
|
||||
|
||||
#define ZA_MAGIC 0x54366345
|
||||
|
||||
struct za_context {
|
||||
struct _aarch64_ctx head;
|
||||
__u16 vl;
|
||||
__u16 __reserved[3];
|
||||
@@ -186,9 +197,16 @@ struct sve_context {
 * sve_context.vl must equal the thread's current vector length when
 * doing a sigreturn.
 *
 * On systems with support for SME the SVE register state may reflect either
 * streaming or non-streaming mode. In streaming mode the streaming mode
 * vector length will be used and the flag SVE_SIG_FLAG_SM will be set in
 * the flags field. It is permitted to enter or leave streaming mode in
 * a signal return; applications should take care to ensure that any difference
 * in vector length between the two modes is handled, including any resizing
 * and movement of context blocks.
 *
 * Note: for all these macros, the "vq" argument denotes the SVE
 * vector length in quadwords (i.e., units of 128 bits).
 * Note: for all these macros, the "vq" argument denotes the vector length
 * in quadwords (i.e., units of 128 bits).
 *
 * The correct way to obtain vq is to use sve_vq_from_vl(vl). The
 * result is valid if and only if sve_vl_valid(vl) is true. This is
|
||||
@ -249,4 +267,37 @@ struct sve_context {
|
||||
#define SVE_SIG_CONTEXT_SIZE(vq) \
|
||||
(SVE_SIG_REGS_OFFSET + SVE_SIG_REGS_SIZE(vq))
|
||||
|
||||
/*
 * If the ZA register is enabled for the thread at signal delivery then,
 * za_context.head.size >= ZA_SIG_CONTEXT_SIZE(sve_vq_from_vl(za_context.vl))
 * and the register data may be accessed using the ZA_SIG_*() macros.
 *
 * If za_context.head.size < ZA_SIG_CONTEXT_SIZE(sve_vq_from_vl(za_context.vl))
 * then the ZA register was not enabled for the thread and no register data
 * was included, in which case the ZA_SIG_*() macros should not be used
 * except for this check.
 *
 * The same convention applies when returning from a signal: a caller
 * will need to remove or resize the za_context block if it wants to
 * enable the ZA register when it was previously non-live or vice-versa.
 * This may require the caller to allocate fresh memory and/or move other
 * context blocks in the signal frame.
 *
 * Changing the vector length during signal return is not permitted:
 * za_context.vl must equal the thread's current SME vector length when
 * doing a sigreturn.
 */
|
||||
|
||||
#define ZA_SIG_REGS_OFFSET \
|
||||
((sizeof(struct za_context) + (__SVE_VQ_BYTES - 1)) \
|
||||
/ __SVE_VQ_BYTES * __SVE_VQ_BYTES)
|
||||
|
||||
#define ZA_SIG_REGS_SIZE(vq) ((vq * __SVE_VQ_BYTES) * (vq * __SVE_VQ_BYTES))
|
||||
|
||||
#define ZA_SIG_ZAV_OFFSET(vq, n) (ZA_SIG_REGS_OFFSET + \
|
||||
(SVE_SIG_ZREG_SIZE(vq) * n))
|
||||
|
||||
#define ZA_SIG_CONTEXT_SIZE(vq) \
|
||||
(ZA_SIG_REGS_OFFSET + ZA_SIG_REGS_SIZE(vq))
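
A minimal sketch of the liveness check described above, as a signal handler might apply it once it has located the za_context block in the frame. The helper za_is_live() is hypothetical and not part of the patch:

static int za_is_live(const struct za_context *za)
{
	unsigned int vq = sve_vq_from_vl(za->vl);

	/* Register data is only present when the block is large enough. */
	return za->head.size >= ZA_SIG_CONTEXT_SIZE(vq);
}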
|
||||
|
||||
#endif /* _UAPI__ASM_SIGCONTEXT_H */
|
||||
|
@ -215,7 +215,7 @@ static const struct arm64_cpu_capabilities arm64_repeat_tlbi_list[] = {
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_CAVIUM_ERRATUM_23154
|
||||
const struct midr_range cavium_erratum_23154_cpus[] = {
|
||||
static const struct midr_range cavium_erratum_23154_cpus[] = {
|
||||
MIDR_ALL_VERSIONS(MIDR_THUNDERX),
|
||||
MIDR_ALL_VERSIONS(MIDR_THUNDERX_81XX),
|
||||
MIDR_ALL_VERSIONS(MIDR_THUNDERX_83XX),
|
||||
|
@ -261,6 +261,8 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
|
||||
};
|
||||
|
||||
static const struct arm64_ftr_bits ftr_id_aa64pfr1[] = {
|
||||
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
|
||||
FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_SME_SHIFT, 4, 0),
|
||||
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_MPAMFRAC_SHIFT, 4, 0),
|
||||
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_RASFRAC_SHIFT, 4, 0),
|
||||
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_MTE),
|
||||
@ -293,6 +295,24 @@ static const struct arm64_ftr_bits ftr_id_aa64zfr0[] = {
|
||||
ARM64_FTR_END,
|
||||
};
|
||||
|
||||
static const struct arm64_ftr_bits ftr_id_aa64smfr0[] = {
|
||||
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
|
||||
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_FA64_SHIFT, 1, 0),
|
||||
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
|
||||
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_I16I64_SHIFT, 4, 0),
|
||||
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
|
||||
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_F64F64_SHIFT, 1, 0),
|
||||
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
|
||||
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_I8I32_SHIFT, 4, 0),
|
||||
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
|
||||
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_F16F32_SHIFT, 1, 0),
|
||||
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
|
||||
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_B16F32_SHIFT, 1, 0),
|
||||
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
|
||||
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_F32F32_SHIFT, 1, 0),
|
||||
ARM64_FTR_END,
|
||||
};
|
||||
|
||||
static const struct arm64_ftr_bits ftr_id_aa64mmfr0[] = {
|
||||
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR0_ECV_SHIFT, 4, 0),
|
||||
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR0_FGT_SHIFT, 4, 0),
|
||||
@ -561,6 +581,12 @@ static const struct arm64_ftr_bits ftr_zcr[] = {
|
||||
ARM64_FTR_END,
|
||||
};
|
||||
|
||||
static const struct arm64_ftr_bits ftr_smcr[] = {
|
||||
ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE,
|
||||
SMCR_ELx_LEN_SHIFT, SMCR_ELx_LEN_SIZE, 0), /* LEN */
|
||||
ARM64_FTR_END,
|
||||
};
|
||||
|
||||
/*
|
||||
* Common ftr bits for a 32bit register with all hidden, strict
|
||||
* attributes, with 4bit feature fields and a default safe value of
|
||||
@ -645,6 +671,7 @@ static const struct __ftr_reg_entry {
|
||||
ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64PFR1_EL1, ftr_id_aa64pfr1,
|
||||
&id_aa64pfr1_override),
|
||||
ARM64_FTR_REG(SYS_ID_AA64ZFR0_EL1, ftr_id_aa64zfr0),
|
||||
ARM64_FTR_REG(SYS_ID_AA64SMFR0_EL1, ftr_id_aa64smfr0),
|
||||
|
||||
/* Op1 = 0, CRn = 0, CRm = 5 */
|
||||
ARM64_FTR_REG(SYS_ID_AA64DFR0_EL1, ftr_id_aa64dfr0),
|
||||
@ -666,6 +693,7 @@ static const struct __ftr_reg_entry {
|
||||
|
||||
/* Op1 = 0, CRn = 1, CRm = 2 */
|
||||
ARM64_FTR_REG(SYS_ZCR_EL1, ftr_zcr),
|
||||
ARM64_FTR_REG(SYS_SMCR_EL1, ftr_smcr),
|
||||
|
||||
/* Op1 = 1, CRn = 0, CRm = 0 */
|
||||
ARM64_FTR_REG(SYS_GMID_EL1, ftr_gmid),
|
||||
@ -960,6 +988,7 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)
|
||||
init_cpu_ftr_reg(SYS_ID_AA64PFR0_EL1, info->reg_id_aa64pfr0);
|
||||
init_cpu_ftr_reg(SYS_ID_AA64PFR1_EL1, info->reg_id_aa64pfr1);
|
||||
init_cpu_ftr_reg(SYS_ID_AA64ZFR0_EL1, info->reg_id_aa64zfr0);
|
||||
init_cpu_ftr_reg(SYS_ID_AA64SMFR0_EL1, info->reg_id_aa64smfr0);
|
||||
|
||||
if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0))
|
||||
init_32bit_cpu_features(&info->aarch32);
|
||||
@ -969,6 +998,12 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)
|
||||
vec_init_vq_map(ARM64_VEC_SVE);
|
||||
}
|
||||
|
||||
if (id_aa64pfr1_sme(info->reg_id_aa64pfr1)) {
|
||||
init_cpu_ftr_reg(SYS_SMCR_EL1, info->reg_smcr);
|
||||
if (IS_ENABLED(CONFIG_ARM64_SME))
|
||||
vec_init_vq_map(ARM64_VEC_SME);
|
||||
}
|
||||
|
||||
if (id_aa64pfr1_mte(info->reg_id_aa64pfr1))
|
||||
init_cpu_ftr_reg(SYS_GMID_EL1, info->reg_gmid);
|
||||
|
||||
@ -1195,6 +1230,9 @@ void update_cpu_features(int cpu,
|
||||
taint |= check_update_ftr_reg(SYS_ID_AA64ZFR0_EL1, cpu,
|
||||
info->reg_id_aa64zfr0, boot->reg_id_aa64zfr0);
|
||||
|
||||
taint |= check_update_ftr_reg(SYS_ID_AA64SMFR0_EL1, cpu,
|
||||
info->reg_id_aa64smfr0, boot->reg_id_aa64smfr0);
|
||||
|
||||
if (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) {
|
||||
taint |= check_update_ftr_reg(SYS_ZCR_EL1, cpu,
|
||||
info->reg_zcr, boot->reg_zcr);
|
||||
@ -1205,6 +1243,16 @@ void update_cpu_features(int cpu,
|
||||
vec_update_vq_map(ARM64_VEC_SVE);
|
||||
}
|
||||
|
||||
if (id_aa64pfr1_sme(info->reg_id_aa64pfr1)) {
|
||||
taint |= check_update_ftr_reg(SYS_SMCR_EL1, cpu,
|
||||
info->reg_smcr, boot->reg_smcr);
|
||||
|
||||
/* Probe vector lengths, unless we already gave up on SME */
|
||||
if (id_aa64pfr1_sme(read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1)) &&
|
||||
!system_capabilities_finalized())
|
||||
vec_update_vq_map(ARM64_VEC_SME);
|
||||
}
|
||||
|
||||
/*
|
||||
* The kernel uses the LDGM/STGM instructions and the number of tags
|
||||
* they read/write depends on the GMID_EL1.BS field. Check that the
|
||||
@ -1288,6 +1336,7 @@ u64 __read_sysreg_by_encoding(u32 sys_id)
|
||||
read_sysreg_case(SYS_ID_AA64PFR0_EL1);
|
||||
read_sysreg_case(SYS_ID_AA64PFR1_EL1);
|
||||
read_sysreg_case(SYS_ID_AA64ZFR0_EL1);
|
||||
read_sysreg_case(SYS_ID_AA64SMFR0_EL1);
|
||||
read_sysreg_case(SYS_ID_AA64DFR0_EL1);
|
||||
read_sysreg_case(SYS_ID_AA64DFR1_EL1);
|
||||
read_sysreg_case(SYS_ID_AA64MMFR0_EL1);
|
||||
@ -2442,6 +2491,33 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
|
||||
.matches = has_cpuid_feature,
|
||||
.min_field_value = 1,
|
||||
},
|
||||
#ifdef CONFIG_ARM64_SME
|
||||
{
|
||||
.desc = "Scalable Matrix Extension",
|
||||
.type = ARM64_CPUCAP_SYSTEM_FEATURE,
|
||||
.capability = ARM64_SME,
|
||||
.sys_reg = SYS_ID_AA64PFR1_EL1,
|
||||
.sign = FTR_UNSIGNED,
|
||||
.field_pos = ID_AA64PFR1_SME_SHIFT,
|
||||
.field_width = 4,
|
||||
.min_field_value = ID_AA64PFR1_SME,
|
||||
.matches = has_cpuid_feature,
|
||||
.cpu_enable = sme_kernel_enable,
|
||||
},
|
||||
/* FA64 should be sorted after the base SME capability */
|
||||
{
|
||||
.desc = "FA64",
|
||||
.type = ARM64_CPUCAP_SYSTEM_FEATURE,
|
||||
.capability = ARM64_SME_FA64,
|
||||
.sys_reg = SYS_ID_AA64SMFR0_EL1,
|
||||
.sign = FTR_UNSIGNED,
|
||||
.field_pos = ID_AA64SMFR0_FA64_SHIFT,
|
||||
.field_width = 1,
|
||||
.min_field_value = ID_AA64SMFR0_FA64,
|
||||
.matches = has_cpuid_feature,
|
||||
.cpu_enable = fa64_kernel_enable,
|
||||
},
|
||||
#endif /* CONFIG_ARM64_SME */
|
||||
{},
|
||||
};
|
||||
|
||||
@ -2575,6 +2651,16 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
|
||||
HWCAP_CAP(SYS_ID_AA64MMFR0_EL1, ID_AA64MMFR0_ECV_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ECV),
|
||||
HWCAP_CAP(SYS_ID_AA64MMFR1_EL1, ID_AA64MMFR1_AFP_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_AFP),
|
||||
HWCAP_CAP(SYS_ID_AA64ISAR2_EL1, ID_AA64ISAR2_RPRES_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_RPRES),
|
||||
#ifdef CONFIG_ARM64_SME
|
||||
HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_SME_SHIFT, 4, FTR_UNSIGNED, ID_AA64PFR1_SME, CAP_HWCAP, KERNEL_HWCAP_SME),
|
||||
HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_FA64_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_FA64, CAP_HWCAP, KERNEL_HWCAP_SME_FA64),
|
||||
HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_I16I64_SHIFT, 4, FTR_UNSIGNED, ID_AA64SMFR0_I16I64, CAP_HWCAP, KERNEL_HWCAP_SME_I16I64),
|
||||
HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_F64F64_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_F64F64, CAP_HWCAP, KERNEL_HWCAP_SME_F64F64),
|
||||
HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_I8I32_SHIFT, 4, FTR_UNSIGNED, ID_AA64SMFR0_I8I32, CAP_HWCAP, KERNEL_HWCAP_SME_I8I32),
|
||||
HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_F16F32_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_F16F32, CAP_HWCAP, KERNEL_HWCAP_SME_F16F32),
|
||||
HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_B16F32_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_B16F32, CAP_HWCAP, KERNEL_HWCAP_SME_B16F32),
|
||||
HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_F32F32_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_F32F32, CAP_HWCAP, KERNEL_HWCAP_SME_F32F32),
|
||||
#endif /* CONFIG_ARM64_SME */
|
||||
{},
|
||||
};
|
||||
|
||||
@ -2872,6 +2958,23 @@ static void verify_sve_features(void)
|
||||
/* Add checks on other ZCR bits here if necessary */
|
||||
}
|
||||
|
||||
static void verify_sme_features(void)
|
||||
{
|
||||
u64 safe_smcr = read_sanitised_ftr_reg(SYS_SMCR_EL1);
|
||||
u64 smcr = read_smcr_features();
|
||||
|
||||
unsigned int safe_len = safe_smcr & SMCR_ELx_LEN_MASK;
|
||||
unsigned int len = smcr & SMCR_ELx_LEN_MASK;
|
||||
|
||||
if (len < safe_len || vec_verify_vq_map(ARM64_VEC_SME)) {
|
||||
pr_crit("CPU%d: SME: vector length support mismatch\n",
|
||||
smp_processor_id());
|
||||
cpu_die_early();
|
||||
}
|
||||
|
||||
/* Add checks on other SMCR bits here if necessary */
|
||||
}
|
||||
|
||||
static void verify_hyp_capabilities(void)
|
||||
{
|
||||
u64 safe_mmfr1, mmfr0, mmfr1;
|
||||
@ -2924,6 +3027,9 @@ static void verify_local_cpu_capabilities(void)
|
||||
if (system_supports_sve())
|
||||
verify_sve_features();
|
||||
|
||||
if (system_supports_sme())
|
||||
verify_sme_features();
|
||||
|
||||
if (is_hyp_mode_available())
|
||||
verify_hyp_capabilities();
|
||||
}
|
||||
@ -3041,6 +3147,7 @@ void __init setup_cpu_features(void)
|
||||
pr_info("emulated: Privileged Access Never (PAN) using TTBR0_EL1 switching\n");
|
||||
|
||||
sve_setup();
|
||||
sme_setup();
|
||||
minsigstksz_setup();
|
||||
|
||||
/* Advertise that we have computed the system capabilities */
|
||||
|
@ -98,6 +98,14 @@ static const char *const hwcap_str[] = {
|
||||
[KERNEL_HWCAP_AFP] = "afp",
|
||||
[KERNEL_HWCAP_RPRES] = "rpres",
|
||||
[KERNEL_HWCAP_MTE3] = "mte3",
|
||||
[KERNEL_HWCAP_SME] = "sme",
|
||||
[KERNEL_HWCAP_SME_I16I64] = "smei16i64",
|
||||
[KERNEL_HWCAP_SME_F64F64] = "smef64f64",
|
||||
[KERNEL_HWCAP_SME_I8I32] = "smei8i32",
|
||||
[KERNEL_HWCAP_SME_F16F32] = "smef16f32",
|
||||
[KERNEL_HWCAP_SME_B16F32] = "smeb16f32",
|
||||
[KERNEL_HWCAP_SME_F32F32] = "smef32f32",
|
||||
[KERNEL_HWCAP_SME_FA64] = "smefa64",
|
||||
};
|
||||
|
||||
#ifdef CONFIG_COMPAT
|
||||
@ -401,6 +409,7 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
|
||||
info->reg_id_aa64pfr0 = read_cpuid(ID_AA64PFR0_EL1);
|
||||
info->reg_id_aa64pfr1 = read_cpuid(ID_AA64PFR1_EL1);
|
||||
info->reg_id_aa64zfr0 = read_cpuid(ID_AA64ZFR0_EL1);
|
||||
info->reg_id_aa64smfr0 = read_cpuid(ID_AA64SMFR0_EL1);
|
||||
|
||||
if (id_aa64pfr1_mte(info->reg_id_aa64pfr1))
|
||||
info->reg_gmid = read_cpuid(GMID_EL1);
|
||||
@ -412,6 +421,10 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
|
||||
id_aa64pfr0_sve(info->reg_id_aa64pfr0))
|
||||
info->reg_zcr = read_zcr_features();
|
||||
|
||||
if (IS_ENABLED(CONFIG_ARM64_SME) &&
|
||||
id_aa64pfr1_sme(info->reg_id_aa64pfr1))
|
||||
info->reg_smcr = read_smcr_features();
|
||||
|
||||
cpuinfo_detect_icache_policy(info);
|
||||
}
|
||||
|
||||
|
@ -537,6 +537,14 @@ static void noinstr el0_sve_acc(struct pt_regs *regs, unsigned long esr)
|
||||
exit_to_user_mode(regs);
|
||||
}
|
||||
|
||||
static void noinstr el0_sme_acc(struct pt_regs *regs, unsigned long esr)
|
||||
{
|
||||
enter_from_user_mode(regs);
|
||||
local_daif_restore(DAIF_PROCCTX);
|
||||
do_sme_acc(esr, regs);
|
||||
exit_to_user_mode(regs);
|
||||
}
|
||||
|
||||
static void noinstr el0_fpsimd_exc(struct pt_regs *regs, unsigned long esr)
|
||||
{
|
||||
enter_from_user_mode(regs);
|
||||
@ -645,6 +653,9 @@ asmlinkage void noinstr el0t_64_sync_handler(struct pt_regs *regs)
|
||||
case ESR_ELx_EC_SVE:
|
||||
el0_sve_acc(regs, esr);
|
||||
break;
|
||||
case ESR_ELx_EC_SME:
|
||||
el0_sme_acc(regs, esr);
|
||||
break;
|
||||
case ESR_ELx_EC_FP_EXC64:
|
||||
el0_fpsimd_exc(regs, esr);
|
||||
break;
|
||||
|
@ -86,3 +86,39 @@ SYM_FUNC_START(sve_flush_live)
|
||||
SYM_FUNC_END(sve_flush_live)
|
||||
|
||||
#endif /* CONFIG_ARM64_SVE */
|
||||
|
||||
#ifdef CONFIG_ARM64_SME
|
||||
|
||||
SYM_FUNC_START(sme_get_vl)
|
||||
_sme_rdsvl 0, 1
|
||||
ret
|
||||
SYM_FUNC_END(sme_get_vl)
|
||||
|
||||
SYM_FUNC_START(sme_set_vq)
|
||||
sme_load_vq x0, x1, x2
|
||||
ret
|
||||
SYM_FUNC_END(sme_set_vq)
|
||||
|
||||
/*
|
||||
* Save the SME state
|
||||
*
|
||||
* x0 - pointer to buffer for state
|
||||
*/
|
||||
SYM_FUNC_START(za_save_state)
|
||||
_sme_rdsvl 1, 1 // x1 = VL/8
|
||||
sme_save_za 0, x1, 12
|
||||
ret
|
||||
SYM_FUNC_END(za_save_state)
|
||||
|
||||
/*
|
||||
* Load the SME state
|
||||
*
|
||||
* x0 - pointer to buffer for state
|
||||
*/
|
||||
SYM_FUNC_START(za_load_state)
|
||||
_sme_rdsvl 1, 1 // x1 = VL/8
|
||||
sme_load_za 0, x1, 12
|
||||
ret
|
||||
SYM_FUNC_END(za_load_state)
|
||||
|
||||
#endif /* CONFIG_ARM64_SME */
|
||||
|
@ -97,12 +97,6 @@ SYM_CODE_START(ftrace_common)
|
||||
SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
|
||||
bl ftrace_stub
|
||||
|
||||
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
|
||||
SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL) // ftrace_graph_caller();
|
||||
nop // If enabled, this will be replaced
|
||||
// "b ftrace_graph_caller"
|
||||
#endif
|
||||
|
||||
/*
|
||||
* At the callsite x0-x8 and x19-x30 were live. Any C code will have preserved
|
||||
* x19-x29 per the AAPCS, and we created frame records upon entry, so we need
|
||||
@ -127,17 +121,6 @@ ftrace_common_return:
|
||||
ret x9
|
||||
SYM_CODE_END(ftrace_common)
|
||||
|
||||
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
|
||||
SYM_CODE_START(ftrace_graph_caller)
|
||||
ldr x0, [sp, #S_PC]
|
||||
sub x0, x0, #AARCH64_INSN_SIZE // ip (callsite's BL insn)
|
||||
add x1, sp, #S_LR // parent_ip (callsite's LR)
|
||||
ldr x2, [sp, #PT_REGS_SIZE] // parent fp (callsite's FP)
|
||||
bl prepare_ftrace_return
|
||||
b ftrace_common_return
|
||||
SYM_CODE_END(ftrace_graph_caller)
|
||||
#endif
|
||||
|
||||
#else /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
|
||||
|
||||
/*
|
||||
|
@ -121,7 +121,10 @@
|
||||
struct fpsimd_last_state_struct {
|
||||
struct user_fpsimd_state *st;
|
||||
void *sve_state;
|
||||
void *za_state;
|
||||
u64 *svcr;
|
||||
unsigned int sve_vl;
|
||||
unsigned int sme_vl;
|
||||
};
|
||||
|
||||
static DEFINE_PER_CPU(struct fpsimd_last_state_struct, fpsimd_last_state);
|
||||
@ -136,6 +139,12 @@ __ro_after_init struct vl_info vl_info[ARM64_VEC_MAX] = {
|
||||
.max_virtualisable_vl = SVE_VL_MIN,
|
||||
},
|
||||
#endif
|
||||
#ifdef CONFIG_ARM64_SME
|
||||
[ARM64_VEC_SME] = {
|
||||
.type = ARM64_VEC_SME,
|
||||
.name = "SME",
|
||||
},
|
||||
#endif
|
||||
};
|
||||
|
||||
static unsigned int vec_vl_inherit_flag(enum vec_type type)
|
||||
@ -143,6 +152,8 @@ static unsigned int vec_vl_inherit_flag(enum vec_type type)
|
||||
switch (type) {
|
||||
case ARM64_VEC_SVE:
|
||||
return TIF_SVE_VL_INHERIT;
|
||||
case ARM64_VEC_SME:
|
||||
return TIF_SME_VL_INHERIT;
|
||||
default:
|
||||
WARN_ON_ONCE(1);
|
||||
return 0;
|
||||
@ -186,6 +197,26 @@ extern void __percpu *efi_sve_state;
|
||||
|
||||
#endif /* ! CONFIG_ARM64_SVE */
|
||||
|
||||
#ifdef CONFIG_ARM64_SME
|
||||
|
||||
static int get_sme_default_vl(void)
|
||||
{
|
||||
return get_default_vl(ARM64_VEC_SME);
|
||||
}
|
||||
|
||||
static void set_sme_default_vl(int val)
|
||||
{
|
||||
set_default_vl(ARM64_VEC_SME, val);
|
||||
}
|
||||
|
||||
static void sme_free(struct task_struct *);
|
||||
|
||||
#else
|
||||
|
||||
static inline void sme_free(struct task_struct *t) { }
|
||||
|
||||
#endif
|
||||
|
||||
DEFINE_PER_CPU(bool, fpsimd_context_busy);
|
||||
EXPORT_PER_CPU_SYMBOL(fpsimd_context_busy);
|
||||
|
||||
@ -206,10 +237,19 @@ static void __get_cpu_fpsimd_context(void)
|
||||
*
|
||||
* The double-underscore version must only be called if you know the task
|
||||
* can't be preempted.
|
||||
*
|
||||
* On RT kernels local_bh_disable() is not sufficient because it only
|
||||
* serializes soft interrupt related sections via a local lock, but stays
|
||||
* preemptible. Disabling preemption is the right choice here as bottom
|
||||
* half processing is always in thread context on RT kernels so it
|
||||
* implicitly prevents bottom half processing as well.
|
||||
*/
|
||||
static void get_cpu_fpsimd_context(void)
|
||||
{
|
||||
local_bh_disable();
|
||||
if (!IS_ENABLED(CONFIG_PREEMPT_RT))
|
||||
local_bh_disable();
|
||||
else
|
||||
preempt_disable();
|
||||
__get_cpu_fpsimd_context();
|
||||
}
|
||||
|
||||
@ -230,7 +270,10 @@ static void __put_cpu_fpsimd_context(void)
|
||||
static void put_cpu_fpsimd_context(void)
|
||||
{
|
||||
__put_cpu_fpsimd_context();
|
||||
local_bh_enable();
|
||||
if (!IS_ENABLED(CONFIG_PREEMPT_RT))
|
||||
local_bh_enable();
|
||||
else
|
||||
preempt_enable();
|
||||
}
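
For reference, callers in this file bracket any access to the live FPSIMD/SVE/SME register state with these helpers so the state cannot change underneath them, on both RT and non-RT kernels. A sketch mirroring the existing fpsimd_preserve_current_state() pattern; example_preserve_current_state() is not a real function:

static void example_preserve_current_state(void)
{
	get_cpu_fpsimd_context();
	fpsimd_save();		/* write the CPU registers back to current's thread_struct */
	put_cpu_fpsimd_context();
}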
|
||||
|
||||
static bool have_cpu_fpsimd_context(void)
|
||||
@ -238,23 +281,6 @@ static bool have_cpu_fpsimd_context(void)
|
||||
return !preemptible() && __this_cpu_read(fpsimd_context_busy);
|
||||
}
|
||||
|
||||
/*
|
||||
* Call __sve_free() directly only if you know task can't be scheduled
|
||||
* or preempted.
|
||||
*/
|
||||
static void __sve_free(struct task_struct *task)
|
||||
{
|
||||
kfree(task->thread.sve_state);
|
||||
task->thread.sve_state = NULL;
|
||||
}
|
||||
|
||||
static void sve_free(struct task_struct *task)
|
||||
{
|
||||
WARN_ON(test_tsk_thread_flag(task, TIF_SVE));
|
||||
|
||||
__sve_free(task);
|
||||
}
|
||||
|
||||
unsigned int task_get_vl(const struct task_struct *task, enum vec_type type)
|
||||
{
|
||||
return task->thread.vl[type];
|
||||
@ -278,17 +304,28 @@ void task_set_vl_onexec(struct task_struct *task, enum vec_type type,
|
||||
task->thread.vl_onexec[type] = vl;
|
||||
}
|
||||
|
||||
/*
 * TIF_SME controls whether a task can use SME without trapping while
 * in userspace. When TIF_SME is set we must have storage allocated in
 * sve_state and za_state to store the contents of both ZA and the SVE
 * registers for both streaming and non-streaming modes.
 *
 * If both SVCR.ZA and SVCR.SM are disabled then at any point we
 * may disable TIF_SME and reenable traps.
 */
|
||||
|
||||
|
||||
/*
|
||||
* TIF_SVE controls whether a task can use SVE without trapping while
|
||||
* in userspace, and also the way a task's FPSIMD/SVE state is stored
|
||||
* in thread_struct.
|
||||
* in userspace, and also (together with TIF_SME) the way a task's
|
||||
* FPSIMD/SVE state is stored in thread_struct.
|
||||
*
|
||||
* The kernel uses this flag to track whether a user task is actively
|
||||
* using SVE, and therefore whether full SVE register state needs to
|
||||
* be tracked. If not, the cheaper FPSIMD context handling code can
|
||||
* be used instead of the more costly SVE equivalents.
|
||||
*
|
||||
* * TIF_SVE set:
|
||||
* * TIF_SVE or SVCR.SM set:
|
||||
*
|
||||
* The task can execute SVE instructions while in userspace without
|
||||
* trapping to the kernel.
|
||||
@@ -296,7 +333,8 @@ void task_set_vl_onexec(struct task_struct *task, enum vec_type type,
 * When stored, Z0-Z31 (incorporating Vn in bits[127:0] or the
 * corresponding Zn), P0-P15 and FFR are encoded in
 * task->thread.sve_state, formatted appropriately for vector
 * length task->thread.sve_vl.
 * length task->thread.sve_vl or, if SVCR.SM is set,
 * task->thread.sme_vl.
|
||||
*
|
||||
* task->thread.sve_state must point to a valid buffer at least
|
||||
* sve_state_size(task) bytes in size.
|
||||
@ -334,16 +372,44 @@ void task_set_vl_onexec(struct task_struct *task, enum vec_type type,
|
||||
*/
|
||||
static void task_fpsimd_load(void)
|
||||
{
|
||||
bool restore_sve_regs = false;
|
||||
bool restore_ffr;
|
||||
|
||||
WARN_ON(!system_supports_fpsimd());
|
||||
WARN_ON(!have_cpu_fpsimd_context());
|
||||
|
||||
/* Check if we should restore SVE first */
|
||||
if (IS_ENABLED(CONFIG_ARM64_SVE) && test_thread_flag(TIF_SVE)) {
|
||||
sve_set_vq(sve_vq_from_vl(task_get_sve_vl(current)) - 1);
|
||||
sve_load_state(sve_pffr(¤t->thread),
|
||||
¤t->thread.uw.fpsimd_state.fpsr, true);
|
||||
} else {
|
||||
fpsimd_load_state(¤t->thread.uw.fpsimd_state);
|
||||
restore_sve_regs = true;
|
||||
restore_ffr = true;
|
||||
}
|
||||
|
||||
/* Restore SME, override SVE register configuration if needed */
|
||||
if (system_supports_sme()) {
|
||||
unsigned long sme_vl = task_get_sme_vl(current);
|
||||
|
||||
/* Ensure VL is set up for restoring data */
|
||||
if (test_thread_flag(TIF_SME))
|
||||
sme_set_vq(sve_vq_from_vl(sme_vl) - 1);
|
||||
|
||||
write_sysreg_s(current->thread.svcr, SYS_SVCR_EL0);
|
||||
|
||||
if (thread_za_enabled(¤t->thread))
|
||||
za_load_state(current->thread.za_state);
|
||||
|
||||
if (thread_sm_enabled(¤t->thread)) {
|
||||
restore_sve_regs = true;
|
||||
restore_ffr = system_supports_fa64();
|
||||
}
|
||||
}
|
||||
|
||||
if (restore_sve_regs)
|
||||
sve_load_state(sve_pffr(¤t->thread),
|
||||
¤t->thread.uw.fpsimd_state.fpsr,
|
||||
restore_ffr);
|
||||
else
|
||||
fpsimd_load_state(¤t->thread.uw.fpsimd_state);
|
||||
}
|
||||
|
||||
/*
|
||||
@ -361,6 +427,9 @@ static void fpsimd_save(void)
|
||||
struct fpsimd_last_state_struct const *last =
|
||||
this_cpu_ptr(&fpsimd_last_state);
|
||||
/* set by fpsimd_bind_task_to_cpu() or fpsimd_bind_state_to_cpu() */
|
||||
bool save_sve_regs = false;
|
||||
bool save_ffr;
|
||||
unsigned int vl;
|
||||
|
||||
WARN_ON(!system_supports_fpsimd());
|
||||
WARN_ON(!have_cpu_fpsimd_context());
|
||||
@ -368,9 +437,32 @@ static void fpsimd_save(void)
|
||||
if (test_thread_flag(TIF_FOREIGN_FPSTATE))
|
||||
return;
|
||||
|
||||
if (IS_ENABLED(CONFIG_ARM64_SVE) &&
|
||||
test_thread_flag(TIF_SVE)) {
|
||||
if (WARN_ON(sve_get_vl() != last->sve_vl)) {
|
||||
if (test_thread_flag(TIF_SVE)) {
|
||||
save_sve_regs = true;
|
||||
save_ffr = true;
|
||||
vl = last->sve_vl;
|
||||
}
|
||||
|
||||
if (system_supports_sme()) {
|
||||
u64 *svcr = last->svcr;
|
||||
*svcr = read_sysreg_s(SYS_SVCR_EL0);
|
||||
|
||||
*svcr = read_sysreg_s(SYS_SVCR_EL0);
|
||||
|
||||
if (*svcr & SYS_SVCR_EL0_ZA_MASK)
|
||||
za_save_state(last->za_state);
|
||||
|
||||
/* If we are in streaming mode override regular SVE. */
|
||||
if (*svcr & SYS_SVCR_EL0_SM_MASK) {
|
||||
save_sve_regs = true;
|
||||
save_ffr = system_supports_fa64();
|
||||
vl = last->sme_vl;
|
||||
}
|
||||
}
|
||||
|
||||
if (IS_ENABLED(CONFIG_ARM64_SVE) && save_sve_regs) {
|
||||
/* Get the configured VL from RDVL, will account for SM */
|
||||
if (WARN_ON(sve_get_vl() != vl)) {
|
||||
/*
|
||||
* Can't save the user regs, so current would
|
||||
* re-enter user with corrupt state.
|
||||
@ -381,8 +473,8 @@ static void fpsimd_save(void)
|
||||
}
|
||||
|
||||
sve_save_state((char *)last->sve_state +
|
||||
sve_ffr_offset(last->sve_vl),
|
||||
&last->st->fpsr, true);
|
||||
sve_ffr_offset(vl),
|
||||
&last->st->fpsr, save_ffr);
|
||||
} else {
|
||||
fpsimd_save_state(last->st);
|
||||
}
|
||||
@ -409,6 +501,8 @@ static unsigned int find_supported_vector_length(enum vec_type type,
|
||||
|
||||
if (vl > max_vl)
|
||||
vl = max_vl;
|
||||
if (vl < info->min_vl)
|
||||
vl = info->min_vl;
|
||||
|
||||
bit = find_next_bit(info->vq_map, SVE_VQ_MAX,
|
||||
__vq_to_bit(sve_vq_from_vl(vl)));
|
||||
@ -467,6 +561,30 @@ static int __init sve_sysctl_init(void)
|
||||
static int __init sve_sysctl_init(void) { return 0; }
|
||||
#endif /* ! (CONFIG_ARM64_SVE && CONFIG_SYSCTL) */
|
||||
|
||||
#if defined(CONFIG_ARM64_SME) && defined(CONFIG_SYSCTL)
|
||||
static struct ctl_table sme_default_vl_table[] = {
|
||||
{
|
||||
.procname = "sme_default_vector_length",
|
||||
.mode = 0644,
|
||||
.proc_handler = vec_proc_do_default_vl,
|
||||
.extra1 = &vl_info[ARM64_VEC_SME],
|
||||
},
|
||||
{ }
|
||||
};
|
||||
|
||||
static int __init sme_sysctl_init(void)
|
||||
{
|
||||
if (system_supports_sme())
|
||||
if (!register_sysctl("abi", sme_default_vl_table))
|
||||
return -EINVAL;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
#else /* ! (CONFIG_ARM64_SME && CONFIG_SYSCTL) */
|
||||
static int __init sme_sysctl_init(void) { return 0; }
|
||||
#endif /* ! (CONFIG_ARM64_SME && CONFIG_SYSCTL) */
|
||||
|
||||
#define ZREG(sve_state, vq, n) ((char *)(sve_state) + \
|
||||
(SVE_SIG_ZREG_OFFSET(vq, n) - SVE_SIG_REGS_OFFSET))
|
||||
|
||||
@ -520,7 +638,7 @@ static void fpsimd_to_sve(struct task_struct *task)
|
||||
if (!system_supports_sve())
|
||||
return;
|
||||
|
||||
vq = sve_vq_from_vl(task_get_sve_vl(task));
|
||||
vq = sve_vq_from_vl(thread_get_cur_vl(&task->thread));
|
||||
__fpsimd_to_sve(sst, fst, vq);
|
||||
}
|
||||
|
||||
@ -537,7 +655,7 @@ static void fpsimd_to_sve(struct task_struct *task)
|
||||
*/
|
||||
static void sve_to_fpsimd(struct task_struct *task)
|
||||
{
|
||||
unsigned int vq;
|
||||
unsigned int vq, vl;
|
||||
void const *sst = task->thread.sve_state;
|
||||
struct user_fpsimd_state *fst = &task->thread.uw.fpsimd_state;
|
||||
unsigned int i;
|
||||
@ -546,7 +664,8 @@ static void sve_to_fpsimd(struct task_struct *task)
|
||||
if (!system_supports_sve())
|
||||
return;
|
||||
|
||||
vq = sve_vq_from_vl(task_get_sve_vl(task));
|
||||
vl = thread_get_cur_vl(&task->thread);
|
||||
vq = sve_vq_from_vl(vl);
|
||||
for (i = 0; i < SVE_NUM_ZREGS; ++i) {
|
||||
p = (__uint128_t const *)ZREG(sst, vq, i);
|
||||
fst->vregs[i] = arm64_le128_to_cpu(*p);
|
||||
@ -554,14 +673,37 @@ static void sve_to_fpsimd(struct task_struct *task)
|
||||
}
|
||||
|
||||
#ifdef CONFIG_ARM64_SVE
|
||||
/*
|
||||
* Call __sve_free() directly only if you know task can't be scheduled
|
||||
* or preempted.
|
||||
*/
|
||||
static void __sve_free(struct task_struct *task)
|
||||
{
|
||||
kfree(task->thread.sve_state);
|
||||
task->thread.sve_state = NULL;
|
||||
}
|
||||
|
||||
static void sve_free(struct task_struct *task)
|
||||
{
|
||||
WARN_ON(test_tsk_thread_flag(task, TIF_SVE));
|
||||
|
||||
__sve_free(task);
|
||||
}
|
||||
|
||||
/*
|
||||
* Return how many bytes of memory are required to store the full SVE
|
||||
* state for task, given task's currently configured vector length.
|
||||
*/
|
||||
static size_t sve_state_size(struct task_struct const *task)
|
||||
size_t sve_state_size(struct task_struct const *task)
|
||||
{
|
||||
return SVE_SIG_REGS_SIZE(sve_vq_from_vl(task_get_sve_vl(task)));
|
||||
unsigned int vl = 0;
|
||||
|
||||
if (system_supports_sve())
|
||||
vl = task_get_sve_vl(task);
|
||||
if (system_supports_sme())
|
||||
vl = max(vl, task_get_sme_vl(task));
|
||||
|
||||
return SVE_SIG_REGS_SIZE(sve_vq_from_vl(vl));
|
||||
}
|
||||
|
||||
/*
|
||||
@ -587,6 +729,19 @@ void sve_alloc(struct task_struct *task)
|
||||
}
|
||||
|
||||
|
||||
/*
|
||||
* Force the FPSIMD state shared with SVE to be updated in the SVE state
|
||||
* even if the SVE state is the current active state.
|
||||
*
|
||||
* This should only be called by ptrace. task must be non-runnable.
|
||||
* task->thread.sve_state must point to at least sve_state_size(task)
|
||||
* bytes of allocated kernel memory.
|
||||
*/
|
||||
void fpsimd_force_sync_to_sve(struct task_struct *task)
|
||||
{
|
||||
fpsimd_to_sve(task);
|
||||
}
|
||||
|
||||
/*
|
||||
* Ensure that task->thread.sve_state is up to date with respect to
|
||||
* the user task, irrespective of when SVE is in use or not.
|
||||
@ -597,7 +752,8 @@ void sve_alloc(struct task_struct *task)
|
||||
*/
|
||||
void fpsimd_sync_to_sve(struct task_struct *task)
|
||||
{
|
||||
if (!test_tsk_thread_flag(task, TIF_SVE))
|
||||
if (!test_tsk_thread_flag(task, TIF_SVE) &&
|
||||
!thread_sm_enabled(&task->thread))
|
||||
fpsimd_to_sve(task);
|
||||
}
|
||||
|
||||
@ -611,7 +767,8 @@ void fpsimd_sync_to_sve(struct task_struct *task)
|
||||
*/
|
||||
void sve_sync_to_fpsimd(struct task_struct *task)
|
||||
{
|
||||
if (test_tsk_thread_flag(task, TIF_SVE))
|
||||
if (test_tsk_thread_flag(task, TIF_SVE) ||
|
||||
thread_sm_enabled(&task->thread))
|
||||
sve_to_fpsimd(task);
|
||||
}
|
||||
|
||||
@ -636,7 +793,7 @@ void sve_sync_from_fpsimd_zeropad(struct task_struct *task)
|
||||
if (!test_tsk_thread_flag(task, TIF_SVE))
|
||||
return;
|
||||
|
||||
vq = sve_vq_from_vl(task_get_sve_vl(task));
|
||||
vq = sve_vq_from_vl(thread_get_cur_vl(&task->thread));
|
||||
|
||||
memset(sst, 0, SVE_SIG_REGS_SIZE(vq));
|
||||
__fpsimd_to_sve(sst, fst, vq);
|
||||
@ -680,8 +837,7 @@ int vec_set_vector_length(struct task_struct *task, enum vec_type type,
|
||||
/*
|
||||
* To ensure the FPSIMD bits of the SVE vector registers are preserved,
|
||||
* write any live register state back to task_struct, and convert to a
|
||||
* regular FPSIMD thread. Since the vector length can only be changed
|
||||
* with a syscall we can't be in streaming mode while reconfiguring.
|
||||
* regular FPSIMD thread.
|
||||
*/
|
||||
if (task == current) {
|
||||
get_cpu_fpsimd_context();
|
||||
@ -690,17 +846,26 @@ int vec_set_vector_length(struct task_struct *task, enum vec_type type,
|
||||
}
|
||||
|
||||
fpsimd_flush_task_state(task);
|
||||
if (test_and_clear_tsk_thread_flag(task, TIF_SVE))
|
||||
if (test_and_clear_tsk_thread_flag(task, TIF_SVE) ||
|
||||
thread_sm_enabled(&task->thread))
|
||||
sve_to_fpsimd(task);
|
||||
|
||||
if (system_supports_sme() && type == ARM64_VEC_SME) {
|
||||
task->thread.svcr &= ~(SYS_SVCR_EL0_SM_MASK |
|
||||
SYS_SVCR_EL0_ZA_MASK);
|
||||
clear_thread_flag(TIF_SME);
|
||||
}
|
||||
|
||||
if (task == current)
|
||||
put_cpu_fpsimd_context();
|
||||
|
||||
/*
|
||||
* Force reallocation of task SVE state to the correct size
|
||||
* on next use:
|
||||
* Force reallocation of task SVE and SME state to the correct
|
||||
* size on next use:
|
||||
*/
|
||||
sve_free(task);
|
||||
if (system_supports_sme() && type == ARM64_VEC_SME)
|
||||
sme_free(task);
|
||||
|
||||
task_set_vl(task, type, vl);
|
||||
|
||||
@ -761,6 +926,36 @@ int sve_get_current_vl(void)
|
||||
return vec_prctl_status(ARM64_VEC_SVE, 0);
|
||||
}
|
||||
|
||||
#ifdef CONFIG_ARM64_SME
/* PR_SME_SET_VL */
int sme_set_current_vl(unsigned long arg)
{
	unsigned long vl, flags;
	int ret;

	vl = arg & PR_SME_VL_LEN_MASK;
	flags = arg & ~vl;

	if (!system_supports_sme() || is_compat_task())
		return -EINVAL;

	ret = vec_set_vector_length(current, ARM64_VEC_SME, vl, flags);
	if (ret)
		return ret;

	return vec_prctl_status(ARM64_VEC_SME, flags);
}

/* PR_SME_GET_VL */
int sme_get_current_vl(void)
{
	if (!system_supports_sme() || is_compat_task())
		return -EINVAL;

	return vec_prctl_status(ARM64_VEC_SME, 0);
}
#endif /* CONFIG_ARM64_SME */
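For context, a minimal userspace sketch of driving these prctls (not part of the patch): it assumes the PR_SME_SET_VL, PR_SME_GET_VL and PR_SME_VL_LEN_MASK definitions added to the uapi <linux/prctl.h> by this series are visible through the installed kernel headers.

#include <stdio.h>
#include <sys/prctl.h>		/* pulls in <linux/prctl.h>; needs PR_SME_* */

int main(void)
{
	int ret;

	/* Query the current SME (streaming) vector length. */
	ret = prctl(PR_SME_GET_VL);
	if (ret < 0) {
		perror("PR_SME_GET_VL");
		return 1;
	}
	printf("current SME VL: %d bytes\n", ret & PR_SME_VL_LEN_MASK);

	/*
	 * Request a 32-byte (256-bit) SME vector length; as in
	 * vec_set_vector_length() above, the kernel may clamp this to the
	 * nearest supported value, which is reported back in the result.
	 */
	ret = prctl(PR_SME_SET_VL, 32);
	if (ret < 0) {
		perror("PR_SME_SET_VL");
		return 1;
	}
	printf("SME VL now: %d bytes\n", ret & PR_SME_VL_LEN_MASK);

	return 0;
}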
|
||||
|
||||
static void vec_probe_vqs(struct vl_info *info,
|
||||
DECLARE_BITMAP(map, SVE_VQ_MAX))
|
||||
{
|
||||
@ -770,7 +965,23 @@ static void vec_probe_vqs(struct vl_info *info,
|
||||
|
||||
for (vq = SVE_VQ_MAX; vq >= SVE_VQ_MIN; --vq) {
|
||||
write_vl(info->type, vq - 1); /* self-syncing */
|
||||
vl = sve_get_vl();
|
||||
|
||||
switch (info->type) {
|
||||
case ARM64_VEC_SVE:
|
||||
vl = sve_get_vl();
|
||||
break;
|
||||
case ARM64_VEC_SME:
|
||||
vl = sme_get_vl();
|
||||
break;
|
||||
default:
|
||||
vl = 0;
|
||||
break;
|
||||
}
|
||||
|
||||
/* Minimum VL identified? */
|
||||
if (sve_vq_from_vl(vl) > vq)
|
||||
break;
|
||||
|
||||
vq = sve_vq_from_vl(vl); /* skip intervening lengths */
|
||||
set_bit(__vq_to_bit(vq), map);
|
||||
}
|
||||
@ -856,21 +1067,25 @@ int vec_verify_vq_map(enum vec_type type)
|
||||
|
||||
static void __init sve_efi_setup(void)
|
||||
{
|
||||
struct vl_info *info = &vl_info[ARM64_VEC_SVE];
|
||||
int max_vl = 0;
|
||||
int i;
|
||||
|
||||
if (!IS_ENABLED(CONFIG_EFI))
|
||||
return;
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(vl_info); i++)
|
||||
max_vl = max(vl_info[i].max_vl, max_vl);
|
||||
|
||||
/*
|
||||
* alloc_percpu() warns and prints a backtrace if this goes wrong.
|
||||
* This is evidence of a crippled system and we are returning void,
|
||||
* so no attempt is made to handle this situation here.
|
||||
*/
|
||||
if (!sve_vl_valid(info->max_vl))
|
||||
if (!sve_vl_valid(max_vl))
|
||||
goto fail;
|
||||
|
||||
efi_sve_state = __alloc_percpu(
|
||||
SVE_SIG_REGS_SIZE(sve_vq_from_vl(info->max_vl)), SVE_VQ_BYTES);
|
||||
SVE_SIG_REGS_SIZE(sve_vq_from_vl(max_vl)), SVE_VQ_BYTES);
|
||||
if (!efi_sve_state)
|
||||
goto fail;
|
||||
|
||||
@ -989,10 +1204,172 @@ void __init sve_setup(void)
|
||||
void fpsimd_release_task(struct task_struct *dead_task)
|
||||
{
|
||||
__sve_free(dead_task);
|
||||
sme_free(dead_task);
|
||||
}
|
||||
|
||||
#endif /* CONFIG_ARM64_SVE */
|
||||
|
||||
#ifdef CONFIG_ARM64_SME
|
||||
|
||||
/*
|
||||
* Ensure that task->thread.za_state is allocated and sufficiently large.
|
||||
*
|
||||
* This function should be used only in preparation for replacing
|
||||
* task->thread.za_state with new data. The memory is always zeroed
|
||||
* here to prevent stale data from showing through: this is done in
|
||||
* the interest of testability and predictability, the architecture
|
||||
* guarantees that when ZA is enabled it will be zeroed.
|
||||
*/
|
||||
void sme_alloc(struct task_struct *task)
|
||||
{
|
||||
if (task->thread.za_state) {
|
||||
memset(task->thread.za_state, 0, za_state_size(task));
|
||||
return;
|
||||
}
|
||||
|
||||
/* This could potentially be up to 64K. */
|
||||
task->thread.za_state =
|
||||
kzalloc(za_state_size(task), GFP_KERNEL);
|
||||
}
|
||||
|
||||
static void sme_free(struct task_struct *task)
|
||||
{
|
||||
kfree(task->thread.za_state);
|
||||
task->thread.za_state = NULL;
|
||||
}
|
||||
|
||||
void sme_kernel_enable(const struct arm64_cpu_capabilities *__always_unused p)
|
||||
{
|
||||
/* Set priority for all PEs to architecturally defined minimum */
|
||||
write_sysreg_s(read_sysreg_s(SYS_SMPRI_EL1) & ~SMPRI_EL1_PRIORITY_MASK,
|
||||
SYS_SMPRI_EL1);
|
||||
|
||||
/* Allow SME in kernel */
|
||||
write_sysreg(read_sysreg(CPACR_EL1) | CPACR_EL1_SMEN_EL1EN, CPACR_EL1);
|
||||
isb();
|
||||
|
||||
/* Allow EL0 to access TPIDR2 */
|
||||
write_sysreg(read_sysreg(SCTLR_EL1) | SCTLR_ELx_ENTP2, SCTLR_EL1);
|
||||
isb();
|
||||
}
|
||||
|
||||
/*
|
||||
* This must be called after sme_kernel_enable(), we rely on the
|
||||
* feature table being sorted to ensure this.
|
||||
*/
|
||||
void fa64_kernel_enable(const struct arm64_cpu_capabilities *__always_unused p)
|
||||
{
|
||||
/* Allow use of FA64 */
|
||||
write_sysreg_s(read_sysreg_s(SYS_SMCR_EL1) | SMCR_ELx_FA64_MASK,
|
||||
SYS_SMCR_EL1);
|
||||
}
|
||||
|
||||
/*
|
||||
* Read the pseudo-SMCR used by cpufeatures to identify the supported
|
||||
* vector length.
|
||||
*
|
||||
* Use only if SME is present.
|
||||
* This function clobbers the SME vector length.
|
||||
*/
|
||||
u64 read_smcr_features(void)
|
||||
{
|
||||
u64 smcr;
|
||||
unsigned int vq_max;
|
||||
|
||||
sme_kernel_enable(NULL);
|
||||
sme_smstart_sm();
|
||||
|
||||
/*
|
||||
* Set the maximum possible VL.
|
||||
*/
|
||||
write_sysreg_s(read_sysreg_s(SYS_SMCR_EL1) | SMCR_ELx_LEN_MASK,
|
||||
SYS_SMCR_EL1);
|
||||
|
||||
smcr = read_sysreg_s(SYS_SMCR_EL1);
|
||||
smcr &= ~(u64)SMCR_ELx_LEN_MASK; /* Only the LEN field */
|
||||
vq_max = sve_vq_from_vl(sve_get_vl());
|
||||
smcr |= vq_max - 1; /* set LEN field to maximum effective value */
|
||||
|
||||
sme_smstop_sm();
|
||||
|
||||
return smcr;
|
||||
}
|
||||
|
||||
void __init sme_setup(void)
|
||||
{
|
||||
struct vl_info *info = &vl_info[ARM64_VEC_SME];
|
||||
u64 smcr;
|
||||
int min_bit;
|
||||
|
||||
if (!system_supports_sme())
|
||||
return;
|
||||
|
||||
/*
|
||||
* SME doesn't require any particular vector length be
|
||||
* supported but it does require at least one. We should have
|
||||
* disabled the feature entirely while bringing up CPUs but
|
||||
* let's double check here.
|
||||
*/
|
||||
WARN_ON(bitmap_empty(info->vq_map, SVE_VQ_MAX));
|
||||
|
||||
min_bit = find_last_bit(info->vq_map, SVE_VQ_MAX);
|
||||
info->min_vl = sve_vl_from_vq(__bit_to_vq(min_bit));
|
||||
|
||||
smcr = read_sanitised_ftr_reg(SYS_SMCR_EL1);
|
||||
info->max_vl = sve_vl_from_vq((smcr & SMCR_ELx_LEN_MASK) + 1);
|
||||
|
||||
/*
|
||||
* Sanity-check that the max VL we determined through CPU features
|
||||
* corresponds properly to sme_vq_map. If not, do our best:
|
||||
*/
|
||||
if (WARN_ON(info->max_vl != find_supported_vector_length(ARM64_VEC_SME,
|
||||
info->max_vl)))
|
||||
info->max_vl = find_supported_vector_length(ARM64_VEC_SME,
|
||||
info->max_vl);
|
||||
|
||||
WARN_ON(info->min_vl > info->max_vl);
|
||||
|
||||
/*
|
||||
* For the default VL, pick the maximum supported value <= 32
|
||||
* (256 bits) if there is one since this is guaranteed not to
|
||||
* grow the signal frame when in streaming mode, otherwise the
|
||||
* minimum available VL will be used.
|
||||
*/
|
||||
set_sme_default_vl(find_supported_vector_length(ARM64_VEC_SME, 32));
|
||||
|
||||
pr_info("SME: minimum available vector length %u bytes per vector\n",
|
||||
info->min_vl);
|
||||
pr_info("SME: maximum available vector length %u bytes per vector\n",
|
||||
info->max_vl);
|
||||
pr_info("SME: default vector length %u bytes per vector\n",
|
||||
get_sme_default_vl());
|
||||
}
|
||||
|
||||
#endif /* CONFIG_ARM64_SME */
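Userspace is expected to check for the feature that sme_setup() reports before using any of the interfaces above; a minimal probe might look like the sketch below. This is not part of the patch, and it assumes the HWCAP2_SME bit that the SME series adds to the uapi <asm/hwcap.h>.

#include <stdio.h>
#include <sys/auxv.h>
#include <asm/hwcap.h>		/* HWCAP2_SME: needs headers from this series */

int main(void)
{
	unsigned long hwcap2 = getauxval(AT_HWCAP2);

	if (hwcap2 & HWCAP2_SME)
		printf("SME supported; PR_SME_* and the SME regsets are usable\n");
	else
		printf("SME not supported\n");

	return 0;
}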
|
||||
|
||||
static void sve_init_regs(void)
|
||||
{
|
||||
/*
|
||||
* Convert the FPSIMD state to SVE, zeroing all the state that
|
||||
* is not shared with FPSIMD. If (as is likely) the current
|
||||
* state is live in the registers then do this there and
|
||||
* update our metadata for the current task including
|
||||
* disabling the trap, otherwise update our in-memory copy.
|
||||
* We are guaranteed to not be in streaming mode, we can only
|
||||
* take a SVE trap when not in streaming mode and we can't be
|
||||
* in streaming mode when taking a SME trap.
|
||||
*/
|
||||
if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
|
||||
unsigned long vq_minus_one =
|
||||
sve_vq_from_vl(task_get_sve_vl(current)) - 1;
|
||||
sve_set_vq(vq_minus_one);
|
||||
sve_flush_live(true, vq_minus_one);
|
||||
fpsimd_bind_task_to_cpu();
|
||||
} else {
|
||||
fpsimd_to_sve(current);
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Trapped SVE access
|
||||
*
|
||||
@ -1024,22 +1401,77 @@ void do_sve_acc(unsigned int esr, struct pt_regs *regs)
|
||||
WARN_ON(1); /* SVE access shouldn't have trapped */
|
||||
|
||||
/*
|
||||
* Convert the FPSIMD state to SVE, zeroing all the state that
|
||||
* is not shared with FPSIMD. If (as is likely) the current
|
||||
* state is live in the registers then do this there and
|
||||
* update our metadata for the current task including
|
||||
* disabling the trap, otherwise update our in-memory copy.
|
||||
* Even if the task can have used streaming mode we can only
|
||||
* generate SVE access traps in normal SVE mode and
|
||||
* transitioning out of streaming mode may discard any
|
||||
* streaming mode state. Always clear the high bits to avoid
|
||||
* any potential errors tracking what is properly initialised.
|
||||
*/
|
||||
sve_init_regs();
|
||||
|
||||
put_cpu_fpsimd_context();
|
||||
}
|
||||
|
||||
/*
|
||||
* Trapped SME access
|
||||
*
|
||||
* Storage is allocated for the full SVE and SME state, the current
|
||||
* FPSIMD register contents are migrated to SVE if SVE is not already
|
||||
* active, and the access trap is disabled.
|
||||
*
|
||||
* TIF_SME should be clear on entry: otherwise, fpsimd_restore_current_state()
|
||||
* would have disabled the SME access trap for userspace during
|
||||
* ret_to_user, making an SVE access trap impossible in that case.
|
||||
*/
|
||||
void do_sme_acc(unsigned int esr, struct pt_regs *regs)
|
||||
{
|
||||
/* Even if we chose not to use SME, the hardware could still trap: */
|
||||
if (unlikely(!system_supports_sme()) || WARN_ON(is_compat_task())) {
|
||||
force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0);
|
||||
return;
|
||||
}
|
||||
|
||||
/*
|
||||
* If this is not a trap due to SME being disabled then something
|
||||
* is being used in the wrong mode, report as SIGILL.
|
||||
*/
|
||||
if (ESR_ELx_ISS(esr) != ESR_ELx_SME_ISS_SME_DISABLED) {
|
||||
force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0);
|
||||
return;
|
||||
}
|
||||
|
||||
sve_alloc(current);
|
||||
sme_alloc(current);
|
||||
if (!current->thread.sve_state || !current->thread.za_state) {
|
||||
force_sig(SIGKILL);
|
||||
return;
|
||||
}
|
||||
|
||||
get_cpu_fpsimd_context();
|
||||
|
||||
/* With TIF_SME userspace shouldn't generate any traps */
|
||||
if (test_and_set_thread_flag(TIF_SME))
|
||||
WARN_ON(1);
|
||||
|
||||
if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
|
||||
unsigned long vq_minus_one =
|
||||
sve_vq_from_vl(task_get_sve_vl(current)) - 1;
|
||||
sve_set_vq(vq_minus_one);
|
||||
sve_flush_live(true, vq_minus_one);
|
||||
sve_vq_from_vl(task_get_sme_vl(current)) - 1;
|
||||
sme_set_vq(vq_minus_one);
|
||||
|
||||
fpsimd_bind_task_to_cpu();
|
||||
} else {
|
||||
fpsimd_to_sve(current);
|
||||
}
|
||||
|
||||
/*
|
||||
* If SVE was not already active initialise the SVE registers,
|
||||
* any non-shared state between the streaming and regular SVE
|
||||
* registers is architecturally guaranteed to be zeroed when
|
||||
* we enter streaming mode. We do not need to initialize ZA
|
||||
* since ZA must be disabled at this point and enabling ZA is
|
||||
* architecturally defined to zero ZA.
|
||||
*/
|
||||
if (system_supports_sve() && !test_thread_flag(TIF_SVE))
|
||||
sve_init_regs();
|
||||
|
||||
put_cpu_fpsimd_context();
|
||||
}
|
||||
|
||||
@ -1141,6 +1573,9 @@ static void fpsimd_flush_thread_vl(enum vec_type type)
|
||||
|
||||
void fpsimd_flush_thread(void)
|
||||
{
|
||||
void *sve_state = NULL;
|
||||
void *za_state = NULL;
|
||||
|
||||
if (!system_supports_fpsimd())
|
||||
return;
|
||||
|
||||
@ -1152,11 +1587,28 @@ void fpsimd_flush_thread(void)
|
||||
|
||||
if (system_supports_sve()) {
|
||||
clear_thread_flag(TIF_SVE);
|
||||
sve_free(current);
|
||||
|
||||
/* Defer kfree() while in atomic context */
|
||||
sve_state = current->thread.sve_state;
|
||||
current->thread.sve_state = NULL;
|
||||
|
||||
fpsimd_flush_thread_vl(ARM64_VEC_SVE);
|
||||
}
|
||||
|
||||
if (system_supports_sme()) {
|
||||
clear_thread_flag(TIF_SME);
|
||||
|
||||
/* Defer kfree() while in atomic context */
|
||||
za_state = current->thread.za_state;
|
||||
current->thread.za_state = NULL;
|
||||
|
||||
fpsimd_flush_thread_vl(ARM64_VEC_SME);
|
||||
current->thread.svcr = 0;
|
||||
}
|
||||
|
||||
put_cpu_fpsimd_context();
|
||||
kfree(sve_state);
|
||||
kfree(za_state);
|
||||
}
|
||||
|
||||
/*
|
||||
@ -1198,22 +1650,34 @@ static void fpsimd_bind_task_to_cpu(void)
|
||||
WARN_ON(!system_supports_fpsimd());
|
||||
last->st = ¤t->thread.uw.fpsimd_state;
|
||||
last->sve_state = current->thread.sve_state;
|
||||
last->za_state = current->thread.za_state;
|
||||
last->sve_vl = task_get_sve_vl(current);
|
||||
last->sme_vl = task_get_sme_vl(current);
|
||||
last->svcr = ¤t->thread.svcr;
|
||||
current->thread.fpsimd_cpu = smp_processor_id();
|
||||
|
||||
/*
|
||||
* Toggle SVE and SME trapping for userspace if needed, these
* are serialised by ret_to_user().
|
||||
*/
|
||||
if (system_supports_sme()) {
|
||||
if (test_thread_flag(TIF_SME))
|
||||
sme_user_enable();
|
||||
else
|
||||
sme_user_disable();
|
||||
}
|
||||
|
||||
if (system_supports_sve()) {
|
||||
/* Toggle SVE trapping for userspace if needed */
|
||||
if (test_thread_flag(TIF_SVE))
|
||||
sve_user_enable();
|
||||
else
|
||||
sve_user_disable();
|
||||
|
||||
/* Serialised by exception return to user */
|
||||
}
|
||||
}
|
||||
|
||||
void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state,
|
||||
unsigned int sve_vl)
|
||||
unsigned int sve_vl, void *za_state,
|
||||
unsigned int sme_vl, u64 *svcr)
|
||||
{
|
||||
struct fpsimd_last_state_struct *last =
|
||||
this_cpu_ptr(&fpsimd_last_state);
|
||||
@ -1222,8 +1686,11 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state,
|
||||
WARN_ON(!in_softirq() && !irqs_disabled());
|
||||
|
||||
last->st = st;
|
||||
last->svcr = svcr;
|
||||
last->sve_state = sve_state;
|
||||
last->za_state = za_state;
|
||||
last->sve_vl = sve_vl;
|
||||
last->sme_vl = sme_vl;
|
||||
}
|
||||
|
||||
/*
|
||||
@ -1320,6 +1787,15 @@ static void fpsimd_flush_cpu_state(void)
|
||||
{
|
||||
WARN_ON(!system_supports_fpsimd());
|
||||
__this_cpu_write(fpsimd_last_state.st, NULL);
|
||||
|
||||
/*
|
||||
* Leaving streaming mode enabled will cause issues for any kernel
|
||||
* NEON and leaving streaming mode or ZA enabled may increase power
|
||||
* consumption.
|
||||
*/
|
||||
if (system_supports_sme())
|
||||
sme_smstop();
|
||||
|
||||
set_thread_flag(TIF_FOREIGN_FPSTATE);
|
||||
}
|
||||
|
||||
@ -1397,6 +1873,7 @@ EXPORT_SYMBOL(kernel_neon_end);
|
||||
static DEFINE_PER_CPU(struct user_fpsimd_state, efi_fpsimd_state);
|
||||
static DEFINE_PER_CPU(bool, efi_fpsimd_state_used);
|
||||
static DEFINE_PER_CPU(bool, efi_sve_state_used);
|
||||
static DEFINE_PER_CPU(bool, efi_sm_state);
|
||||
|
||||
/*
|
||||
* EFI runtime services support functions
|
||||
@ -1431,12 +1908,28 @@ void __efi_fpsimd_begin(void)
|
||||
*/
|
||||
if (system_supports_sve() && likely(efi_sve_state)) {
|
||||
char *sve_state = this_cpu_ptr(efi_sve_state);
|
||||
bool ffr = true;
|
||||
u64 svcr;
|
||||
|
||||
__this_cpu_write(efi_sve_state_used, true);
|
||||
|
||||
if (system_supports_sme()) {
|
||||
svcr = read_sysreg_s(SYS_SVCR_EL0);
|
||||
|
||||
if (!system_supports_fa64())
|
||||
ffr = svcr & SYS_SVCR_EL0_SM_MASK;
|
||||
|
||||
__this_cpu_write(efi_sm_state, ffr);
|
||||
}
|
||||
|
||||
sve_save_state(sve_state + sve_ffr_offset(sve_max_vl()),
|
||||
&this_cpu_ptr(&efi_fpsimd_state)->fpsr,
|
||||
true);
|
||||
ffr);
|
||||
|
||||
if (system_supports_sme())
|
||||
sysreg_clear_set_s(SYS_SVCR_EL0,
|
||||
SYS_SVCR_EL0_SM_MASK, 0);
|
||||
|
||||
} else {
|
||||
fpsimd_save_state(this_cpu_ptr(&efi_fpsimd_state));
|
||||
}
|
||||
@ -1459,11 +1952,26 @@ void __efi_fpsimd_end(void)
|
||||
if (system_supports_sve() &&
|
||||
likely(__this_cpu_read(efi_sve_state_used))) {
|
||||
char const *sve_state = this_cpu_ptr(efi_sve_state);
|
||||
bool ffr = true;
|
||||
|
||||
/*
|
||||
* Restore streaming mode; EFI calls are
|
||||
* normal function calls so should not return in
|
||||
* streaming mode.
|
||||
*/
|
||||
if (system_supports_sme()) {
|
||||
if (__this_cpu_read(efi_sm_state)) {
|
||||
sysreg_clear_set_s(SYS_SVCR_EL0,
|
||||
0,
|
||||
SYS_SVCR_EL0_SM_MASK);
|
||||
if (!system_supports_fa64())
|
||||
ffr = efi_sm_state;
|
||||
}
|
||||
}
|
||||
|
||||
sve_set_vq(sve_vq_from_vl(sve_get_vl()) - 1);
|
||||
sve_load_state(sve_state + sve_ffr_offset(sve_max_vl()),
|
||||
&this_cpu_ptr(&efi_fpsimd_state)->fpsr,
|
||||
true);
|
||||
ffr);
|
||||
|
||||
__this_cpu_write(efi_sve_state_used, false);
|
||||
} else {
|
||||
@ -1538,6 +2046,13 @@ static int __init fpsimd_init(void)
|
||||
if (!cpu_have_named_feature(ASIMD))
|
||||
pr_notice("Advanced SIMD is not implemented\n");
|
||||
|
||||
return sve_sysctl_init();
|
||||
|
||||
if (cpu_have_named_feature(SME) && !cpu_have_named_feature(SVE))
|
||||
pr_notice("SME is implemented but not SVE\n");
|
||||
|
||||
sve_sysctl_init();
|
||||
sme_sysctl_init();
|
||||
|
||||
return 0;
|
||||
}
|
||||
core_initcall(fpsimd_init);
|
||||
|
@ -268,6 +268,22 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
|
||||
}
|
||||
|
||||
#ifdef CONFIG_DYNAMIC_FTRACE
|
||||
|
||||
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
|
||||
void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
|
||||
struct ftrace_ops *op, struct ftrace_regs *fregs)
|
||||
{
|
||||
/*
|
||||
* When DYNAMIC_FTRACE_WITH_REGS is selected, `fregs` can never be NULL
|
||||
* and arch_ftrace_get_regs(fregs) will always give a non-NULL pt_regs
|
||||
* in which we can safely modify the LR.
|
||||
*/
|
||||
struct pt_regs *regs = arch_ftrace_get_regs(fregs);
|
||||
unsigned long *parent = (unsigned long *)&procedure_link_pointer(regs);
|
||||
|
||||
prepare_ftrace_return(ip, parent, frame_pointer(regs));
|
||||
}
|
||||
#else
|
||||
/*
|
||||
* Turn on/off the call to ftrace_graph_caller() in ftrace_caller()
|
||||
* depending on @enable.
|
||||
@ -297,5 +313,6 @@ int ftrace_disable_ftrace_graph_caller(void)
|
||||
{
|
||||
return ftrace_modify_graph_caller(false);
|
||||
}
|
||||
#endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
|
||||
#endif /* CONFIG_DYNAMIC_FTRACE */
|
||||
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
|
||||
|
@ -329,8 +329,13 @@ bool crash_is_nosave(unsigned long pfn)
|
||||
|
||||
/* in reserved memory? */
|
||||
addr = __pfn_to_phys(pfn);
|
||||
if ((addr < crashk_res.start) || (crashk_res.end < addr))
|
||||
return false;
|
||||
if ((addr < crashk_res.start) || (crashk_res.end < addr)) {
|
||||
if (!crashk_low_res.end)
|
||||
return false;
|
||||
|
||||
if ((addr < crashk_low_res.start) || (crashk_low_res.end < addr))
|
||||
return false;
|
||||
}
|
||||
|
||||
if (!kexec_crash_image)
|
||||
return true;
|
||||
|
@ -65,10 +65,18 @@ static int prepare_elf_headers(void **addr, unsigned long *sz)
|
||||
|
||||
/* Exclude crashkernel region */
|
||||
ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end);
|
||||
if (ret)
|
||||
goto out;
|
||||
|
||||
if (!ret)
|
||||
ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
|
||||
if (crashk_low_res.end) {
|
||||
ret = crash_exclude_mem_range(cmem, crashk_low_res.start, crashk_low_res.end);
|
||||
if (ret)
|
||||
goto out;
|
||||
}
|
||||
|
||||
ret = crash_prepare_elf64_headers(cmem, true, addr, sz);
|
||||
|
||||
out:
|
||||
kfree(cmem);
|
||||
return ret;
|
||||
}
|
||||
|
@ -15,6 +15,7 @@
|
||||
#include <linux/swapops.h>
|
||||
#include <linux/thread_info.h>
|
||||
#include <linux/types.h>
|
||||
#include <linux/uaccess.h>
|
||||
#include <linux/uio.h>
|
||||
|
||||
#include <asm/barrier.h>
|
||||
@ -543,3 +544,32 @@ static int register_mte_tcf_preferred_sysctl(void)
|
||||
return 0;
|
||||
}
|
||||
subsys_initcall(register_mte_tcf_preferred_sysctl);
|
||||
|
||||
/*
 * Return 0 on success, the number of bytes not probed otherwise.
 */
size_t mte_probe_user_range(const char __user *uaddr, size_t size)
{
	const char __user *end = uaddr + size;
	int err = 0;
	char val;

	__raw_get_user(val, uaddr, err);
	if (err)
		return size;

	uaddr = PTR_ALIGN(uaddr, MTE_GRANULE_SIZE);
	while (uaddr < end) {
		/*
		 * A read is sufficient for mte, the caller should have probed
		 * for the pte write permission if required.
		 */
		__raw_get_user(val, uaddr, err);
		if (err)
			return end - uaddr;
		uaddr += MTE_GRANULE_SIZE;
	}
	(void)val;

	return 0;
}
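mte_probe_user_range() is what backs fault_in_subpage_writeable() on arm64. A hedged kernel-side sketch of the intended calling pattern, modelled on the search_ioctl() live-lock fix, is below; the helper name and parameters are illustrative, only fault_in_subpage_writeable() and copy_to_user_nofault() are real interfaces.

#include <linux/pagemap.h>	/* fault_in_subpage_writeable() */
#include <linux/uaccess.h>	/* copy_to_user_nofault() */

/* Illustrative helper, not part of the patch. */
static int copy_item_to_user(void __user *ubuf, const void *item, size_t len)
{
	while (copy_to_user_nofault(ubuf, item, len)) {
		/*
		 * Probe the destination at sub-page (MTE granule)
		 * granularity: if part of it can never be made writable,
		 * fail instead of retrying forever on a fault that
		 * faulting in the whole page cannot fix.
		 */
		if (fault_in_subpage_writeable(ubuf, len))
			return -EFAULT;
	}

	return 0;
}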
|
||||
|
@ -250,6 +250,8 @@ void show_regs(struct pt_regs *regs)
|
||||
static void tls_thread_flush(void)
|
||||
{
|
||||
write_sysreg(0, tpidr_el0);
|
||||
if (system_supports_tpidr2())
|
||||
write_sysreg_s(0, SYS_TPIDR2_EL0);
|
||||
|
||||
if (is_compat_task()) {
|
||||
current->thread.uw.tp_value = 0;
|
||||
@ -298,16 +300,42 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
|
||||
|
||||
/*
|
||||
* Detach src's sve_state (if any) from dst so that it does not
|
||||
* get erroneously used or freed prematurely. dst's sve_state
|
||||
* get erroneously used or freed prematurely. dst's copies
|
||||
* will be allocated on demand later on if dst uses SVE.
|
||||
* For consistency, also clear TIF_SVE here: this could be done
|
||||
* later in copy_process(), but to avoid tripping up future
|
||||
* maintainers it is best not to leave TIF_SVE and sve_state in
|
||||
* maintainers it is best not to leave TIF flags and buffers in
|
||||
* an inconsistent state, even temporarily.
|
||||
*/
|
||||
dst->thread.sve_state = NULL;
|
||||
clear_tsk_thread_flag(dst, TIF_SVE);
|
||||
|
||||
/*
|
||||
* In the unlikely event that we create a new thread with ZA
|
||||
* enabled we should retain the ZA state so duplicate it here.
|
||||
* This may be shortly freed if we exec() or if CLONE_SETTLS
|
||||
* but it's simpler to do it here. To avoid confusing the rest
|
||||
* of the code ensure that we have a sve_state allocated
|
||||
* whenever za_state is allocated.
|
||||
*/
|
||||
if (thread_za_enabled(&src->thread)) {
|
||||
dst->thread.sve_state = kzalloc(sve_state_size(src),
|
||||
GFP_KERNEL);
|
||||
if (!dst->thread.sve_state)
|
||||
return -ENOMEM;
|
||||
dst->thread.za_state = kmemdup(src->thread.za_state,
|
||||
za_state_size(src),
|
||||
GFP_KERNEL);
|
||||
if (!dst->thread.za_state) {
|
||||
kfree(dst->thread.sve_state);
|
||||
dst->thread.sve_state = NULL;
|
||||
return -ENOMEM;
|
||||
}
|
||||
} else {
|
||||
dst->thread.za_state = NULL;
|
||||
clear_tsk_thread_flag(dst, TIF_SME);
|
||||
}
|
||||
|
||||
/* clear any pending asynchronous tag fault raised by the parent */
|
||||
clear_tsk_thread_flag(dst, TIF_MTE_ASYNC_FAULT);
|
||||
|
||||
@ -343,6 +371,8 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
|
||||
* out-of-sync with the saved value.
|
||||
*/
|
||||
*task_user_tls(p) = read_sysreg(tpidr_el0);
|
||||
if (system_supports_tpidr2())
|
||||
p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
|
||||
|
||||
if (stack_start) {
|
||||
if (is_compat_thread(task_thread_info(p)))
|
||||
@ -353,10 +383,12 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
|
||||
|
||||
/*
|
||||
* If a TLS pointer was passed to clone, use it for the new
|
||||
* thread.
|
||||
* thread. We also reset TPIDR2 if it's in use.
|
||||
*/
|
||||
if (clone_flags & CLONE_SETTLS)
|
||||
if (clone_flags & CLONE_SETTLS) {
|
||||
p->thread.uw.tp_value = tls;
|
||||
p->thread.tpidr2_el0 = 0;
|
||||
}
|
||||
} else {
|
||||
/*
|
||||
* A kthread has no context to ERET to, so ensure any buggy
|
||||
@ -387,6 +419,8 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
|
||||
void tls_preserve_current_state(void)
|
||||
{
|
||||
*task_user_tls(current) = read_sysreg(tpidr_el0);
|
||||
if (system_supports_tpidr2() && !is_compat_task())
|
||||
current->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
|
||||
}
|
||||
|
||||
static void tls_thread_switch(struct task_struct *next)
|
||||
@ -399,6 +433,8 @@ static void tls_thread_switch(struct task_struct *next)
|
||||
write_sysreg(0, tpidrro_el0);
|
||||
|
||||
write_sysreg(*task_user_tls(next), tpidr_el0);
|
||||
if (system_supports_tpidr2())
|
||||
write_sysreg_s(next->thread.tpidr2_el0, SYS_TPIDR2_EL0);
|
||||
}
|
||||
|
||||
/*
|
||||
|
@ -713,21 +713,51 @@ static int system_call_set(struct task_struct *target,
|
||||
#ifdef CONFIG_ARM64_SVE
|
||||
|
||||
static void sve_init_header_from_task(struct user_sve_header *header,
|
||||
struct task_struct *target)
|
||||
struct task_struct *target,
|
||||
enum vec_type type)
|
||||
{
|
||||
unsigned int vq;
|
||||
bool active;
|
||||
bool fpsimd_only;
|
||||
enum vec_type task_type;
|
||||
|
||||
memset(header, 0, sizeof(*header));
|
||||
|
||||
header->flags = test_tsk_thread_flag(target, TIF_SVE) ?
|
||||
SVE_PT_REGS_SVE : SVE_PT_REGS_FPSIMD;
|
||||
if (test_tsk_thread_flag(target, TIF_SVE_VL_INHERIT))
|
||||
header->flags |= SVE_PT_VL_INHERIT;
|
||||
/* Check if the requested registers are active for the task */
|
||||
if (thread_sm_enabled(&target->thread))
|
||||
task_type = ARM64_VEC_SME;
|
||||
else
|
||||
task_type = ARM64_VEC_SVE;
|
||||
active = (task_type == type);
|
||||
|
||||
header->vl = task_get_sve_vl(target);
|
||||
switch (type) {
|
||||
case ARM64_VEC_SVE:
|
||||
if (test_tsk_thread_flag(target, TIF_SVE_VL_INHERIT))
|
||||
header->flags |= SVE_PT_VL_INHERIT;
|
||||
fpsimd_only = !test_tsk_thread_flag(target, TIF_SVE);
|
||||
break;
|
||||
case ARM64_VEC_SME:
|
||||
if (test_tsk_thread_flag(target, TIF_SME_VL_INHERIT))
|
||||
header->flags |= SVE_PT_VL_INHERIT;
|
||||
fpsimd_only = false;
|
||||
break;
|
||||
default:
|
||||
WARN_ON_ONCE(1);
|
||||
return;
|
||||
}
|
||||
|
||||
if (active) {
|
||||
if (fpsimd_only) {
|
||||
header->flags |= SVE_PT_REGS_FPSIMD;
|
||||
} else {
|
||||
header->flags |= SVE_PT_REGS_SVE;
|
||||
}
|
||||
}
|
||||
|
||||
header->vl = task_get_vl(target, type);
|
||||
vq = sve_vq_from_vl(header->vl);
|
||||
|
||||
header->max_vl = sve_max_vl();
|
||||
header->max_vl = vec_max_vl(type);
|
||||
header->size = SVE_PT_SIZE(vq, header->flags);
|
||||
header->max_size = SVE_PT_SIZE(sve_vq_from_vl(header->max_vl),
|
||||
SVE_PT_REGS_SVE);
|
||||
@ -738,19 +768,17 @@ static unsigned int sve_size_from_header(struct user_sve_header const *header)
|
||||
return ALIGN(header->size, SVE_VQ_BYTES);
|
||||
}
|
||||
|
||||
static int sve_get(struct task_struct *target,
|
||||
const struct user_regset *regset,
|
||||
struct membuf to)
|
||||
static int sve_get_common(struct task_struct *target,
|
||||
const struct user_regset *regset,
|
||||
struct membuf to,
|
||||
enum vec_type type)
|
||||
{
|
||||
struct user_sve_header header;
|
||||
unsigned int vq;
|
||||
unsigned long start, end;
|
||||
|
||||
if (!system_supports_sve())
|
||||
return -EINVAL;
|
||||
|
||||
/* Header */
|
||||
sve_init_header_from_task(&header, target);
|
||||
sve_init_header_from_task(&header, target, type);
|
||||
vq = sve_vq_from_vl(header.vl);
|
||||
|
||||
membuf_write(&to, &header, sizeof(header));
|
||||
@ -758,49 +786,61 @@ static int sve_get(struct task_struct *target,
|
||||
if (target == current)
|
||||
fpsimd_preserve_current_state();
|
||||
|
||||
/* Registers: FPSIMD-only case */
|
||||
|
||||
BUILD_BUG_ON(SVE_PT_FPSIMD_OFFSET != sizeof(header));
|
||||
if ((header.flags & SVE_PT_REGS_MASK) == SVE_PT_REGS_FPSIMD)
|
||||
BUILD_BUG_ON(SVE_PT_SVE_OFFSET != sizeof(header));
|
||||
|
||||
switch ((header.flags & SVE_PT_REGS_MASK)) {
|
||||
case SVE_PT_REGS_FPSIMD:
|
||||
return __fpr_get(target, regset, to);
|
||||
|
||||
/* Otherwise: full SVE case */
|
||||
case SVE_PT_REGS_SVE:
|
||||
start = SVE_PT_SVE_OFFSET;
|
||||
end = SVE_PT_SVE_FFR_OFFSET(vq) + SVE_PT_SVE_FFR_SIZE(vq);
|
||||
membuf_write(&to, target->thread.sve_state, end - start);
|
||||
|
||||
BUILD_BUG_ON(SVE_PT_SVE_OFFSET != sizeof(header));
|
||||
start = SVE_PT_SVE_OFFSET;
|
||||
end = SVE_PT_SVE_FFR_OFFSET(vq) + SVE_PT_SVE_FFR_SIZE(vq);
|
||||
membuf_write(&to, target->thread.sve_state, end - start);
|
||||
start = end;
|
||||
end = SVE_PT_SVE_FPSR_OFFSET(vq);
|
||||
membuf_zero(&to, end - start);
|
||||
|
||||
start = end;
|
||||
end = SVE_PT_SVE_FPSR_OFFSET(vq);
|
||||
membuf_zero(&to, end - start);
|
||||
/*
|
||||
* Copy fpsr, and fpcr which must follow contiguously in
|
||||
* struct fpsimd_state:
|
||||
*/
|
||||
start = end;
|
||||
end = SVE_PT_SVE_FPCR_OFFSET(vq) + SVE_PT_SVE_FPCR_SIZE;
|
||||
membuf_write(&to, &target->thread.uw.fpsimd_state.fpsr,
|
||||
end - start);
|
||||
|
||||
/*
|
||||
* Copy fpsr, and fpcr which must follow contiguously in
|
||||
* struct fpsimd_state:
|
||||
*/
|
||||
start = end;
|
||||
end = SVE_PT_SVE_FPCR_OFFSET(vq) + SVE_PT_SVE_FPCR_SIZE;
|
||||
membuf_write(&to, &target->thread.uw.fpsimd_state.fpsr, end - start);
|
||||
start = end;
|
||||
end = sve_size_from_header(&header);
|
||||
return membuf_zero(&to, end - start);
|
||||
|
||||
start = end;
|
||||
end = sve_size_from_header(&header);
|
||||
return membuf_zero(&to, end - start);
|
||||
default:
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
|
||||
static int sve_set(struct task_struct *target,
|
||||
static int sve_get(struct task_struct *target,
|
||||
const struct user_regset *regset,
|
||||
unsigned int pos, unsigned int count,
|
||||
const void *kbuf, const void __user *ubuf)
|
||||
struct membuf to)
|
||||
{
|
||||
if (!system_supports_sve())
|
||||
return -EINVAL;
|
||||
|
||||
return sve_get_common(target, regset, to, ARM64_VEC_SVE);
|
||||
}
|
||||
|
||||
static int sve_set_common(struct task_struct *target,
|
||||
const struct user_regset *regset,
|
||||
unsigned int pos, unsigned int count,
|
||||
const void *kbuf, const void __user *ubuf,
|
||||
enum vec_type type)
|
||||
{
|
||||
int ret;
|
||||
struct user_sve_header header;
|
||||
unsigned int vq;
|
||||
unsigned long start, end;
|
||||
|
||||
if (!system_supports_sve())
|
||||
return -EINVAL;
|
||||
|
||||
/* Header */
|
||||
if (count < sizeof(header))
|
||||
return -EINVAL;
|
||||
@ -813,13 +853,37 @@ static int sve_set(struct task_struct *target,
|
||||
* Apart from SVE_PT_REGS_MASK, all SVE_PT_* flags are consumed by
|
||||
* vec_set_vector_length(), which will also validate them for us:
|
||||
*/
|
||||
ret = vec_set_vector_length(target, ARM64_VEC_SVE, header.vl,
|
||||
ret = vec_set_vector_length(target, type, header.vl,
|
||||
((unsigned long)header.flags & ~SVE_PT_REGS_MASK) << 16);
|
||||
if (ret)
|
||||
goto out;
|
||||
|
||||
/* Actual VL set may be less than the user asked for: */
|
||||
vq = sve_vq_from_vl(task_get_sve_vl(target));
|
||||
vq = sve_vq_from_vl(task_get_vl(target, type));
|
||||
|
||||
/* Enter/exit streaming mode */
|
||||
if (system_supports_sme()) {
|
||||
u64 old_svcr = target->thread.svcr;
|
||||
|
||||
switch (type) {
|
||||
case ARM64_VEC_SVE:
|
||||
target->thread.svcr &= ~SYS_SVCR_EL0_SM_MASK;
|
||||
break;
|
||||
case ARM64_VEC_SME:
|
||||
target->thread.svcr |= SYS_SVCR_EL0_SM_MASK;
|
||||
break;
|
||||
default:
|
||||
WARN_ON_ONCE(1);
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
/*
|
||||
* If we switched then invalidate any existing SVE
|
||||
* state and ensure there's storage.
|
||||
*/
|
||||
if (target->thread.svcr != old_svcr)
|
||||
sve_alloc(target);
|
||||
}
|
||||
|
||||
/* Registers: FPSIMD-only case */
|
||||
|
||||
@ -828,10 +892,15 @@ static int sve_set(struct task_struct *target,
|
||||
ret = __fpr_set(target, regset, pos, count, kbuf, ubuf,
|
||||
SVE_PT_FPSIMD_OFFSET);
|
||||
clear_tsk_thread_flag(target, TIF_SVE);
|
||||
if (type == ARM64_VEC_SME)
|
||||
fpsimd_force_sync_to_sve(target);
|
||||
goto out;
|
||||
}
|
||||
|
||||
/* Otherwise: full SVE case */
|
||||
/*
|
||||
* Otherwise: no registers or full SVE case. For backwards
|
||||
* compatibility reasons we treat empty flags as SVE registers.
|
||||
*/
|
||||
|
||||
/*
|
||||
* If setting a different VL from the requested VL and there is
|
||||
@ -852,8 +921,9 @@ static int sve_set(struct task_struct *target,
|
||||
|
||||
/*
|
||||
* Ensure target->thread.sve_state is up to date with target's
|
||||
* FPSIMD regs, so that a short copyin leaves trailing registers
|
||||
* unmodified.
|
||||
* FPSIMD regs, so that a short copyin leaves trailing
|
||||
* registers unmodified. Always enable SVE even if going into
|
||||
* streaming mode.
|
||||
*/
|
||||
fpsimd_sync_to_sve(target);
|
||||
set_tsk_thread_flag(target, TIF_SVE);
|
||||
@ -889,8 +959,181 @@ out:
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int sve_set(struct task_struct *target,
|
||||
const struct user_regset *regset,
|
||||
unsigned int pos, unsigned int count,
|
||||
const void *kbuf, const void __user *ubuf)
|
||||
{
|
||||
if (!system_supports_sve())
|
||||
return -EINVAL;
|
||||
|
||||
return sve_set_common(target, regset, pos, count, kbuf, ubuf,
|
||||
ARM64_VEC_SVE);
|
||||
}
|
||||
|
||||
#endif /* CONFIG_ARM64_SVE */
|
||||
|
||||
#ifdef CONFIG_ARM64_SME
|
||||
|
||||
static int ssve_get(struct task_struct *target,
|
||||
const struct user_regset *regset,
|
||||
struct membuf to)
|
||||
{
|
||||
if (!system_supports_sme())
|
||||
return -EINVAL;
|
||||
|
||||
return sve_get_common(target, regset, to, ARM64_VEC_SME);
|
||||
}
|
||||
|
||||
static int ssve_set(struct task_struct *target,
|
||||
const struct user_regset *regset,
|
||||
unsigned int pos, unsigned int count,
|
||||
const void *kbuf, const void __user *ubuf)
|
||||
{
|
||||
if (!system_supports_sme())
|
||||
return -EINVAL;
|
||||
|
||||
return sve_set_common(target, regset, pos, count, kbuf, ubuf,
|
||||
ARM64_VEC_SME);
|
||||
}
|
||||
|
||||
static int za_get(struct task_struct *target,
|
||||
const struct user_regset *regset,
|
||||
struct membuf to)
|
||||
{
|
||||
struct user_za_header header;
|
||||
unsigned int vq;
|
||||
unsigned long start, end;
|
||||
|
||||
if (!system_supports_sme())
|
||||
return -EINVAL;
|
||||
|
||||
/* Header */
|
||||
memset(&header, 0, sizeof(header));
|
||||
|
||||
if (test_tsk_thread_flag(target, TIF_SME_VL_INHERIT))
|
||||
header.flags |= ZA_PT_VL_INHERIT;
|
||||
|
||||
header.vl = task_get_sme_vl(target);
|
||||
vq = sve_vq_from_vl(header.vl);
|
||||
header.max_vl = sme_max_vl();
|
||||
header.max_size = ZA_PT_SIZE(vq);
|
||||
|
||||
/* If ZA is not active there is only the header */
|
||||
if (thread_za_enabled(&target->thread))
|
||||
header.size = ZA_PT_SIZE(vq);
|
||||
else
|
||||
header.size = ZA_PT_ZA_OFFSET;
|
||||
|
||||
membuf_write(&to, &header, sizeof(header));
|
||||
|
||||
BUILD_BUG_ON(ZA_PT_ZA_OFFSET != sizeof(header));
|
||||
end = ZA_PT_ZA_OFFSET;
|
||||
|
||||
if (target == current)
|
||||
fpsimd_preserve_current_state();
|
||||
|
||||
/* Any register data to include? */
|
||||
if (thread_za_enabled(&target->thread)) {
|
||||
start = end;
|
||||
end = ZA_PT_SIZE(vq);
|
||||
membuf_write(&to, target->thread.za_state, end - start);
|
||||
}
|
||||
|
||||
/* Zero any trailing padding */
|
||||
start = end;
|
||||
end = ALIGN(header.size, SVE_VQ_BYTES);
|
||||
return membuf_zero(&to, end - start);
|
||||
}
|
||||
|
||||
static int za_set(struct task_struct *target,
|
||||
const struct user_regset *regset,
|
||||
unsigned int pos, unsigned int count,
|
||||
const void *kbuf, const void __user *ubuf)
|
||||
{
|
||||
int ret;
|
||||
struct user_za_header header;
|
||||
unsigned int vq;
|
||||
unsigned long start, end;
|
||||
|
||||
if (!system_supports_sme())
|
||||
return -EINVAL;
|
||||
|
||||
/* Header */
|
||||
if (count < sizeof(header))
|
||||
return -EINVAL;
|
||||
ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &header,
|
||||
0, sizeof(header));
|
||||
if (ret)
|
||||
goto out;
|
||||
|
||||
/*
|
||||
* All current ZA_PT_* flags are consumed by
|
||||
* vec_set_vector_length(), which will also validate them for
|
||||
* us:
|
||||
*/
|
||||
ret = vec_set_vector_length(target, ARM64_VEC_SME, header.vl,
|
||||
((unsigned long)header.flags) << 16);
|
||||
if (ret)
|
||||
goto out;
|
||||
|
||||
/* Actual VL set may be less than the user asked for: */
|
||||
vq = sve_vq_from_vl(task_get_sme_vl(target));
|
||||
|
||||
/* Ensure there is some SVE storage for streaming mode */
|
||||
if (!target->thread.sve_state) {
|
||||
sve_alloc(target);
|
||||
if (!target->thread.sve_state) {
|
||||
clear_thread_flag(TIF_SME);
|
||||
ret = -ENOMEM;
|
||||
goto out;
|
||||
}
|
||||
}
|
||||
|
||||
/* Allocate/reinit ZA storage */
|
||||
sme_alloc(target);
|
||||
if (!target->thread.za_state) {
|
||||
ret = -ENOMEM;
|
||||
clear_tsk_thread_flag(target, TIF_SME);
|
||||
goto out;
|
||||
}
|
||||
|
||||
/* If there is no data then disable ZA */
|
||||
if (!count) {
|
||||
target->thread.svcr &= ~SYS_SVCR_EL0_ZA_MASK;
|
||||
goto out;
|
||||
}
|
||||
|
||||
/*
|
||||
* If setting a different VL from the requested VL and there is
|
||||
* register data, the data layout will be wrong: don't even
|
||||
* try to set the registers in this case.
|
||||
*/
|
||||
if (vq != sve_vq_from_vl(header.vl)) {
|
||||
ret = -EIO;
|
||||
goto out;
|
||||
}
|
||||
|
||||
BUILD_BUG_ON(ZA_PT_ZA_OFFSET != sizeof(header));
|
||||
start = ZA_PT_ZA_OFFSET;
|
||||
end = ZA_PT_SIZE(vq);
|
||||
ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
|
||||
target->thread.za_state,
|
||||
start, end);
|
||||
if (ret)
|
||||
goto out;
|
||||
|
||||
/* Mark ZA as active and let userspace use it */
|
||||
set_tsk_thread_flag(target, TIF_SME);
|
||||
target->thread.svcr |= SYS_SVCR_EL0_ZA_MASK;
|
||||
|
||||
out:
|
||||
fpsimd_flush_task_state(target);
|
||||
return ret;
|
||||
}
|
||||
|
||||
#endif /* CONFIG_ARM64_SME */
|
||||
|
||||
#ifdef CONFIG_ARM64_PTR_AUTH
|
||||
static int pac_mask_get(struct task_struct *target,
|
||||
const struct user_regset *regset,
|
||||
@ -1108,6 +1351,10 @@ enum aarch64_regset {
|
||||
#ifdef CONFIG_ARM64_SVE
|
||||
REGSET_SVE,
|
||||
#endif
|
||||
#ifdef CONFIG_ARM64_SVE
|
||||
REGSET_SSVE,
|
||||
REGSET_ZA,
|
||||
#endif
|
||||
#ifdef CONFIG_ARM64_PTR_AUTH
|
||||
REGSET_PAC_MASK,
|
||||
REGSET_PAC_ENABLED_KEYS,
|
||||
@ -1188,6 +1435,33 @@ static const struct user_regset aarch64_regsets[] = {
|
||||
.set = sve_set,
|
||||
},
|
||||
#endif
|
||||
#ifdef CONFIG_ARM64_SME
	[REGSET_SSVE] = { /* Streaming mode SVE */
		.core_note_type = NT_ARM_SSVE,
		.n = DIV_ROUND_UP(SVE_PT_SIZE(SME_VQ_MAX, SVE_PT_REGS_SVE),
				  SVE_VQ_BYTES),
		.size = SVE_VQ_BYTES,
		.align = SVE_VQ_BYTES,
		.regset_get = ssve_get,
		.set = ssve_set,
	},
	[REGSET_ZA] = { /* SME ZA */
		.core_note_type = NT_ARM_ZA,
		/*
		 * ZA is a single register but it's variably sized and
		 * the ptrace core requires that the size of any data
		 * be an exact multiple of the configured register
		 * size so report as though we had SVE_VQ_BYTES
		 * registers. These values aren't exposed to
		 * userspace.
		 */
		.n = DIV_ROUND_UP(ZA_PT_SIZE(SME_VQ_MAX), SVE_VQ_BYTES),
		.size = SVE_VQ_BYTES,
		.align = SVE_VQ_BYTES,
		.regset_get = za_get,
		.set = za_set,
	},
#endif
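A hedged tracer-side sketch of consuming the new ZA regset (not part of the patch): it assumes struct user_za_header from the uapi <asm/ptrace.h> added by this series, and NT_ARM_ZA from <elf.h> (or <linux/elf.h> on older toolchains).

#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <asm/ptrace.h>		/* struct user_za_header */

static void dump_za(pid_t pid)
{
	struct user_za_header hdr;
	struct iovec iov = { .iov_base = &hdr, .iov_len = sizeof(hdr) };
	void *buf;

	/* First pass: header only, to learn the VL and the live size. */
	if (ptrace(PTRACE_GETREGSET, pid, NT_ARM_ZA, &iov) < 0) {
		perror("PTRACE_GETREGSET(NT_ARM_ZA)");
		return;
	}
	printf("ZA: vl=%u size=%u (%s)\n", hdr.vl, hdr.size,
	       hdr.size > sizeof(hdr) ? "ZA enabled" : "ZA disabled");

	if (hdr.size <= sizeof(hdr))
		return;

	/* Second pass: header plus the ZA array itself. */
	buf = malloc(hdr.size);
	if (!buf)
		return;
	iov.iov_base = buf;
	iov.iov_len = hdr.size;
	if (ptrace(PTRACE_GETREGSET, pid, NT_ARM_ZA, &iov) < 0)
		perror("PTRACE_GETREGSET(NT_ARM_ZA)");
	free(buf);
}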
|
||||
#ifdef CONFIG_ARM64_PTR_AUTH
|
||||
[REGSET_PAC_MASK] = {
|
||||
.core_note_type = NT_ARM_PAC_MASK,
|
||||
|
@ -225,6 +225,8 @@ static void __init request_standard_resources(void)
|
||||
kernel_code.end = __pa_symbol(__init_begin - 1);
|
||||
kernel_data.start = __pa_symbol(_sdata);
|
||||
kernel_data.end = __pa_symbol(_end - 1);
|
||||
insert_resource(&iomem_resource, &kernel_code);
|
||||
insert_resource(&iomem_resource, &kernel_data);
|
||||
|
||||
num_standard_resources = memblock.memory.cnt;
|
||||
res_size = num_standard_resources * sizeof(*standard_resources);
|
||||
@ -246,20 +248,7 @@ static void __init request_standard_resources(void)
|
||||
res->end = __pfn_to_phys(memblock_region_memory_end_pfn(region)) - 1;
|
||||
}
|
||||
|
||||
request_resource(&iomem_resource, res);
|
||||
|
||||
if (kernel_code.start >= res->start &&
|
||||
kernel_code.end <= res->end)
|
||||
request_resource(res, &kernel_code);
|
||||
if (kernel_data.start >= res->start &&
|
||||
kernel_data.end <= res->end)
|
||||
request_resource(res, &kernel_data);
|
||||
#ifdef CONFIG_KEXEC_CORE
|
||||
/* Userspace will find "Crash kernel" region in /proc/iomem. */
|
||||
if (crashk_res.end && crashk_res.start >= res->start &&
|
||||
crashk_res.end <= res->end)
|
||||
request_resource(res, &crashk_res);
|
||||
#endif
|
||||
insert_resource(&iomem_resource, res);
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -56,6 +56,7 @@ struct rt_sigframe_user_layout {
|
||||
unsigned long fpsimd_offset;
|
||||
unsigned long esr_offset;
|
||||
unsigned long sve_offset;
|
||||
unsigned long za_offset;
|
||||
unsigned long extra_offset;
|
||||
unsigned long end_offset;
|
||||
};
|
||||
@ -218,6 +219,7 @@ static int restore_fpsimd_context(struct fpsimd_context __user *ctx)
|
||||
struct user_ctxs {
|
||||
struct fpsimd_context __user *fpsimd;
|
||||
struct sve_context __user *sve;
|
||||
struct za_context __user *za;
|
||||
};
|
||||
|
||||
#ifdef CONFIG_ARM64_SVE
|
||||
@ -226,11 +228,17 @@ static int preserve_sve_context(struct sve_context __user *ctx)
|
||||
{
|
||||
int err = 0;
|
||||
u16 reserved[ARRAY_SIZE(ctx->__reserved)];
|
||||
u16 flags = 0;
|
||||
unsigned int vl = task_get_sve_vl(current);
|
||||
unsigned int vq = 0;
|
||||
|
||||
if (test_thread_flag(TIF_SVE))
|
||||
if (thread_sm_enabled(¤t->thread)) {
|
||||
vl = task_get_sme_vl(current);
|
||||
vq = sve_vq_from_vl(vl);
|
||||
flags |= SVE_SIG_FLAG_SM;
|
||||
} else if (test_thread_flag(TIF_SVE)) {
|
||||
vq = sve_vq_from_vl(vl);
|
||||
}
|
||||
|
||||
memset(reserved, 0, sizeof(reserved));
|
||||
|
||||
@ -238,6 +246,7 @@ static int preserve_sve_context(struct sve_context __user *ctx)
|
||||
__put_user_error(round_up(SVE_SIG_CONTEXT_SIZE(vq), 16),
|
||||
&ctx->head.size, err);
|
||||
__put_user_error(vl, &ctx->vl, err);
|
||||
__put_user_error(flags, &ctx->flags, err);
|
||||
BUILD_BUG_ON(sizeof(ctx->__reserved) != sizeof(reserved));
|
||||
err |= __copy_to_user(&ctx->__reserved, reserved, sizeof(reserved));
|
||||
|
||||
@ -258,18 +267,28 @@ static int preserve_sve_context(struct sve_context __user *ctx)
|
||||
static int restore_sve_fpsimd_context(struct user_ctxs *user)
|
||||
{
|
||||
int err;
|
||||
unsigned int vq;
|
||||
unsigned int vl, vq;
|
||||
struct user_fpsimd_state fpsimd;
|
||||
struct sve_context sve;
|
||||
|
||||
if (__copy_from_user(&sve, user->sve, sizeof(sve)))
|
||||
return -EFAULT;
|
||||
|
||||
if (sve.vl != task_get_sve_vl(current))
|
||||
if (sve.flags & SVE_SIG_FLAG_SM) {
|
||||
if (!system_supports_sme())
|
||||
return -EINVAL;
|
||||
|
||||
vl = task_get_sme_vl(current);
|
||||
} else {
|
||||
vl = task_get_sve_vl(current);
|
||||
}
|
||||
|
||||
if (sve.vl != vl)
|
||||
return -EINVAL;
|
||||
|
||||
if (sve.head.size <= sizeof(*user->sve)) {
|
||||
clear_thread_flag(TIF_SVE);
|
||||
current->thread.svcr &= ~SYS_SVCR_EL0_SM_MASK;
|
||||
goto fpsimd_only;
|
||||
}
|
||||
|
||||
@ -301,7 +320,10 @@ static int restore_sve_fpsimd_context(struct user_ctxs *user)
|
||||
if (err)
|
||||
return -EFAULT;
|
||||
|
||||
set_thread_flag(TIF_SVE);
|
||||
if (sve.flags & SVE_SIG_FLAG_SM)
|
||||
current->thread.svcr |= SYS_SVCR_EL0_SM_MASK;
|
||||
else
|
||||
set_thread_flag(TIF_SVE);
|
||||
|
||||
fpsimd_only:
|
||||
/* copy the FP and status/control registers */
|
||||
@ -326,6 +348,101 @@ extern int restore_sve_fpsimd_context(struct user_ctxs *user);
|
||||
|
||||
#endif /* ! CONFIG_ARM64_SVE */
|
||||
|
||||
#ifdef CONFIG_ARM64_SME
|
||||
|
||||
static int preserve_za_context(struct za_context __user *ctx)
|
||||
{
|
||||
int err = 0;
|
||||
u16 reserved[ARRAY_SIZE(ctx->__reserved)];
|
||||
unsigned int vl = task_get_sme_vl(current);
|
||||
unsigned int vq;
|
||||
|
||||
if (thread_za_enabled(¤t->thread))
|
||||
vq = sve_vq_from_vl(vl);
|
||||
else
|
||||
vq = 0;
|
||||
|
||||
memset(reserved, 0, sizeof(reserved));
|
||||
|
||||
__put_user_error(ZA_MAGIC, &ctx->head.magic, err);
|
||||
__put_user_error(round_up(ZA_SIG_CONTEXT_SIZE(vq), 16),
|
||||
&ctx->head.size, err);
|
||||
__put_user_error(vl, &ctx->vl, err);
|
||||
BUILD_BUG_ON(sizeof(ctx->__reserved) != sizeof(reserved));
|
||||
err |= __copy_to_user(&ctx->__reserved, reserved, sizeof(reserved));
|
||||
|
||||
if (vq) {
|
||||
/*
|
||||
* This assumes that the ZA state has already been saved to
|
||||
* the task struct by calling the function
|
||||
* fpsimd_signal_preserve_current_state().
|
||||
*/
|
||||
err |= __copy_to_user((char __user *)ctx + ZA_SIG_REGS_OFFSET,
|
||||
current->thread.za_state,
|
||||
ZA_SIG_REGS_SIZE(vq));
|
||||
}
|
||||
|
||||
return err ? -EFAULT : 0;
|
||||
}
|
||||
|
||||
static int restore_za_context(struct user_ctxs __user *user)
|
||||
{
|
||||
int err;
|
||||
unsigned int vq;
|
||||
struct za_context za;
|
||||
|
||||
if (__copy_from_user(&za, user->za, sizeof(za)))
|
||||
return -EFAULT;
|
||||
|
||||
if (za.vl != task_get_sme_vl(current))
|
||||
return -EINVAL;
|
||||
|
||||
if (za.head.size <= sizeof(*user->za)) {
|
||||
current->thread.svcr &= ~SYS_SVCR_EL0_ZA_MASK;
|
||||
return 0;
|
||||
}
|
||||
|
||||
vq = sve_vq_from_vl(za.vl);
|
||||
|
||||
if (za.head.size < ZA_SIG_CONTEXT_SIZE(vq))
|
||||
return -EINVAL;
|
||||
|
||||
/*
|
||||
* Careful: we are about to __copy_from_user() directly into
|
||||
* thread.za_state with preemption enabled, so protection is
|
||||
* needed to prevent a racing context switch from writing stale
|
||||
* registers back over the new data.
|
||||
*/
|
||||
|
||||
fpsimd_flush_task_state(current);
|
||||
/* From now, fpsimd_thread_switch() won't touch thread.sve_state */
|
||||
|
||||
sme_alloc(current);
|
||||
if (!current->thread.za_state) {
|
||||
current->thread.svcr &= ~SYS_SVCR_EL0_ZA_MASK;
|
||||
clear_thread_flag(TIF_SME);
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
err = __copy_from_user(current->thread.za_state,
|
||||
(char __user const *)user->za +
|
||||
ZA_SIG_REGS_OFFSET,
|
||||
ZA_SIG_REGS_SIZE(vq));
|
||||
if (err)
|
||||
return -EFAULT;
|
||||
|
||||
set_thread_flag(TIF_SME);
|
||||
current->thread.svcr |= SYS_SVCR_EL0_ZA_MASK;
|
||||
|
||||
return 0;
|
||||
}
|
||||
#else /* ! CONFIG_ARM64_SME */

/* Turn any non-optimised out attempts to use these into a link error: */
extern int preserve_za_context(void __user *ctx);
extern int restore_za_context(struct user_ctxs *user);

#endif /* ! CONFIG_ARM64_SME */
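For completeness, a hedged sketch of how a signal handler can locate the ZA record that preserve_za_context() lays out. This is not part of the patch; it assumes ZA_MAGIC, struct _aarch64_ctx and struct za_context from the updated <asm/sigcontext.h>, and it ignores the EXTRA_MAGIC indirection that parse_user_sigframe() handles for oversized frames.

#include <stddef.h>
#include <ucontext.h>
#include <asm/sigcontext.h>

/* Illustrative helper for a SA_SIGINFO handler, not part of the patch. */
static struct za_context *find_za_context(ucontext_t *uc)
{
	struct _aarch64_ctx *head =
		(struct _aarch64_ctx *)uc->uc_mcontext.__reserved;

	/* Records are terminated by a header with magic == 0. */
	while (head->magic) {
		if (head->magic == ZA_MAGIC)
			return (struct za_context *)head;
		head = (struct _aarch64_ctx *)((char *)head + head->size);
	}

	/* No ZA record: ZA was disabled when the signal was delivered. */
	return NULL;
}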
|
||||
|
||||
static int parse_user_sigframe(struct user_ctxs *user,
|
||||
struct rt_sigframe __user *sf)
|
||||
@ -340,6 +457,7 @@ static int parse_user_sigframe(struct user_ctxs *user,
|
||||
|
||||
user->fpsimd = NULL;
|
||||
user->sve = NULL;
|
||||
user->za = NULL;
|
||||
|
||||
if (!IS_ALIGNED((unsigned long)base, 16))
|
||||
goto invalid;
|
||||
@ -393,7 +511,7 @@ static int parse_user_sigframe(struct user_ctxs *user,
|
||||
break;
|
||||
|
||||
case SVE_MAGIC:
|
||||
if (!system_supports_sve())
|
||||
if (!system_supports_sve() && !system_supports_sme())
|
||||
goto invalid;
|
||||
|
||||
if (user->sve)
|
||||
@ -405,6 +523,19 @@ static int parse_user_sigframe(struct user_ctxs *user,
|
||||
user->sve = (struct sve_context __user *)head;
|
||||
break;
|
||||
|
||||
case ZA_MAGIC:
|
||||
if (!system_supports_sme())
|
||||
goto invalid;
|
||||
|
||||
if (user->za)
|
||||
goto invalid;
|
||||
|
||||
if (size < sizeof(*user->za))
|
||||
goto invalid;
|
||||
|
||||
user->za = (struct za_context __user *)head;
|
||||
break;
|
||||
|
||||
case EXTRA_MAGIC:
|
||||
if (have_extra_context)
|
||||
goto invalid;
|
||||
@ -528,6 +659,9 @@ static int restore_sigframe(struct pt_regs *regs,
|
||||
}
|
||||
}
|
||||
|
||||
if (err == 0 && system_supports_sme() && user.za)
|
||||
err = restore_za_context(&user);
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
@ -594,11 +728,12 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
|
||||
if (system_supports_sve()) {
|
||||
unsigned int vq = 0;
|
||||
|
||||
if (add_all || test_thread_flag(TIF_SVE)) {
|
||||
int vl = sve_max_vl();
|
||||
if (add_all || test_thread_flag(TIF_SVE) ||
|
||||
thread_sm_enabled(¤t->thread)) {
|
||||
int vl = max(sve_max_vl(), sme_max_vl());
|
||||
|
||||
if (!add_all)
|
||||
vl = task_get_sve_vl(current);
|
||||
vl = thread_get_cur_vl(¤t->thread);
|
||||
|
||||
vq = sve_vq_from_vl(vl);
|
||||
}
|
||||
@ -609,6 +744,24 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
|
||||
return err;
|
||||
}
|
||||
|
||||
if (system_supports_sme()) {
|
||||
unsigned int vl;
|
||||
unsigned int vq = 0;
|
||||
|
||||
if (add_all)
|
||||
vl = sme_max_vl();
|
||||
else
|
||||
vl = task_get_sme_vl(current);
|
||||
|
||||
if (thread_za_enabled(¤t->thread))
|
||||
vq = sve_vq_from_vl(vl);
|
||||
|
||||
err = sigframe_alloc(user, &user->za_offset,
|
||||
ZA_SIG_CONTEXT_SIZE(vq));
|
||||
if (err)
|
||||
return err;
|
||||
}
|
||||
|
||||
return sigframe_alloc_end(user);
|
||||
}
|
||||
|
||||
@ -649,13 +802,21 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
|
||||
__put_user_error(current->thread.fault_code, &esr_ctx->esr, err);
|
||||
}
|
||||
|
||||
/* Scalable Vector Extension state, if present */
|
||||
if (system_supports_sve() && err == 0 && user->sve_offset) {
|
||||
/* Scalable Vector Extension state (including streaming), if present */
|
||||
if ((system_supports_sve() || system_supports_sme()) &&
|
||||
err == 0 && user->sve_offset) {
|
||||
struct sve_context __user *sve_ctx =
|
||||
apply_user_offset(user, user->sve_offset);
|
||||
err |= preserve_sve_context(sve_ctx);
|
||||
}
|
||||
|
||||
/* ZA state if present */
|
||||
if (system_supports_sme() && err == 0 && user->za_offset) {
|
||||
struct za_context __user *za_ctx =
|
||||
apply_user_offset(user, user->za_offset);
|
||||
err |= preserve_za_context(za_ctx);
|
||||
}
|
||||
|
||||
if (err == 0 && user->extra_offset) {
|
||||
char __user *sfp = (char __user *)user->sigframe;
|
||||
char __user *userp =
|
||||
@ -759,6 +920,13 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
|
||||
/* TCO (Tag Check Override) always cleared for signal handlers */
|
||||
regs->pstate &= ~PSR_TCO_BIT;
|
||||
|
||||
/* Signal handlers are invoked with ZA and streaming mode disabled */
|
||||
if (system_supports_sme()) {
|
||||
current->thread.svcr &= ~(SYS_SVCR_EL0_ZA_MASK |
|
||||
SYS_SVCR_EL0_SM_MASK);
|
||||
sme_smstop();
|
||||
}
|
||||
|
||||
if (ka->sa.sa_flags & SA_RESTORER)
|
||||
sigtramp = ka->sa.sa_restorer;
|
||||
else
|
||||
|
@ -19,43 +19,60 @@
|
||||
#include <asm/stacktrace.h>
|
||||
|
||||
/*
|
||||
* AArch64 PCS assigns the frame pointer to x29.
|
||||
* A snapshot of a frame record or fp/lr register values, along with some
|
||||
* accounting information necessary for robust unwinding.
|
||||
*
|
||||
* A simple function prologue looks like this:
|
||||
* sub sp, sp, #0x10
|
||||
* stp x29, x30, [sp]
|
||||
* mov x29, sp
|
||||
* @fp: The fp value in the frame record (or the real fp)
|
||||
* @pc: The lr value in the frame record (or the real lr)
|
||||
*
|
||||
* A simple function epilogue looks like this:
|
||||
* mov sp, x29
|
||||
* ldp x29, x30, [sp]
|
||||
* add sp, sp, #0x10
|
||||
* @stacks_done: Stacks which have been entirely unwound, for which it is no
|
||||
* longer valid to unwind to.
|
||||
*
|
||||
* @prev_fp: The fp that pointed to this frame record, or a synthetic value
|
||||
* of 0. This is used to ensure that within a stack, each
|
||||
* subsequent frame record is at an increasing address.
|
||||
* @prev_type: The type of stack this frame record was on, or a synthetic
|
||||
* value of STACK_TYPE_UNKNOWN. This is used to detect a
|
||||
* transition from one stack to another.
|
||||
*
|
||||
* @kr_cur: When KRETPROBES is selected, holds the kretprobe instance
|
||||
* associated with the most recently encountered replacement lr
|
||||
* value.
|
||||
*/
|
||||
|
||||
|
||||
static notrace void start_backtrace(struct stackframe *frame, unsigned long fp,
|
||||
unsigned long pc)
|
||||
{
|
||||
frame->fp = fp;
|
||||
frame->pc = pc;
|
||||
struct unwind_state {
|
||||
unsigned long fp;
|
||||
unsigned long pc;
|
||||
DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
|
||||
unsigned long prev_fp;
|
||||
enum stack_type prev_type;
|
||||
#ifdef CONFIG_KRETPROBES
|
||||
frame->kr_cur = NULL;
|
||||
struct llist_node *kr_cur;
|
||||
#endif
|
||||
};
|
||||
|
||||
static notrace void unwind_init(struct unwind_state *state, unsigned long fp,
|
||||
unsigned long pc)
|
||||
{
|
||||
state->fp = fp;
|
||||
state->pc = pc;
|
||||
#ifdef CONFIG_KRETPROBES
|
||||
state->kr_cur = NULL;
|
||||
#endif
|
||||
|
||||
/*
|
||||
* Prime the first unwind.
|
||||
*
|
||||
* In unwind_frame() we'll check that the FP points to a valid stack,
|
||||
* In unwind_next() we'll check that the FP points to a valid stack,
|
||||
* which can't be STACK_TYPE_UNKNOWN, and the first unwind will be
|
||||
* treated as a transition to whichever stack that happens to be. The
|
||||
* prev_fp value won't be used, but we set it to 0 such that it is
|
||||
* definitely not an accessible stack address.
|
||||
*/
|
||||
bitmap_zero(frame->stacks_done, __NR_STACK_TYPES);
|
||||
frame->prev_fp = 0;
|
||||
frame->prev_type = STACK_TYPE_UNKNOWN;
|
||||
bitmap_zero(state->stacks_done, __NR_STACK_TYPES);
|
||||
state->prev_fp = 0;
|
||||
state->prev_type = STACK_TYPE_UNKNOWN;
|
||||
}
|
||||
NOKPROBE_SYMBOL(start_backtrace);
|
||||
NOKPROBE_SYMBOL(unwind_init);
|
||||
|
||||
/*
|
||||
* Unwind from one frame record (A) to the next frame record (B).
|
||||
@ -64,15 +81,12 @@ NOKPROBE_SYMBOL(start_backtrace);
|
||||
* records (e.g. a cycle), determined based on the location and fp value of A
|
||||
* and the location (but not the fp value) of B.
|
||||
*/
|
||||
static int notrace unwind_frame(struct task_struct *tsk,
|
||||
struct stackframe *frame)
|
||||
static int notrace unwind_next(struct task_struct *tsk,
|
||||
struct unwind_state *state)
|
||||
{
|
||||
unsigned long fp = frame->fp;
|
||||
unsigned long fp = state->fp;
|
||||
struct stack_info info;
|
||||
|
||||
if (!tsk)
|
||||
tsk = current;
|
||||
|
||||
/* Final frame; nothing to unwind */
|
||||
if (fp == (unsigned long)task_pt_regs(tsk)->stackframe)
|
||||
return -ENOENT;
|
||||
@ -83,7 +97,7 @@ static int notrace unwind_frame(struct task_struct *tsk,
|
||||
if (!on_accessible_stack(tsk, fp, 16, &info))
|
||||
return -EINVAL;
|
||||
|
||||
if (test_bit(info.type, frame->stacks_done))
|
||||
if (test_bit(info.type, state->stacks_done))
|
||||
return -EINVAL;
|
||||
|
||||
/*
|
||||
@ -99,27 +113,27 @@ static int notrace unwind_frame(struct task_struct *tsk,
|
||||
* stack to another, it's never valid to unwind back to that first
|
||||
* stack.
|
||||
*/
|
||||
if (info.type == frame->prev_type) {
|
||||
if (fp <= frame->prev_fp)
|
||||
if (info.type == state->prev_type) {
|
||||
if (fp <= state->prev_fp)
|
||||
return -EINVAL;
|
||||
} else {
|
||||
set_bit(frame->prev_type, frame->stacks_done);
|
||||
set_bit(state->prev_type, state->stacks_done);
|
||||
}
|
||||
|
||||
/*
|
||||
* Record this frame record's values and location. The prev_fp and
|
||||
* prev_type are only meaningful to the next unwind_frame() invocation.
|
||||
* prev_type are only meaningful to the next unwind_next() invocation.
|
||||
*/
|
||||
frame->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
|
||||
frame->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
|
||||
frame->prev_fp = fp;
|
||||
frame->prev_type = info.type;
|
||||
state->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
|
||||
state->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
|
||||
state->prev_fp = fp;
|
||||
state->prev_type = info.type;
|
||||
|
||||
frame->pc = ptrauth_strip_insn_pac(frame->pc);
|
||||
state->pc = ptrauth_strip_insn_pac(state->pc);
|
||||
|
||||
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
|
||||
if (tsk->ret_stack &&
|
||||
(frame->pc == (unsigned long)return_to_handler)) {
|
||||
(state->pc == (unsigned long)return_to_handler)) {
|
||||
unsigned long orig_pc;
|
||||
/*
|
||||
* This is a case where function graph tracer has
|
||||
@ -127,37 +141,37 @@ static int notrace unwind_frame(struct task_struct *tsk,
|
||||
* to hook a function return.
|
||||
* So replace it to an original value.
|
||||
*/
|
||||
orig_pc = ftrace_graph_ret_addr(tsk, NULL, frame->pc,
|
||||
(void *)frame->fp);
|
||||
if (WARN_ON_ONCE(frame->pc == orig_pc))
|
||||
orig_pc = ftrace_graph_ret_addr(tsk, NULL, state->pc,
|
||||
(void *)state->fp);
|
||||
if (WARN_ON_ONCE(state->pc == orig_pc))
|
||||
return -EINVAL;
|
||||
frame->pc = orig_pc;
|
||||
state->pc = orig_pc;
|
||||
}
|
||||
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
|
||||
#ifdef CONFIG_KRETPROBES
|
||||
if (is_kretprobe_trampoline(frame->pc))
|
||||
frame->pc = kretprobe_find_ret_addr(tsk, (void *)frame->fp, &frame->kr_cur);
|
||||
if (is_kretprobe_trampoline(state->pc))
|
||||
state->pc = kretprobe_find_ret_addr(tsk, (void *)state->fp, &state->kr_cur);
|
||||
#endif
|
||||
|
||||
return 0;
|
||||
}
|
||||
NOKPROBE_SYMBOL(unwind_frame);
|
||||
NOKPROBE_SYMBOL(unwind_next);
|
||||
|
||||
static void notrace walk_stackframe(struct task_struct *tsk,
|
||||
struct stackframe *frame,
|
||||
bool (*fn)(void *, unsigned long), void *data)
|
||||
static void notrace unwind(struct task_struct *tsk,
|
||||
struct unwind_state *state,
|
||||
stack_trace_consume_fn consume_entry, void *cookie)
|
||||
{
|
||||
while (1) {
|
||||
int ret;
|
||||
|
||||
if (!fn(data, frame->pc))
|
||||
if (!consume_entry(cookie, state->pc))
|
||||
break;
|
||||
ret = unwind_frame(tsk, frame);
|
||||
ret = unwind_next(tsk, state);
|
||||
if (ret < 0)
|
||||
break;
|
||||
}
|
||||
}
|
||||
NOKPROBE_SYMBOL(walk_stackframe);
|
||||
NOKPROBE_SYMBOL(unwind);
|
||||
|
||||
static bool dump_backtrace_entry(void *arg, unsigned long where)
|
||||
{
|
||||
@ -196,17 +210,17 @@ noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
|
||||
void *cookie, struct task_struct *task,
|
||||
struct pt_regs *regs)
|
||||
{
|
||||
struct stackframe frame;
|
||||
struct unwind_state state;
|
||||
|
||||
if (regs)
|
||||
start_backtrace(&frame, regs->regs[29], regs->pc);
|
||||
unwind_init(&state, regs->regs[29], regs->pc);
|
||||
else if (task == current)
|
||||
start_backtrace(&frame,
|
||||
unwind_init(&state,
|
||||
(unsigned long)__builtin_frame_address(1),
|
||||
(unsigned long)__builtin_return_address(0));
|
||||
else
|
||||
start_backtrace(&frame, thread_saved_fp(task),
|
||||
unwind_init(&state, thread_saved_fp(task),
|
||||
thread_saved_pc(task));
|
||||
|
||||
walk_stackframe(task, &frame, consume_entry, cookie);
|
||||
unwind(task, &state, consume_entry, cookie);
|
||||
}
|
||||
|
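For readers following the rename above, the walk itself is just the classic AArch64 frame-record chase. Below is a minimal user-space sketch of the same idea, assuming an AArch64 build with frame pointers kept (e.g. -fno-omit-frame-pointer); the toy_* names are invented here, the termination test is far weaker than the kernel's on_accessible_stack()/stacks_done bookkeeping, and the use of __builtin_frame_address(1) mirrors the arch_stack_walk() hunk above.

#include <stdio.h>
#include <stdint.h>

struct toy_unwind_state {
	uintptr_t fp;	/* address of the current frame record */
	uintptr_t pc;	/* return address recovered so far */
};

static void toy_unwind_init(struct toy_unwind_state *state, uintptr_t fp,
			    uintptr_t pc)
{
	state->fp = fp;
	state->pc = pc;
}

static int toy_unwind_next(struct toy_unwind_state *state)
{
	uintptr_t *record = (uintptr_t *)state->fp;

	/* A zero fp stands in for the kernel's "final frame" check. */
	if (!record)
		return -1;

	/* AAPCS64 frame record: record[0] = previous fp, record[1] = saved lr */
	state->fp = record[0];
	state->pc = record[1];
	return 0;
}

static void __attribute__((noinline)) show_backtrace(void)
{
	struct toy_unwind_state state;

	/* As in arch_stack_walk() for the current task: start from the caller. */
	toy_unwind_init(&state, (uintptr_t)__builtin_frame_address(1),
			(uintptr_t)__builtin_return_address(0));
	do {
		printf("pc=%#lx fp=%#lx\n", (unsigned long)state.pc,
		       (unsigned long)state.fp);
	} while (toy_unwind_next(&state) == 0 && state.fp);
}

int main(void)
{
	show_backtrace();
	return 0;
}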
@ -158,11 +158,36 @@ trace_exit:
|
||||
syscall_trace_exit(regs);
|
||||
}
|
||||
|
||||
static inline void sve_user_discard(void)
|
||||
/*
|
||||
* As per the ABI exit SME streaming mode and clear the SVE state not
|
||||
* shared with FPSIMD on syscall entry.
|
||||
*/
|
||||
static inline void fp_user_discard(void)
|
||||
{
|
||||
/*
|
||||
* If SME is active then exit streaming mode. If ZA is active
|
||||
* then flush the SVE registers but leave userspace access to
|
||||
* both SVE and SME enabled, otherwise disable SME for the
|
||||
* task and fall through to disabling SVE too. This means
|
||||
* that after a syscall we never have any streaming mode
|
||||
* register state to track, if this changes the KVM code will
|
||||
* need updating.
|
||||
*/
|
||||
if (system_supports_sme() && test_thread_flag(TIF_SME)) {
|
||||
u64 svcr = read_sysreg_s(SYS_SVCR_EL0);
|
||||
|
||||
if (svcr & SYS_SVCR_EL0_SM_MASK)
|
||||
sme_smstop_sm();
|
||||
}
|
||||
|
||||
if (!system_supports_sve())
|
||||
return;
|
||||
|
||||
/*
|
||||
* If SME is not active then disable SVE, the registers will
|
||||
* be cleared when userspace next attempts to access them and
|
||||
* we do not need to track the SVE register state until then.
|
||||
*/
|
||||
clear_thread_flag(TIF_SVE);
|
||||
|
||||
/*
|
||||
@ -177,7 +202,7 @@ static inline void sve_user_discard(void)
|
||||
|
||||
void do_el0_svc(struct pt_regs *regs)
|
||||
{
|
||||
sve_user_discard();
|
||||
fp_user_discard();
|
||||
el0_svc_common(regs, regs->regs[8], __NR_syscalls, sys_call_table);
|
||||
}
|
||||
|
||||
|
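The comment in fp_user_discard() packs several cases into one paragraph. The stand-alone model below only mirrors the branch structure visible in this hunk (leave streaming mode, then drop SVE state), using mock booleans in place of the real SVCR sysreg and thread flags; struct task_fp_state and the model_* names are invented for illustration, and the ZA special case described in the comment is not modelled.

#include <stdbool.h>
#include <stdio.h>

struct task_fp_state {
	bool sme_supported, sve_supported;
	bool tif_sme, tif_sve;
	bool svcr_sm;		/* models SVCR.SM (streaming mode active) */
};

static void model_fp_user_discard(struct task_fp_state *t)
{
	/* Exit streaming mode on syscall entry, as the SME ABI requires. */
	if (t->sme_supported && t->tif_sme && t->svcr_sm) {
		t->svcr_sm = false;		/* stands in for sme_smstop_sm() */
		printf("streaming mode exited\n");
	}

	if (!t->sve_supported)
		return;

	/* No SVE state needs tracking until userspace touches it again. */
	t->tif_sve = false;
	printf("SVE state discarded (TIF_SVE cleared)\n");
}

int main(void)
{
	struct task_fp_state t = {
		.sme_supported = true, .sve_supported = true,
		.tif_sme = true, .tif_sve = true, .svcr_sm = true,
	};

	model_fp_user_discard(&t);
	return 0;
}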
@ -821,6 +821,7 @@ static const char *esr_class_str[] = {
|
||||
[ESR_ELx_EC_SVE] = "SVE",
|
||||
[ESR_ELx_EC_ERET] = "ERET/ERETAA/ERETAB",
|
||||
[ESR_ELx_EC_FPAC] = "FPAC",
|
||||
[ESR_ELx_EC_SME] = "SME",
|
||||
[ESR_ELx_EC_IMP_DEF] = "EL3 IMP DEF",
|
||||
[ESR_ELx_EC_IABT_LOW] = "IABT (lower EL)",
|
||||
[ESR_ELx_EC_IABT_CUR] = "IABT (current EL)",
|
||||
|
@ -93,7 +93,6 @@ jiffies = jiffies_64;
|
||||
|
||||
#ifdef CONFIG_HIBERNATION
|
||||
#define HIBERNATE_TEXT \
|
||||
. = ALIGN(SZ_4K); \
|
||||
__hibernate_exit_text_start = .; \
|
||||
*(.hibernate_exit.text) \
|
||||
__hibernate_exit_text_end = .;
|
||||
@ -103,7 +102,6 @@ jiffies = jiffies_64;
|
||||
|
||||
#ifdef CONFIG_KEXEC_CORE
|
||||
#define KEXEC_TEXT \
|
||||
. = ALIGN(SZ_4K); \
|
||||
__relocate_new_kernel_start = .; \
|
||||
*(.kexec_relocate.text) \
|
||||
__relocate_new_kernel_end = .;
|
||||
@ -170,9 +168,6 @@ SECTIONS
|
||||
KPROBES_TEXT
|
||||
HYPERVISOR_TEXT
|
||||
IDMAP_TEXT
|
||||
HIBERNATE_TEXT
|
||||
KEXEC_TEXT
|
||||
TRAMP_TEXT
|
||||
*(.gnu.warning)
|
||||
. = ALIGN(16);
|
||||
*(.got) /* Global offset table */
|
||||
@ -194,6 +189,14 @@ SECTIONS
|
||||
|
||||
HYPERVISOR_DATA_SECTIONS
|
||||
|
||||
/* code sections that are never executed via the kernel mapping */
|
||||
.rodata.text : {
|
||||
TRAMP_TEXT
|
||||
HIBERNATE_TEXT
|
||||
KEXEC_TEXT
|
||||
. = ALIGN(PAGE_SIZE);
|
||||
}
|
||||
|
||||
idmap_pg_dir = .;
|
||||
. += IDMAP_DIR_SIZE;
|
||||
idmap_pg_end = .;
|
||||
@ -337,8 +340,8 @@ ASSERT(__hyp_idmap_text_end - __hyp_idmap_text_start <= PAGE_SIZE,
|
||||
ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
|
||||
"ID map text too big or misaligned")
|
||||
#ifdef CONFIG_HIBERNATION
|
||||
ASSERT(__hibernate_exit_text_end - (__hibernate_exit_text_start & ~(SZ_4K - 1))
|
||||
<= SZ_4K, "Hibernate exit text too big or misaligned")
|
||||
ASSERT(__hibernate_exit_text_end - __hibernate_exit_text_start <= SZ_4K,
|
||||
"Hibernate exit text is bigger than 4 KiB")
|
||||
#endif
|
||||
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
|
||||
ASSERT((__entry_tramp_text_end - __entry_tramp_text_start) <= 3*PAGE_SIZE,
|
||||
@ -362,7 +365,7 @@ ASSERT(swapper_pg_dir - tramp_pg_dir == TRAMP_SWAPPER_OFFSET,
|
||||
|
||||
#ifdef CONFIG_KEXEC_CORE
|
||||
/* kexec relocation code should fit into one KEXEC_CONTROL_PAGE_SIZE */
|
||||
ASSERT(__relocate_new_kernel_end - (__relocate_new_kernel_start & ~(SZ_4K - 1))
|
||||
<= SZ_4K, "kexec relocation code is too big or misaligned")
|
||||
ASSERT(__relocate_new_kernel_end - __relocate_new_kernel_start <= SZ_4K,
|
||||
"kexec relocation code is bigger than 4 KiB")
|
||||
ASSERT(KEXEC_CONTROL_PAGE_SIZE >= SZ_4K, "KEXEC_CONTROL_PAGE_SIZE is broken")
|
||||
#endif
|
||||
|
@ -82,6 +82,26 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
|
||||
|
||||
if (read_sysreg(cpacr_el1) & CPACR_EL1_ZEN_EL0EN)
|
||||
vcpu->arch.flags |= KVM_ARM64_HOST_SVE_ENABLED;
|
||||
|
||||
/*
|
||||
* We don't currently support SME guests but if we leave
|
||||
* things in streaming mode then when the guest starts running
|
||||
* FPSIMD or SVE code it may generate SME traps so as a
|
||||
* special case if we are in streaming mode we force the host
|
||||
* state to be saved now and exit streaming mode so that we
|
||||
* don't have to handle any SME traps for valid guest
|
||||
* operations. Do this for ZA as well for now for simplicity.
|
||||
*/
|
||||
if (system_supports_sme()) {
|
||||
if (read_sysreg(cpacr_el1) & CPACR_EL1_SMEN_EL0EN)
|
||||
vcpu->arch.flags |= KVM_ARM64_HOST_SME_ENABLED;
|
||||
|
||||
if (read_sysreg_s(SYS_SVCR_EL0) &
|
||||
(SYS_SVCR_EL0_SM_MASK | SYS_SVCR_EL0_ZA_MASK)) {
|
||||
vcpu->arch.flags &= ~KVM_ARM64_FP_HOST;
|
||||
fpsimd_save_and_flush_cpu_state();
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
@ -109,9 +129,14 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
|
||||
WARN_ON_ONCE(!irqs_disabled());
|
||||
|
||||
if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
|
||||
/*
|
||||
* Currently we do not support SME guests so SVCR is
|
||||
* always 0 and we just need a variable to point to.
|
||||
*/
|
||||
fpsimd_bind_state_to_cpu(&vcpu->arch.ctxt.fp_regs,
|
||||
vcpu->arch.sve_state,
|
||||
vcpu->arch.sve_max_vl);
|
||||
vcpu->arch.sve_max_vl,
|
||||
NULL, 0, &vcpu->arch.svcr);
|
||||
|
||||
clear_thread_flag(TIF_FOREIGN_FPSTATE);
|
||||
update_thread_flag(TIF_SVE, vcpu_has_sve(vcpu));
|
||||
@ -130,6 +155,22 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
|
||||
|
||||
local_irq_save(flags);
|
||||
|
||||
/*
|
||||
* If we have VHE then the Hyp code will reset CPACR_EL1 to
|
||||
* CPACR_EL1_DEFAULT and we need to reenable SME.
|
||||
*/
|
||||
if (has_vhe() && system_supports_sme()) {
|
||||
/* Also restore EL0 state seen on entry */
|
||||
if (vcpu->arch.flags & KVM_ARM64_HOST_SME_ENABLED)
|
||||
sysreg_clear_set(CPACR_EL1, 0,
|
||||
CPACR_EL1_SMEN_EL0EN |
|
||||
CPACR_EL1_SMEN_EL1EN);
|
||||
else
|
||||
sysreg_clear_set(CPACR_EL1,
|
||||
CPACR_EL1_SMEN_EL0EN,
|
||||
CPACR_EL1_SMEN_EL1EN);
|
||||
}
|
||||
|
||||
if (vcpu->arch.flags & KVM_ARM64_FP_ENABLED) {
|
||||
if (vcpu_has_sve(vcpu)) {
|
||||
__vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);
|
||||
|
@ -47,10 +47,24 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
|
||||
val |= CPTR_EL2_TFP | CPTR_EL2_TZ;
|
||||
__activate_traps_fpsimd32(vcpu);
|
||||
}
|
||||
if (cpus_have_final_cap(ARM64_SME))
|
||||
val |= CPTR_EL2_TSM;
|
||||
|
||||
write_sysreg(val, cptr_el2);
|
||||
write_sysreg(__this_cpu_read(kvm_hyp_vector), vbar_el2);
|
||||
|
||||
if (cpus_have_final_cap(ARM64_SME)) {
|
||||
val = read_sysreg_s(SYS_HFGRTR_EL2);
|
||||
val &= ~(HFGxTR_EL2_nTPIDR2_EL0_MASK |
|
||||
HFGxTR_EL2_nSMPRI_EL1_MASK);
|
||||
write_sysreg_s(val, SYS_HFGRTR_EL2);
|
||||
|
||||
val = read_sysreg_s(SYS_HFGWTR_EL2);
|
||||
val &= ~(HFGxTR_EL2_nTPIDR2_EL0_MASK |
|
||||
HFGxTR_EL2_nSMPRI_EL1_MASK);
|
||||
write_sysreg_s(val, SYS_HFGWTR_EL2);
|
||||
}
|
||||
|
||||
if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
|
||||
struct kvm_cpu_context *ctxt = &vcpu->arch.ctxt;
|
||||
|
||||
@ -94,9 +108,25 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu)
|
||||
|
||||
write_sysreg(this_cpu_ptr(&kvm_init_params)->hcr_el2, hcr_el2);
|
||||
|
||||
if (cpus_have_final_cap(ARM64_SME)) {
|
||||
u64 val;
|
||||
|
||||
val = read_sysreg_s(SYS_HFGRTR_EL2);
|
||||
val |= HFGxTR_EL2_nTPIDR2_EL0_MASK |
|
||||
HFGxTR_EL2_nSMPRI_EL1_MASK;
|
||||
write_sysreg_s(val, SYS_HFGRTR_EL2);
|
||||
|
||||
val = read_sysreg_s(SYS_HFGWTR_EL2);
|
||||
val |= HFGxTR_EL2_nTPIDR2_EL0_MASK |
|
||||
HFGxTR_EL2_nSMPRI_EL1_MASK;
|
||||
write_sysreg_s(val, SYS_HFGWTR_EL2);
|
||||
}
|
||||
|
||||
cptr = CPTR_EL2_DEFAULT;
|
||||
if (vcpu_has_sve(vcpu) && (vcpu->arch.flags & KVM_ARM64_FP_ENABLED))
|
||||
cptr |= CPTR_EL2_TZ;
|
||||
if (cpus_have_final_cap(ARM64_SME))
|
||||
cptr &= ~CPTR_EL2_TSM;
|
||||
|
||||
write_sysreg(cptr, cptr_el2);
|
||||
write_sysreg(__kvm_hyp_host_vector, vbar_el2);
|
||||
|
@ -41,7 +41,8 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
|
||||
|
||||
val = read_sysreg(cpacr_el1);
|
||||
val |= CPACR_EL1_TTA;
|
||||
val &= ~(CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN);
|
||||
val &= ~(CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN |
|
||||
CPACR_EL1_SMEN_EL0EN | CPACR_EL1_SMEN_EL1EN);
|
||||
|
||||
/*
|
||||
* With VHE (HCR.E2H == 1), accesses to CPACR_EL1 are routed to
|
||||
@ -62,6 +63,10 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
|
||||
__activate_traps_fpsimd32(vcpu);
|
||||
}
|
||||
|
||||
if (cpus_have_final_cap(ARM64_SME))
|
||||
write_sysreg(read_sysreg(sctlr_el2) & ~SCTLR_ELx_ENTP2,
|
||||
sctlr_el2);
|
||||
|
||||
write_sysreg(val, cpacr_el1);
|
||||
|
||||
write_sysreg(__this_cpu_read(kvm_hyp_vector), vbar_el1);
|
||||
@ -83,6 +88,10 @@ static void __deactivate_traps(struct kvm_vcpu *vcpu)
|
||||
*/
|
||||
asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT));
|
||||
|
||||
if (cpus_have_final_cap(ARM64_SME))
|
||||
write_sysreg(read_sysreg(sctlr_el2) | SCTLR_ELx_ENTP2,
|
||||
sctlr_el2);
|
||||
|
||||
write_sysreg(CPACR_EL1_DEFAULT, cpacr_el1);
|
||||
|
||||
if (!arm64_kernel_unmapped_at_el0())
|
||||
|
@ -1132,6 +1132,8 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
|
||||
case SYS_ID_AA64PFR1_EL1:
|
||||
if (!kvm_has_mte(vcpu->kvm))
|
||||
val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_MTE);
|
||||
|
||||
val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_SME);
|
||||
break;
|
||||
case SYS_ID_AA64ISAR1_EL1:
|
||||
if (!vcpu_has_ptrauth(vcpu))
|
||||
@ -1553,7 +1555,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
|
||||
ID_UNALLOCATED(4,2),
|
||||
ID_UNALLOCATED(4,3),
|
||||
ID_SANITISED(ID_AA64ZFR0_EL1),
|
||||
ID_UNALLOCATED(4,5),
|
||||
ID_HIDDEN(ID_AA64SMFR0_EL1),
|
||||
ID_UNALLOCATED(4,6),
|
||||
ID_UNALLOCATED(4,7),
|
||||
|
||||
@ -1596,6 +1598,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
|
||||
|
||||
{ SYS_DESC(SYS_ZCR_EL1), NULL, reset_val, ZCR_EL1, 0, .visibility = sve_visibility },
|
||||
{ SYS_DESC(SYS_TRFCR_EL1), undef_access },
|
||||
{ SYS_DESC(SYS_SMPRI_EL1), undef_access },
|
||||
{ SYS_DESC(SYS_SMCR_EL1), undef_access },
|
||||
{ SYS_DESC(SYS_TTBR0_EL1), access_vm_reg, reset_unknown, TTBR0_EL1 },
|
||||
{ SYS_DESC(SYS_TTBR1_EL1), access_vm_reg, reset_unknown, TTBR1_EL1 },
|
||||
{ SYS_DESC(SYS_TCR_EL1), access_vm_reg, reset_val, TCR_EL1, 0 },
|
||||
@ -1678,8 +1682,10 @@ static const struct sys_reg_desc sys_reg_descs[] = {
|
||||
|
||||
{ SYS_DESC(SYS_CCSIDR_EL1), access_ccsidr },
|
||||
{ SYS_DESC(SYS_CLIDR_EL1), access_clidr },
|
||||
{ SYS_DESC(SYS_SMIDR_EL1), undef_access },
|
||||
{ SYS_DESC(SYS_CSSELR_EL1), access_csselr, reset_unknown, CSSELR_EL1 },
|
||||
{ SYS_DESC(SYS_CTR_EL0), access_ctr },
|
||||
{ SYS_DESC(SYS_SVCR_EL0), undef_access },
|
||||
|
||||
{ PMU_SYS_REG(SYS_PMCR_EL0), .access = access_pmcr,
|
||||
.reset = reset_pmcr, .reg = PMCR_EL0 },
|
||||
@ -1719,6 +1725,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
|
||||
|
||||
{ SYS_DESC(SYS_TPIDR_EL0), NULL, reset_unknown, TPIDR_EL0 },
|
||||
{ SYS_DESC(SYS_TPIDRRO_EL0), NULL, reset_unknown, TPIDRRO_EL0 },
|
||||
{ SYS_DESC(SYS_TPIDR2_EL0), undef_access },
|
||||
|
||||
{ SYS_DESC(SYS_SCXTNUM_EL0), undef_access },
|
||||
|
||||
|
@ -93,7 +93,7 @@ SYM_FUNC_START(mte_copy_tags_from_user)
|
||||
mov x3, x1
|
||||
cbz x2, 2f
|
||||
1:
|
||||
user_ldst 2f, ldtrb, w4, x1, 0
|
||||
USER(2f, ldtrb w4, [x1])
|
||||
lsl x4, x4, #MTE_TAG_SHIFT
|
||||
stg x4, [x0], #MTE_GRANULE_SIZE
|
||||
add x1, x1, #1
|
||||
@ -120,7 +120,7 @@ SYM_FUNC_START(mte_copy_tags_to_user)
|
||||
1:
|
||||
ldg x4, [x1]
|
||||
ubfx x4, x4, #MTE_TAG_SHIFT, #MTE_TAG_SIZE
|
||||
user_ldst 2f, sttrb, w4, x0, 0
|
||||
USER(2f, sttrb w4, [x0])
|
||||
add x0, x0, #1
|
||||
add x1, x1, #MTE_GRANULE_SIZE
|
||||
subs x2, x2, #1
|
||||
|
@ -16,8 +16,8 @@
|
||||
|
||||
void copy_highpage(struct page *to, struct page *from)
|
||||
{
|
||||
struct page *kto = page_address(to);
|
||||
struct page *kfrom = page_address(from);
|
||||
void *kto = page_address(to);
|
||||
void *kfrom = page_address(from);
|
||||
|
||||
copy_page(kto, kfrom);
|
||||
|
||||
|
@ -158,6 +158,28 @@ static inline int num_contig_ptes(unsigned long size, size_t *pgsize)
|
||||
return contig_ptes;
|
||||
}
|
||||
|
||||
pte_t huge_ptep_get(pte_t *ptep)
|
||||
{
|
||||
int ncontig, i;
|
||||
size_t pgsize;
|
||||
pte_t orig_pte = ptep_get(ptep);
|
||||
|
||||
if (!pte_present(orig_pte) || !pte_cont(orig_pte))
|
||||
return orig_pte;
|
||||
|
||||
ncontig = num_contig_ptes(page_size(pte_page(orig_pte)), &pgsize);
|
||||
for (i = 0; i < ncontig; i++, ptep++) {
|
||||
pte_t pte = ptep_get(ptep);
|
||||
|
||||
if (pte_dirty(pte))
|
||||
orig_pte = pte_mkdirty(orig_pte);
|
||||
|
||||
if (pte_young(pte))
|
||||
orig_pte = pte_mkyoung(orig_pte);
|
||||
}
|
||||
return orig_pte;
|
||||
}
|
||||
|
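The loop in huge_ptep_get() above folds the per-PTE dirty/young bits of a contiguous mapping into the value returned for the whole huge page. Below is a toy user-space model of that aggregation, with an invented toy_pte_t and flag macros standing in for the real pte helpers.

#include <stdio.h>

#define TOY_PTE_DIRTY	(1u << 0)
#define TOY_PTE_YOUNG	(1u << 1)

typedef unsigned int toy_pte_t;

static toy_pte_t toy_huge_ptep_get(const toy_pte_t *ptep, int ncontig)
{
	toy_pte_t orig = ptep[0];

	/* Any dirty/young sub-PTE makes the whole huge mapping dirty/young. */
	for (int i = 0; i < ncontig; i++) {
		if (ptep[i] & TOY_PTE_DIRTY)
			orig |= TOY_PTE_DIRTY;
		if (ptep[i] & TOY_PTE_YOUNG)
			orig |= TOY_PTE_YOUNG;
	}
	return orig;
}

int main(void)
{
	toy_pte_t contig[4] = { 0, 0, TOY_PTE_DIRTY, TOY_PTE_YOUNG };

	printf("aggregated flags: %#x\n", toy_huge_ptep_get(contig, 4));
	return 0;
}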
||||
/*
|
||||
* Changing some bits of contiguous entries requires us to follow a
|
||||
* Break-Before-Make approach, breaking the whole contiguous set
|
||||
@ -166,15 +188,14 @@ static inline int num_contig_ptes(unsigned long size, size_t *pgsize)
|
||||
*
|
||||
* This helper performs the break step.
|
||||
*/
|
||||
static pte_t get_clear_flush(struct mm_struct *mm,
|
||||
static pte_t get_clear_contig(struct mm_struct *mm,
|
||||
unsigned long addr,
|
||||
pte_t *ptep,
|
||||
unsigned long pgsize,
|
||||
unsigned long ncontig)
|
||||
{
|
||||
pte_t orig_pte = huge_ptep_get(ptep);
|
||||
bool valid = pte_valid(orig_pte);
|
||||
unsigned long i, saddr = addr;
|
||||
pte_t orig_pte = ptep_get(ptep);
|
||||
unsigned long i;
|
||||
|
||||
for (i = 0; i < ncontig; i++, addr += pgsize, ptep++) {
|
||||
pte_t pte = ptep_get_and_clear(mm, addr, ptep);
|
||||
@ -190,11 +211,6 @@ static pte_t get_clear_flush(struct mm_struct *mm,
|
||||
if (pte_young(pte))
|
||||
orig_pte = pte_mkyoung(orig_pte);
|
||||
}
|
||||
|
||||
if (valid) {
|
||||
struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);
|
||||
flush_tlb_range(&vma, saddr, addr);
|
||||
}
|
||||
return orig_pte;
|
||||
}
|
||||
|
||||
@ -385,14 +401,14 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
|
||||
{
|
||||
int ncontig;
|
||||
size_t pgsize;
|
||||
pte_t orig_pte = huge_ptep_get(ptep);
|
||||
pte_t orig_pte = ptep_get(ptep);
|
||||
|
||||
if (!pte_cont(orig_pte))
|
||||
return ptep_get_and_clear(mm, addr, ptep);
|
||||
|
||||
ncontig = find_num_contig(mm, addr, ptep, &pgsize);
|
||||
|
||||
return get_clear_flush(mm, addr, ptep, pgsize, ncontig);
|
||||
return get_clear_contig(mm, addr, ptep, pgsize, ncontig);
|
||||
}
|
||||
|
||||
/*
|
||||
@ -408,11 +424,11 @@ static int __cont_access_flags_changed(pte_t *ptep, pte_t pte, int ncontig)
|
||||
{
|
||||
int i;
|
||||
|
||||
if (pte_write(pte) != pte_write(huge_ptep_get(ptep)))
|
||||
if (pte_write(pte) != pte_write(ptep_get(ptep)))
|
||||
return 1;
|
||||
|
||||
for (i = 0; i < ncontig; i++) {
|
||||
pte_t orig_pte = huge_ptep_get(ptep + i);
|
||||
pte_t orig_pte = ptep_get(ptep + i);
|
||||
|
||||
if (pte_dirty(pte) != pte_dirty(orig_pte))
|
||||
return 1;
|
||||
@ -443,7 +459,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
|
||||
if (!__cont_access_flags_changed(ptep, pte, ncontig))
|
||||
return 0;
|
||||
|
||||
orig_pte = get_clear_flush(vma->vm_mm, addr, ptep, pgsize, ncontig);
|
||||
orig_pte = get_clear_contig(vma->vm_mm, addr, ptep, pgsize, ncontig);
|
||||
|
||||
/* Make sure we don't lose the dirty or young state */
|
||||
if (pte_dirty(orig_pte))
|
||||
@ -476,7 +492,7 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm,
|
||||
ncontig = find_num_contig(mm, addr, ptep, &pgsize);
|
||||
dpfn = pgsize >> PAGE_SHIFT;
|
||||
|
||||
pte = get_clear_flush(mm, addr, ptep, pgsize, ncontig);
|
||||
pte = get_clear_contig(mm, addr, ptep, pgsize, ncontig);
|
||||
pte = pte_wrprotect(pte);
|
||||
|
||||
hugeprot = pte_pgprot(pte);
|
||||
|
@ -90,6 +90,32 @@ phys_addr_t __ro_after_init arm64_dma_phys_limit;
|
||||
phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
|
||||
#endif
|
||||
|
||||
/* Current arm64 boot protocol requires 2MB alignment */
|
||||
#define CRASH_ALIGN SZ_2M
|
||||
|
||||
#define CRASH_ADDR_LOW_MAX arm64_dma_phys_limit
|
||||
#define CRASH_ADDR_HIGH_MAX (PHYS_MASK + 1)
|
||||
|
||||
static int __init reserve_crashkernel_low(unsigned long long low_size)
|
||||
{
|
||||
unsigned long long low_base;
|
||||
|
||||
low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
|
||||
if (!low_base) {
|
||||
pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
pr_info("crashkernel low memory reserved: 0x%08llx - 0x%08llx (%lld MB)\n",
|
||||
low_base, low_base + low_size, low_size >> 20);
|
||||
|
||||
crashk_low_res.start = low_base;
|
||||
crashk_low_res.end = low_base + low_size - 1;
|
||||
insert_resource(&iomem_resource, &crashk_low_res);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* reserve_crashkernel() - reserves memory for crash kernel
|
||||
*
|
||||
@ -100,17 +126,35 @@ phys_addr_t __ro_after_init arm64_dma_phys_limit = PHYS_MASK + 1;
|
||||
static void __init reserve_crashkernel(void)
|
||||
{
|
||||
unsigned long long crash_base, crash_size;
|
||||
unsigned long long crash_max = arm64_dma_phys_limit;
|
||||
unsigned long long crash_low_size = 0;
|
||||
unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
|
||||
char *cmdline = boot_command_line;
|
||||
int ret;
|
||||
|
||||
if (!IS_ENABLED(CONFIG_KEXEC_CORE))
|
||||
return;
|
||||
|
||||
ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
|
||||
/* crashkernel=X[@offset] */
|
||||
ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
|
||||
&crash_size, &crash_base);
|
||||
/* no crashkernel= or invalid value specified */
|
||||
if (ret || !crash_size)
|
||||
if (ret == -ENOENT) {
|
||||
ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
|
||||
if (ret || !crash_size)
|
||||
return;
|
||||
|
||||
/*
|
||||
* crashkernel=Y,low can be specified or not, but invalid value
|
||||
* is not allowed.
|
||||
*/
|
||||
ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
|
||||
if (ret && (ret != -ENOENT))
|
||||
return;
|
||||
|
||||
crash_max = CRASH_ADDR_HIGH_MAX;
|
||||
} else if (ret || !crash_size) {
|
||||
/* The specified value is invalid */
|
||||
return;
|
||||
}
|
||||
|
||||
crash_size = PAGE_ALIGN(crash_size);
|
||||
|
||||
@ -118,8 +162,7 @@ static void __init reserve_crashkernel(void)
|
||||
if (crash_base)
|
||||
crash_max = crash_base + crash_size;
|
||||
|
||||
/* Current arm64 boot protocol requires 2MB alignment */
|
||||
crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
|
||||
crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
|
||||
crash_base, crash_max);
|
||||
if (!crash_base) {
|
||||
pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
|
||||
@ -127,6 +170,12 @@ static void __init reserve_crashkernel(void)
|
||||
return;
|
||||
}
|
||||
|
||||
if ((crash_base >= CRASH_ADDR_LOW_MAX) &&
|
||||
crash_low_size && reserve_crashkernel_low(crash_low_size)) {
|
||||
memblock_phys_free(crash_base, crash_size);
|
||||
return;
|
||||
}
|
||||
|
||||
pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
|
||||
crash_base, crash_base + crash_size, crash_size >> 20);
|
||||
|
||||
@ -135,8 +184,12 @@ static void __init reserve_crashkernel(void)
|
||||
* map. Inform kmemleak so that it won't try to access it.
|
||||
*/
|
||||
kmemleak_ignore_phys(crash_base);
|
||||
if (crashk_low_res.end)
|
||||
kmemleak_ignore_phys(crashk_low_res.start);
|
||||
|
||||
crashk_res.start = crash_base;
|
||||
crashk_res.end = crash_base + crash_size - 1;
|
||||
insert_resource(&iomem_resource, &crashk_res);
|
||||
}
|
||||
|
||||
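Putting the parsing above together, both reservation styles can be requested from the kernel command line; the sizes below are illustrative only. The plain form keeps the whole reservation below arm64_dma_phys_limit, while the ,high form lets the bulk sit above it, optionally paired with an explicit ,low chunk for devices that need DMA-able memory in the crash kernel:

crashkernel=256M
crashkernel=2G,high crashkernel=256M,low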
/*
|
||||
@ -157,7 +210,7 @@ static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
|
||||
return min(zone_mask, memblock_end_of_DRAM() - 1) + 1;
|
||||
}
|
||||
|
||||
static void __init zone_sizes_init(unsigned long min, unsigned long max)
|
||||
static void __init zone_sizes_init(void)
|
||||
{
|
||||
unsigned long max_zone_pfns[MAX_NR_ZONES] = {0};
|
||||
unsigned int __maybe_unused acpi_zone_dma_bits;
|
||||
@ -176,7 +229,7 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
|
||||
if (!arm64_dma_phys_limit)
|
||||
arm64_dma_phys_limit = dma32_phys_limit;
|
||||
#endif
|
||||
max_zone_pfns[ZONE_NORMAL] = max;
|
||||
max_zone_pfns[ZONE_NORMAL] = max_pfn;
|
||||
|
||||
free_area_init(max_zone_pfns);
|
||||
}
|
||||
@ -374,7 +427,7 @@ void __init bootmem_init(void)
|
||||
* done after the fixed reservations
|
||||
*/
|
||||
sparse_init();
|
||||
zone_sizes_init(min, max);
|
||||
zone_sizes_init();
|
||||
|
||||
/*
|
||||
* Reserve the CMA area after arm64_dma_phys_limit was initialised.
|
||||
|
@ -238,7 +238,7 @@ int trans_pgd_idmap_page(struct trans_pgd_info *info, phys_addr_t *trans_ttbr0,
|
||||
int this_level, index, level_lsb, level_msb;
|
||||
|
||||
dst_addr &= PAGE_MASK;
|
||||
prev_level_entry = pte_val(pfn_pte(pfn, PAGE_KERNEL_EXEC));
|
||||
prev_level_entry = pte_val(pfn_pte(pfn, PAGE_KERNEL_ROX));
|
||||
|
||||
for (this_level = 3; this_level >= 0; this_level--) {
|
||||
levels[this_level] = trans_alloc(info);
|
||||
|
@ -43,6 +43,8 @@ KVM_PROTECTED_MODE
|
||||
MISMATCHED_CACHE_TYPE
|
||||
MTE
|
||||
MTE_ASYMM
|
||||
SME
|
||||
SME_FA64
|
||||
SPECTRE_V2
|
||||
SPECTRE_V3A
|
||||
SPECTRE_V4
|
||||
|
@ -579,9 +579,7 @@ void arch_ftrace_trampoline_free(struct ftrace_ops *ops)
|
||||
|
||||
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
|
||||
|
||||
#ifdef CONFIG_DYNAMIC_FTRACE
|
||||
|
||||
#ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
|
||||
#if defined(CONFIG_DYNAMIC_FTRACE) && !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS)
|
||||
extern void ftrace_graph_call(void);
|
||||
static const char *ftrace_jmp_replace(unsigned long ip, unsigned long addr)
|
||||
{
|
||||
@ -610,18 +608,7 @@ int ftrace_disable_ftrace_graph_caller(void)
|
||||
|
||||
return ftrace_mod_jmp(ip, &ftrace_stub);
|
||||
}
|
||||
#else /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS */
|
||||
int ftrace_enable_ftrace_graph_caller(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
int ftrace_disable_ftrace_graph_caller(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
#endif /* CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS */
|
||||
#endif /* !CONFIG_DYNAMIC_FTRACE */
|
||||
#endif /* CONFIG_DYNAMIC_FTRACE && !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS */
|
||||
|
||||
/*
|
||||
* Hook the return address and push it in the stack of return addrs
|
||||
|
@ -973,16 +973,24 @@ static void __init early_init_dt_check_for_elfcorehdr(unsigned long node)
|
||||
|
||||
static unsigned long chosen_node_offset = -FDT_ERR_NOTFOUND;
|
||||
|
||||
/*
|
||||
* The main usage of linux,usable-memory-range is for crash dump kernel.
|
||||
* Originally, the number of usable-memory regions is one. Now there may
|
||||
* be two regions, low region and high region.
|
||||
* To make compatibility with existing user-space and older kdump, the low
|
||||
* region is always the last range of linux,usable-memory-range if exist.
|
||||
*/
|
||||
#define MAX_USABLE_RANGES 2
|
||||
|
||||
/**
|
||||
* early_init_dt_check_for_usable_mem_range - Decode usable memory range
|
||||
* location from flat tree
|
||||
*/
|
||||
void __init early_init_dt_check_for_usable_mem_range(void)
|
||||
{
|
||||
const __be32 *prop;
|
||||
int len;
|
||||
phys_addr_t cap_mem_addr;
|
||||
phys_addr_t cap_mem_size;
|
||||
struct memblock_region rgn[MAX_USABLE_RANGES] = {0};
|
||||
const __be32 *prop, *endp;
|
||||
int len, i;
|
||||
unsigned long node = chosen_node_offset;
|
||||
|
||||
if ((long)node < 0)
|
||||
@ -991,16 +999,21 @@ void __init early_init_dt_check_for_usable_mem_range(void)
|
||||
pr_debug("Looking for usable-memory-range property... ");
|
||||
|
||||
prop = of_get_flat_dt_prop(node, "linux,usable-memory-range", &len);
|
||||
if (!prop || (len < (dt_root_addr_cells + dt_root_size_cells)))
|
||||
if (!prop || (len % (dt_root_addr_cells + dt_root_size_cells)))
|
||||
return;
|
||||
|
||||
cap_mem_addr = dt_mem_next_cell(dt_root_addr_cells, &prop);
|
||||
cap_mem_size = dt_mem_next_cell(dt_root_size_cells, &prop);
|
||||
endp = prop + (len / sizeof(__be32));
|
||||
for (i = 0; i < MAX_USABLE_RANGES && prop < endp; i++) {
|
||||
rgn[i].base = dt_mem_next_cell(dt_root_addr_cells, &prop);
|
||||
rgn[i].size = dt_mem_next_cell(dt_root_size_cells, &prop);
|
||||
|
||||
pr_debug("cap_mem_start=%pa cap_mem_size=%pa\n", &cap_mem_addr,
|
||||
&cap_mem_size);
|
||||
pr_debug("cap_mem_regions[%d]: base=%pa, size=%pa\n",
|
||||
i, &rgn[i].base, &rgn[i].size);
|
||||
}
|
||||
|
||||
memblock_cap_memory_range(cap_mem_addr, cap_mem_size);
|
||||
memblock_cap_memory_range(rgn[0].base, rgn[0].size);
|
||||
for (i = 1; i < MAX_USABLE_RANGES && rgn[i].size; i++)
|
||||
memblock_add(rgn[i].base, rgn[i].size);
|
||||
}
|
||||
|
||||
#ifdef CONFIG_SERIAL_EARLYCON
|
||||
|
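The rewritten loop treats linux,usable-memory-range as up to two (address, size) pairs of big-endian cells, with the low region last for compatibility. A small user-space sketch of that decoding follows, assuming 2-cell (64-bit) addresses and sizes and made-up example values; next_cell64() plays the role of dt_mem_next_cell().

#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>

struct toy_region { uint64_t base, size; };

static uint64_t next_cell64(const uint32_t **prop)
{
	uint64_t hi = ntohl((*prop)[0]);
	uint64_t lo = ntohl((*prop)[1]);

	*prop += 2;
	return (hi << 32) | lo;
}

int main(void)
{
	/* Two ranges: the high region first, the low region last. */
	uint32_t prop[] = {
		htonl(0x2), htonl(0x0), htonl(0x0), htonl(0x80000000),		/* 2G at 0x2_0000_0000 */
		htonl(0x0), htonl(0x90000000), htonl(0x0), htonl(0x10000000),	/* 256M at 0x9000_0000 */
	};
	const uint32_t *p = prop;
	struct toy_region rgn[2];

	for (int i = 0; i < 2; i++) {
		rgn[i].base = next_cell64(&p);
		rgn[i].size = next_cell64(&p);
		printf("region %d: base=%#llx size=%#llx\n", i,
		       (unsigned long long)rgn[i].base,
		       (unsigned long long)rgn[i].size);
	}
	return 0;
}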
@ -386,6 +386,15 @@ void *of_kexec_alloc_and_setup_fdt(const struct kimage *image,
|
||||
crashk_res.end - crashk_res.start + 1);
|
||||
if (ret)
|
||||
goto out;
|
||||
|
||||
if (crashk_low_res.end) {
|
||||
ret = fdt_appendprop_addrrange(fdt, 0, chosen_node,
|
||||
"linux,usable-memory-range",
|
||||
crashk_low_res.start,
|
||||
crashk_low_res.end - crashk_low_res.start + 1);
|
||||
if (ret)
|
||||
goto out;
|
||||
}
|
||||
}
|
||||
|
||||
/* add bootargs */
|
||||
|
@ -2565,7 +2565,12 @@ static noinline int search_ioctl(struct inode *inode,
|
||||
|
||||
while (1) {
|
||||
ret = -EFAULT;
|
||||
if (fault_in_writeable(ubuf + sk_offset, *buf_size - sk_offset))
|
||||
/*
|
||||
* Ensure that the whole user buffer is faulted in at sub-page
|
||||
* granularity, otherwise the loop may live-lock.
|
||||
*/
|
||||
if (fault_in_subpage_writeable(ubuf + sk_offset,
|
||||
*buf_size - sk_offset))
|
||||
break;
|
||||
|
||||
ret = btrfs_search_forward(root, &key, path, sk->min_transid);
|
||||
|
@ -1046,6 +1046,7 @@ void folio_add_wait_queue(struct folio *folio, wait_queue_entry_t *waiter);
|
||||
* Fault in userspace address range.
|
||||
*/
|
||||
size_t fault_in_writeable(char __user *uaddr, size_t size);
|
||||
size_t fault_in_subpage_writeable(char __user *uaddr, size_t size);
|
||||
size_t fault_in_safe_writeable(const char __user *uaddr, size_t size);
|
||||
size_t fault_in_readable(const char __user *uaddr, size_t size);
|
||||
|
||||
|
@ -231,6 +231,28 @@ static inline bool pagefault_disabled(void)
|
||||
*/
|
||||
#define faulthandler_disabled() (pagefault_disabled() || in_atomic())
|
||||
|
||||
#ifndef CONFIG_ARCH_HAS_SUBPAGE_FAULTS
|
||||
|
||||
/**
|
||||
* probe_subpage_writeable: probe the user range for write faults at sub-page
|
||||
* granularity (e.g. arm64 MTE)
|
||||
* @uaddr: start of address range
|
||||
* @size: size of address range
|
||||
*
|
||||
* Returns 0 on success, the number of bytes not probed on fault.
|
||||
*
|
||||
* It is expected that the caller checked for the write permission of each
|
||||
* page in the range either by put_user() or GUP. The architecture port can
|
||||
* implement a more efficient get_user() probing if the same sub-page faults
|
||||
* are triggered by either a read or a write.
|
||||
*/
|
||||
static inline size_t probe_subpage_writeable(char __user *uaddr, size_t size)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
#endif /* CONFIG_ARCH_HAS_SUBPAGE_FAULTS */
|
||||
|
||||
#ifndef ARCH_HAS_NOCACHE_UACCESS
|
||||
|
||||
static inline __must_check unsigned long
|
||||
|
@ -431,6 +431,8 @@ typedef struct elf64_shdr {
|
||||
#define NT_ARM_PACG_KEYS 0x408 /* ARM pointer authentication generic key */
|
||||
#define NT_ARM_TAGGED_ADDR_CTRL 0x409 /* arm64 tagged address control (prctl()) */
|
||||
#define NT_ARM_PAC_ENABLED_KEYS 0x40a /* arm64 ptr auth enabled keys (prctl()) */
|
||||
#define NT_ARM_SSVE 0x40b /* ARM Streaming SVE registers */
|
||||
#define NT_ARM_ZA 0x40c /* ARM SME ZA registers */
|
||||
#define NT_ARC_V2 0x600 /* ARCv2 accumulator/extra registers */
|
||||
#define NT_VMCOREDD 0x700 /* Vmcore Device Dump Note */
|
||||
#define NT_MIPS_DSP 0x800 /* MIPS DSP ASE registers */
|
||||
|
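The two new note types can be read through the generic PTRACE_GETREGSET interface. Below is a rough sketch, assuming an SME-capable kernel and a tracee that the caller has already attached to and stopped; dump_regset() is invented here, buffer sizing and error handling are simplified, and the fallback #defines reuse the values from the hunk above.

#include <stdio.h>
#include <stdlib.h>
#include <elf.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>

#ifndef NT_ARM_SSVE
#define NT_ARM_SSVE	0x40b
#endif
#ifndef NT_ARM_ZA
#define NT_ARM_ZA	0x40c
#endif

static void dump_regset(pid_t pid, int type, const char *name)
{
	static char buf[128 * 1024];	/* generous upper bound for ZA */
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };

	if (ptrace(PTRACE_GETREGSET, pid, (void *)(long)type, &iov) == 0)
		printf("%s: %zu bytes\n", name, iov.iov_len);
	else
		perror(name);
}

int main(int argc, char **argv)
{
	pid_t pid = argc > 1 ? (pid_t)atoi(argv[1]) : 0;

	dump_regset(pid, NT_ARM_SSVE, "NT_ARM_SSVE");
	dump_regset(pid, NT_ARM_ZA, "NT_ARM_ZA");
	return 0;
}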
@ -272,6 +272,15 @@ struct prctl_mm_map {
|
||||
# define PR_SCHED_CORE_SCOPE_THREAD_GROUP 1
|
||||
# define PR_SCHED_CORE_SCOPE_PROCESS_GROUP 2
|
||||
|
||||
/* arm64 Scalable Matrix Extension controls */
|
||||
/* Flag values must be in sync with SVE versions */
|
||||
#define PR_SME_SET_VL 63 /* set task vector length */
|
||||
# define PR_SME_SET_VL_ONEXEC (1 << 18) /* defer effect until exec */
|
||||
#define PR_SME_GET_VL 64 /* get task vector length */
|
||||
/* Bits common to PR_SME_SET_VL and PR_SME_GET_VL */
|
||||
# define PR_SME_VL_LEN_MASK 0xffff
|
||||
# define PR_SME_VL_INHERIT (1 << 17) /* inherit across exec */
|
||||
|
||||
#define PR_SET_VMA 0x53564d41
|
||||
# define PR_SET_VMA_ANON_NAME 0
|
||||
|
||||
|
@ -243,9 +243,8 @@ static int __init __parse_crashkernel(char *cmdline,
|
||||
*crash_base = 0;
|
||||
|
||||
ck_cmdline = get_last_crashkernel(cmdline, name, suffix);
|
||||
|
||||
if (!ck_cmdline)
|
||||
return -EINVAL;
|
||||
return -ENOENT;
|
||||
|
||||
ck_cmdline += strlen(name);
|
||||
|
||||
|
kernel/sys.c
@ -117,6 +117,12 @@
|
||||
#ifndef SVE_GET_VL
|
||||
# define SVE_GET_VL() (-EINVAL)
|
||||
#endif
|
||||
#ifndef SME_SET_VL
|
||||
# define SME_SET_VL(a) (-EINVAL)
|
||||
#endif
|
||||
#ifndef SME_GET_VL
|
||||
# define SME_GET_VL() (-EINVAL)
|
||||
#endif
|
||||
#ifndef PAC_RESET_KEYS
|
||||
# define PAC_RESET_KEYS(a, b) (-EINVAL)
|
||||
#endif
|
||||
@ -2541,6 +2547,12 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
|
||||
case PR_SVE_GET_VL:
|
||||
error = SVE_GET_VL();
|
||||
break;
|
||||
case PR_SME_SET_VL:
|
||||
error = SME_SET_VL(arg2);
|
||||
break;
|
||||
case PR_SME_GET_VL:
|
||||
error = SME_GET_VL();
|
||||
break;
|
||||
case PR_GET_SPECULATION_CTRL:
|
||||
if (arg3 || arg4 || arg5)
|
||||
return -EINVAL;
|
||||
|
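A minimal user-space check of the new prctls wired up above, assuming an SME-capable kernel. The constants are defined locally with the values from the prctl.h hunk in case installed headers predate them, and the return-value convention (vector length in the low bits plus flag bits) is assumed to mirror the existing SVE prctls, as the "must be in sync with SVE versions" comment suggests.

#include <stdio.h>
#include <sys/prctl.h>

#ifndef PR_SME_SET_VL
#define PR_SME_SET_VL		63
#define PR_SME_GET_VL		64
#define PR_SME_VL_LEN_MASK	0xffff
#define PR_SME_VL_INHERIT	(1 << 17)
#endif

int main(void)
{
	int ret;

	/* Ask for a 32-byte (256-bit) streaming vector length. */
	ret = prctl(PR_SME_SET_VL, 32);
	if (ret < 0) {
		perror("PR_SME_SET_VL");	/* e.g. EINVAL without SME */
		return 1;
	}

	ret = prctl(PR_SME_GET_VL);
	printf("SME VL: %d bytes%s\n", ret & PR_SME_VL_LEN_MASK,
	       (ret & PR_SME_VL_INHERIT) ? " (inherit across exec)" : "");
	return 0;
}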
@ -30,6 +30,24 @@ int ftrace_graph_active;
|
||||
/* Both enabled by default (can be cleared by function_graph tracer flags */
|
||||
static bool fgraph_sleep_time = true;
|
||||
|
||||
/*
|
||||
* archs can override this function if they must do something
|
||||
* to enable hook for graph tracer.
|
||||
*/
|
||||
int __weak ftrace_enable_ftrace_graph_caller(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* archs can override this function if they must do something
|
||||
* to disable hook for graph tracer.
|
||||
*/
|
||||
int __weak ftrace_disable_ftrace_graph_caller(void)
|
||||
{
|
||||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* ftrace_graph_stop - set to permanently disable function graph tracing
|
||||
*
|
||||
|
mm/gup.c
@ -1648,6 +1648,35 @@ out:
|
||||
}
|
||||
EXPORT_SYMBOL(fault_in_writeable);
|
||||
|
||||
/**
|
||||
* fault_in_subpage_writeable - fault in an address range for writing
|
||||
* @uaddr: start of address range
|
||||
* @size: size of address range
|
||||
*
|
||||
* Fault in a user address range for writing while checking for permissions at
|
||||
* sub-page granularity (e.g. arm64 MTE). This function should be used when
|
||||
* the caller cannot guarantee forward progress of a copy_to_user() loop.
|
||||
*
|
||||
* Returns the number of bytes not faulted in (like copy_to_user() and
|
||||
* copy_from_user()).
|
||||
*/
|
||||
size_t fault_in_subpage_writeable(char __user *uaddr, size_t size)
|
||||
{
|
||||
size_t faulted_in;
|
||||
|
||||
/*
|
||||
* Attempt faulting in at page granularity first for page table
|
||||
* permission checking. The arch-specific probe_subpage_writeable()
|
||||
* functions may not check for this.
|
||||
*/
|
||||
faulted_in = size - fault_in_writeable(uaddr, size);
|
||||
if (faulted_in)
|
||||
faulted_in -= probe_subpage_writeable(uaddr, faulted_in);
|
||||
|
||||
return size - faulted_in;
|
||||
}
|
||||
EXPORT_SYMBOL(fault_in_subpage_writeable);
|
||||
|
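The btrfs hunk earlier in this merge shows the intended calling pattern for fault_in_subpage_writeable(); the schematic below restates it in isolation. This is kernel-style pseudocode rather than a buildable unit, and produce_into() is a hypothetical stand-in for whatever fills the user buffer via copy_to_user().

/*
 * Probe the destination at sub-page granularity before every attempt so a
 * tag-check fault inside an otherwise-mapped page cannot live-lock the loop.
 */
static long copy_out_with_retry(char __user *ubuf, size_t len)
{
	long ret;

	while (1) {
		if (fault_in_subpage_writeable(ubuf, len))
			return -EFAULT;		/* genuinely inaccessible */

		ret = produce_into(ubuf, len);	/* hypothetical copy_to_user() caller */
		if (ret != -EFAULT)
			return ret;		/* done, or a non-fault error */
		/* -EFAULT: fault the range in again and retry */
	}
}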
||||
/*
|
||||
* fault_in_safe_writeable - fault in an address range for writing
|
||||
* @uaddr: start of address range
|
||||
|