Commit Graph

1043364 Commits

Author SHA1 Message Date
Will Deacon
2bc655ce29 Merge branch 'for-next/misc' into for-next/core
* for-next/misc:
  arm64: Select POSIX_CPU_TIMERS_TASK_WORK
  arm64: Document boot requirements for FEAT_SME_FA64
  arm64: ftrace: use function_nocfi for _mcount as well
  arm64: asm: setup.h: export common variables
  arm64/traps: Avoid unnecessary kernel/user pointer conversion
2021-10-29 12:24:59 +01:00
Will Deacon
082f6b4b62 Merge branch 'for-next/kselftest' into for-next/core
* for-next/kselftest:
  selftests: arm64: Factor out utility functions for assembly FP tests
  selftests: arm64: Add coverage of ptrace flags for SVE VL inheritance
  selftests: arm64: Verify that all possible vector lengths are handled
  selftests: arm64: Fix and enable test for setting current VL in vec-syscfg
  selftests: arm64: Remove bogus error check on writing to files
  selftests: arm64: Fix printf() format mismatch in vec-syscfg
  selftests: arm64: Move FPSIMD in SVE ptrace test into a function
  selftests: arm64: More comprehensively test the SVE ptrace interface
  selftests: arm64: Verify interoperation of SVE and FPSIMD register sets
  selftests: arm64: Clarify output when verifying SVE register set
  selftests: arm64: Document what the SVE ptrace test is doing
  selftests: arm64: Remove extraneous register setting code
  selftests: arm64: Don't log child creation as a test in SVE ptrace test
  selftests: arm64: Use a define for the number of SVE ptrace tests to be run
2021-10-29 12:24:53 +01:00
Will Deacon
d8a2c0fba5 Merge branch 'for-next/kexec' into for-next/core
* for-next/kexec:
  arm64: trans_pgd: remove trans_pgd_map_page()
  arm64: kexec: remove cpu-reset.h
  arm64: kexec: remove the pre-kexec PoC maintenance
  arm64: kexec: keep MMU enabled during kexec relocation
  arm64: kexec: install a copy of the linear-map
  arm64: kexec: use ld script for relocation function
  arm64: kexec: relocate in EL1 mode
  arm64: kexec: configure EL2 vectors for kexec
  arm64: kexec: pass kimage as the only argument to relocation function
  arm64: kexec: Use dcache ops macros instead of open-coding
  arm64: kexec: skip relocation code for inplace kexec
  arm64: kexec: flush image and lists during kexec load time
  arm64: hibernate: abstract ttrb0 setup function
  arm64: trans_pgd: hibernate: Add trans_pgd_copy_el2_vectors
  arm64: kernel: add helper for booted at EL2 and not VHE
2021-10-29 12:24:47 +01:00
Will Deacon
99fe09c857 Merge branch 'for-next/extable' into for-next/core
* for-next/extable:
  arm64: vmlinux.lds.S: remove `.fixup` section
  arm64: extable: add load_unaligned_zeropad() handler
  arm64: extable: add a dedicated uaccess handler
  arm64: extable: add `type` and `data` fields
  arm64: extable: use `ex` for `exception_table_entry`
  arm64: extable: make fixup_exception() return bool
  arm64: extable: consolidate definitions
  arm64: gpr-num: support W registers
  arm64: factor out GPR numbering helpers
  arm64: kvm: use kvm_exception_table_entry
  arm64: lib: __arch_copy_to_user(): fold fixups into body
  arm64: lib: __arch_copy_from_user(): fold fixups into body
  arm64: lib: __arch_clear_user(): fold fixups into body
2021-10-29 12:24:37 +01:00
Will Deacon
a69483eeef Merge branch 'for-next/8.6-timers' into for-next/core
* for-next/8.6-timers:
  arm64: Add HWCAP for self-synchronising virtual counter
  arm64: Add handling of CNTVCTSS traps
  arm64: Add CNT{P,V}CTSS_EL0 alternatives to cnt{p,v}ct_el0
  arm64: Add a capability for FEAT_ECV
  clocksource/drivers/arch_arm_timer: Move workaround synchronisation around
  clocksource/drivers/arm_arch_timer: Fix masking for high freq counters
  clocksource/drivers/arm_arch_timer: Drop unnecessary ISB on CVAL programming
  clocksource/drivers/arm_arch_timer: Remove any trace of the TVAL programming interface
  clocksource/drivers/arm_arch_timer: Work around broken CVAL implementations
  clocksource/drivers/arm_arch_timer: Advertise 56bit timer to the core code
  clocksource/drivers/arm_arch_timer: Move MMIO timer programming over to CVAL
  clocksource/drivers/arm_arch_timer: Fix MMIO base address vs callback ordering issue
  clocksource/drivers/arm_arch_timer: Move drop _tval from erratum function names
  clocksource/drivers/arm_arch_timer: Move system register timer programming over to CVAL
  clocksource/drivers/arm_arch_timer: Extend write side of timer register accessors to u64
  clocksource/drivers/arm_arch_timer: Drop CNT*_TVAL read accessors
  clocksource/arm_arch_timer: Add build-time guards for unhandled register accesses
2021-10-29 12:20:21 +01:00
Nicolas Saenz Julienne
a68773bd32 arm64: Select POSIX_CPU_TIMERS_TASK_WORK
With 6caa5812e2 ("KVM: arm64: Use generic KVM xfer to guest work
function") all arm64 exit paths are properly equipped to handle the
POSIX timers' task work.

Deferring timer callbacks to thread context, not only limits the amount
of time spent in hard interrupt context, but is a safer
implementation[1], and will allow PREEMPT_RT setups to use KVM[2].

So let's enable POSIX_CPU_TIMERS_TASK_WORK on arm64.

[1] https://lore.kernel.org/all/20200716201923.228696399@linutronix.de/
[2] https://lore.kernel.org/linux-rt-users/87v92bdnlx.ffs@tglx/

Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211018144713.873464-1-nsaenzju@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-28 09:56:23 +01:00
Mark Brown
d198c77b7f arm64: Document boot requirements for FEAT_SME_FA64
The EAC1 release of the SME specification adds the FA64 feature which
requires enablement at higher ELs before lower ELs can use it. Document
what we require from higher ELs in our boot requirements.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211026111802.12853-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-28 09:30:15 +01:00
Mark Brown
260ea4ba94 selftests: arm64: Factor out utility functions for assembly FP tests
The various floating point test programs written in assembly have a bunch
of helper functions and macros which are cut'n'pasted between them. Factor
them out into a separate source file which is linked into all of them.

We don't include memcmp() since it isn't as generic as it should be and
directly branches to report an error in the programs.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019181851.3341232-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21 11:11:27 +01:00
Mark Rutland
bf6e667f47 arm64: vmlinux.lds.S: remove .fixup section
We no longer place anything into a `.fixup` section, so we no longer
need to place those sections into the `.text` section in the main kernel
Image.

Remove the use of `.fixup`.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-14-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21 10:45:22 +01:00
Mark Rutland
753b323687 arm64: extable: add load_unaligned_zeropad() handler
For inline assembly, we place exception fixups out-of-line in the
`.fixup` section such that these are out of the way of the fast path.
This has a few drawbacks:

* Since the fixup code is anonymous, backtraces will symbolize fixups as
  offsets from the nearest prior symbol, currently
  `__entry_tramp_text_end`. This is confusing, and painful to debug
  without access to the relevant vmlinux.

* Since the exception handler adjusts the PC to execute the fixup, and
  the fixup uses a direct branch back into the function it fixes,
  backtraces of fixups miss the original function. This is confusing,
  and violates requirements for RELIABLE_STACKTRACE (and therefore
  LIVEPATCH).

* Inline assembly and associated fixups are generated from templates,
  and we have many copies of logically identical fixups which only
  differ in which specific registers are written to and which address is
  branched to at the end of the fixup. This is potentially wasteful of
  I-cache resources, and makes it hard to add additional logic to fixups
  without significant bloat.

* In the case of load_unaligned_zeropad(), the logic in the fixup
  requires a temporary register that we must allocate even in the
  fast-path where it will not be used.

This patch address all four concerns for load_unaligned_zeropad() fixups
by adding a dedicated exception handler which performs the fixup logic
in exception context and subsequent returns back after the faulting
instruction. For the moment, the fixup logic is identical to the old
assembly fixup logic, but in future we could enhance this by taking the
ESR and FAR into account to constrain the faults we try to fix up, or to
specialize fixups for MTE tag check faults.

Other than backtracing, there should be no functional change as a result
of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-13-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21 10:45:22 +01:00
Mark Rutland
2e77a62cb3 arm64: extable: add a dedicated uaccess handler
For inline assembly, we place exception fixups out-of-line in the
`.fixup` section such that these are out of the way of the fast path.
This has a few drawbacks:

* Since the fixup code is anonymous, backtraces will symbolize fixups as
  offsets from the nearest prior symbol, currently
  `__entry_tramp_text_end`. This is confusing, and painful to debug
  without access to the relevant vmlinux.

* Since the exception handler adjusts the PC to execute the fixup, and
  the fixup uses a direct branch back into the function it fixes,
  backtraces of fixups miss the original function. This is confusing,
  and violates requirements for RELIABLE_STACKTRACE (and therefore
  LIVEPATCH).

* Inline assembly and associated fixups are generated from templates,
  and we have many copies of logically identical fixups which only
  differ in which specific registers are written to and which address is
  branched to at the end of the fixup. This is potentially wasteful of
  I-cache resources, and makes it hard to add additional logic to fixups
  without significant bloat.

This patch address all three concerns for inline uaccess fixups by
adding a dedicated exception handler which updates registers in
exception context and subsequent returns back into the function which
faulted, removing the need for fixups specialized to each faulting
instruction.

Other than backtracing, there should be no functional change as a result
of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-12-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21 10:45:22 +01:00
Mark Rutland
d6e2cc5647 arm64: extable: add type and data fields
Subsequent patches will add specialized handlers for fixups, in addition
to the simple PC fixup and BPF handlers we have today. In preparation,
this patch adds a new `type` field to struct exception_table_entry, and
uses this to distinguish the fixup and BPF cases. A `data` field is also
added so that subsequent patches can associate data specific to each
exception site (e.g. register numbers).

Handlers are named ex_handler_*() for consistency, following the exmaple
of x86. At the same time, get_ex_fixup() is split out into a helper so
that it can be used by other ex_handler_*() functions ins subsequent
patches.

This patch will increase the size of the exception tables, which will be
remedied by subsequent patches removing redundant fixup code. There
should be no functional change as a result of this patch.

Since each entry is now 12 bytes in size, we must reduce the alignment
of each entry from `.align 3` (i.e. 8 bytes) to `.align 2` (i.e. 4
bytes), which is the natrual alignment of the `insn` and `fixup` fields.
The current 8-byte alignment is a holdover from when the `insn` and
`fixup` fields was 8 bytes, and while not harmful has not been necessary
since commit:

  6c94f27ac8 ("arm64: switch to relative exception tables")

Similarly, RO_EXCEPTION_TABLE_ALIGN is dropped to 4 bytes.

Concurrently with this patch, x86's exception table entry format is
being updated (similarly to a 12-byte format, with 32-bytes of absolute
data). Once both have been merged it should be possible to unify the
sorttable logic for the two.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: James Morse <james.morse@arm.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-11-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21 10:45:22 +01:00
Mark Rutland
5d0e790514 arm64: extable: use ex for exception_table_entry
Subsequent patches will extend `struct exception_table_entry` with more
fields, and the distinction between the entry and its `fixup` field will
become more important.

For clarity, let's consistently use `ex` to refer to refer to an entire
entry. In subsequent patches we'll use `fixup` to refer to the fixup
field specifically. This matches the naming convention used today in
arch/arm64/net/bpf_jit_comp.c.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-10-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21 10:45:22 +01:00
Mark Rutland
e8c328d7de arm64: extable: make fixup_exception() return bool
The return values of fixup_exception() and arm64_bpf_fixup_exception()
represent a boolean condition rather than an error code, so for clarity
it would be better to return `bool` rather than `int`.

This patch adjusts the code accordingly. While we're modifying the
prototype, we also remove the unnecessary `extern` keyword, so that this
won't look out of place when we make subsequent additions to the header.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: James Morse <james.morse@arm.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-9-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21 10:45:22 +01:00
Mark Rutland
819771cc28 arm64: extable: consolidate definitions
In subsequent patches we'll alter the structure and usage of struct
exception_table_entry. For inline assembly, we create these using the
`_ASM_EXTABLE()` CPP macro defined in <asm/uaccess.h>, and for plain
assembly code we use the `_asm_extable()` GAS macro defined in
<asm/assembler.h>, which are largely identical save for different
escaping and stringification requirements.

This patch moves the common definitions to a new <asm/asm-extable.h>
header, so that it's easier to keep the two in-sync, and to remove the
implication that these are only used for uaccess helpers (as e.g.
load_unaligned_zeropad() is only used on kernel memory, and depends upon
`_ASM_EXTABLE()`.

At the same time, a few minor modifications are made for clarity and in
preparation for subsequent patches:

* The structure creation is factored out into an `__ASM_EXTABLE_RAW()`
  macro. This will make it easier to support different fixup variants in
  subsequent patches without needing to update all users of
  `_ASM_EXTABLE()`, and makes it easier to see tha the CPP and GAS
  variants of the macros are structurally identical.

  For the CPP macro, the stringification of fields is left to the
  wrapper macro, `_ASM_EXTABLE()`, as in subsequent patches it will be
  necessary to stringify fields in wrapper macros to safely concatenate
  strings which cannot be token-pasted together in CPP.

* The fields of the structure are created separately on their own lines.
  This will make it easier to add/remove/modify individual fields
  clearly.

* Additional parentheses are added around the use of macro arguments in
  field definitions to avoid any potential problems with evaluation due
  to operator precedence, and to make errors upon misuse clearer.

* USER() is moved into <asm/asm-uaccess.h>, as it is not required by all
  assembly code, and is already refered to by comments in that file.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-8-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21 10:45:22 +01:00
Mark Rutland
286fba6c2a arm64: gpr-num: support W registers
In subsequent patches we'll want to map W registers to their register
numbers. Update gpr-num.h so that we can do this.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-7-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21 10:45:22 +01:00
Mark Rutland
8ed1b498ad arm64: factor out GPR numbering helpers
In <asm/sysreg.h> we have macros to convert the names of general purpose
registers (GPRs) into integer constants, which we use to manually build
the encoding for `MRS` and `MSR` instructions where we can't rely on the
assembler to do so for us.

In subsequent patches we'll need to map the same GPR names to integer
constants so that we can use this to build metadata for exception
fixups.

So that the we can use the mappings elsewhere, factor out the
definitions into a new <asm/gpr-num.h> header, renaming the definitions
to align with this "GPR num" naming for clarity.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-6-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21 10:45:22 +01:00
Mark Rutland
ae2b2f3384 arm64: kvm: use kvm_exception_table_entry
In subsequent patches we'll alter `struct exception_table_entry`, adding
fields that are not needed for KVM exception fixups.

In preparation for this, migrate KVM to its own `struct
kvm_exception_table_entry`, which is identical to the current format of
`struct exception_table_entry`. Comments are updated accordingly.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-5-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21 10:45:21 +01:00
Mark Rutland
139f9ab73d arm64: lib: __arch_copy_to_user(): fold fixups into body
Like other functions, __arch_copy_to_user() places its exception fixups
in the `.fixup` section without any clear association with
__arch_copy_to_user() itself. If we backtrace the fixup code, it will be
symbolized as an offset from the nearest prior symbol, which happens to
be `__entry_tramp_text_end`. Further, since the PC adjustment for the
fixup is akin to a direct branch rather than a function call,
__arch_copy_to_user() itself will be missing from the backtrace.

This is confusing and hinders debugging. In general this pattern will
also be problematic for CONFIG_LIVEPATCH, since fixups often return to
their associated function, but this isn't accurately captured in the
stacktrace.

To solve these issues for assembly functions, we must move fixups into
the body of the functions themselves, after the usual fast-path returns.
This patch does so for __arch_copy_to_user().

Inline assembly will be dealt with in subsequent patches.

Other than the improved backtracing, there should be no functional
change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21 10:45:21 +01:00
Mark Rutland
4012e0e227 arm64: lib: __arch_copy_from_user(): fold fixups into body
Like other functions, __arch_copy_from_user() places its exception
fixups in the `.fixup` section without any clear association with
__arch_copy_from_user() itself. If we backtrace the fixup code, it will
be symbolized as an offset from the nearest prior symbol, which happens
to be `__entry_tramp_text_end`. Further, since the PC adjustment for the
fixup is akin to a direct branch rather than a function call,
__arch_copy_from_user() itself will be missing from the backtrace.

This is confusing and hinders debugging. In general this pattern will
also be problematic for CONFIG_LIVEPATCH, since fixups often return to
their associated function, but this isn't accurately captured in the
stacktrace.

To solve these issues for assembly functions, we must move fixups into
the body of the functions themselves, after the usual fast-path returns.
This patch does so for __arch_copy_from_user().

Inline assembly will be dealt with in subsequent patches.

Other than the improved backtracing, there should be no functional
change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21 10:45:21 +01:00
Mark Rutland
35d67794b8 arm64: lib: __arch_clear_user(): fold fixups into body
Like other functions, __arch_clear_user() places its exception fixups in
the `.fixup` section without any clear association with
__arch_clear_user() itself. If we backtrace the fixup code, it will be
symbolized as an offset from the nearest prior symbol, which happens to
be `__entry_tramp_text_end`. Further, since the PC adjustment for the
fixup is akin to a direct branch rather than a function call,
__arch_clear_user() itself will be missing from the backtrace.

This is confusing and hinders debugging. In general this pattern will
also be problematic for CONFIG_LIVEPATCH, since fixups often return to
their associated function, but this isn't accurately captured in the
stacktrace.

To solve these issues for assembly functions, we must move fixups into
the body of the functions themselves, after the usual fast-path returns.
This patch does so for __arch_clear_user().

Inline assembly will be dealt with in subsequent patches.

Other than the improved backtracing, there should be no functional
change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-2-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21 10:45:21 +01:00
Marc Zyngier
fee29f008a arm64: Add HWCAP for self-synchronising virtual counter
Since userspace can make use of the CNTVSS_EL0 instruction, expose
it via a HWCAP.

Suggested-by: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-18-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-19 10:56:20 +01:00
Marc Zyngier
ae976f063b arm64: Add handling of CNTVCTSS traps
Since CNTVCTSS obey the same control bits as CNTVCT, add the necessary
decoding to the hook table. Note that there is no known user of
this at the moment.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-17-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-19 10:56:20 +01:00
Marc Zyngier
9ee840a960 arm64: Add CNT{P,V}CTSS_EL0 alternatives to cnt{p,v}ct_el0
CNTPCTSS_EL0 and CNTVCTSS_EL0 are alternatives to the usual
CNTPCT_EL0 and CNTVCT_EL0 that do not require a previous ISB
to be synchronised (SS stands for Self-Synchronising).

Use the ARM64_HAS_ECV capability to control alternative sequences
that switch to these low(er)-cost primitives. Note that the
counter access in the VDSO is for now left alone until we decide
whether we want to allow this.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-16-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-19 10:56:20 +01:00
Marc Zyngier
fdf865988b arm64: Add a capability for FEAT_ECV
Add a new capability to detect the Enhanced Counter Virtualization
feature (FEAT_ECV).

Reviewed-by: Oliver Upton <oupton@google.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-15-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-19 10:56:20 +01:00
Will Deacon
0e277fb807 Merge branch 'timers/drivers/armv8.6_arch_timer' of https://git.linaro.org/people/daniel.lezcano/linux into for-next/8.6-timers
Pull Arm architected timer driver rework from Marc (via Daniel) so that
we can add the Armv8.6 support on top.

Link: https://lore.kernel.org/r/d0c55386-2f7f-a940-45bb-d80ae5e0f378@linaro.org

* 'timers/drivers/armv8.6_arch_timer' of https://git.linaro.org/people/daniel.lezcano/linux:
  clocksource/drivers/arch_arm_timer: Move workaround synchronisation around
  clocksource/drivers/arm_arch_timer: Fix masking for high freq counters
  clocksource/drivers/arm_arch_timer: Drop unnecessary ISB on CVAL programming
  clocksource/drivers/arm_arch_timer: Remove any trace of the TVAL programming interface
  clocksource/drivers/arm_arch_timer: Work around broken CVAL implementations
  clocksource/drivers/arm_arch_timer: Advertise 56bit timer to the core code
  clocksource/drivers/arm_arch_timer: Move MMIO timer programming over to CVAL
  clocksource/drivers/arm_arch_timer: Fix MMIO base address vs callback ordering issue
  clocksource/drivers/arm_arch_timer: Move drop _tval from erratum function names
  clocksource/drivers/arm_arch_timer: Move system register timer programming over to CVAL
  clocksource/drivers/arm_arch_timer: Extend write side of timer register accessors to u64
  clocksource/drivers/arm_arch_timer: Drop CNT*_TVAL read accessors
  clocksource/arm_arch_timer: Add build-time guards for unhandled register accesses
2021-10-19 10:52:11 +01:00
Marc Zyngier
db26f8f2da clocksource/drivers/arch_arm_timer: Move workaround synchronisation around
We currently handle synchronisation when workarounds are enabled
by having an ISB in the __arch_counter_get_cnt?ct_stable() helpers.

While this works, this prevents us from relaxing this synchronisation.

Instead, move it closer to the point where the synchronisation is
actually needed. Further patches will subsequently relax this.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-14-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-18 09:20:57 +02:00
Oliver Upton
c1153d52c4 clocksource/drivers/arm_arch_timer: Fix masking for high freq counters
Unfortunately, the architecture provides no means to determine the bit
width of the system counter. However, we do know the following from the
specification:

 - the system counter is at least 56 bits wide
 - Roll-over time of not less than 40 years

To date, the arch timer driver has depended on the first property,
assuming any system counter to be 56 bits wide and masking off the rest.
However, combining a narrow clocksource mask with a high frequency
counter could result in prematurely wrapping the system counter by a
significant margin. For example, a 56 bit wide, 1GHz system counter
would wrap in a mere 2.28 years!

This is a problem for two reasons: v8.6+ implementations are required to
provide a 64 bit, 1GHz system counter. Furthermore, before v8.6,
implementers may select a counter frequency of their choosing.

Fix the issue by deriving a valid clock mask based on the second
property from above. Set the floor at 56 bits, since we know no system
counter is narrower than that.

[maz: fixed width computation not to lose the last bit, added
      max delta generation for the timer]

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210807191428.3488948-1-oupton@google.com
Link: https://lore.kernel.org/r/20211017124225.3018098-13-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-18 09:20:02 +02:00
Marc Zyngier
ec8f7f3342 clocksource/drivers/arm_arch_timer: Drop unnecessary ISB on CVAL programming
Switching from TVAL to CVAL has a small drawback: we need an ISB
before reading the counter. We cannot get rid of it, but we can
instead remove the one that comes just after writing to CVAL.

This reduces the number of ISBs from 3 to 2 when programming
the timer.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-12-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-17 21:47:44 +02:00
Marc Zyngier
41f8d02a6a clocksource/drivers/arm_arch_timer: Remove any trace of the TVAL programming interface
TVAL usage is now long gone, get rid of the leftovers.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-11-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-17 21:47:39 +02:00
Marc Zyngier
012f188504 clocksource/drivers/arm_arch_timer: Work around broken CVAL implementations
The Applied Micro XGene-1 SoC has a busted implementation of the
CVAL register: it looks like it is based on TVAL instead of the
other way around. The net effect of this implementation blunder
is that the maximum deadline you can program in the timer is
32bit wide.

Use a MIDR check to notice the broken CPU, and reduce the width
of the timer to 32bit.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-10-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-17 21:47:31 +02:00
Marc Zyngier
30aa08da35 clocksource/drivers/arm_arch_timer: Advertise 56bit timer to the core code
Proudly tell the code code that we have a timer able to handle
56 bits deltas.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-9-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-17 21:47:27 +02:00
Marc Zyngier
8b82c4f883 clocksource/drivers/arm_arch_timer: Move MMIO timer programming over to CVAL
Similarily to the sysreg-based timer, move the MMIO over to using
the CVAL registers instead of TVAL. Note that there is no warranty
that the 64bit MMIO access will be atomic, but the timer is always
disabled at the point where we program CVAL.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-8-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-17 21:47:21 +02:00
Marc Zyngier
72f47a3f0e clocksource/drivers/arm_arch_timer: Fix MMIO base address vs callback ordering issue
The MMIO timer base address gets published after we have registered
the callbacks and the interrupt handler, which is... a bit dangerous.

Fix this by moving the base address publication to the point where
we register the timer, and expose a pointer to the timer structure
itself rather than a naked value.

Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-7-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-17 21:47:15 +02:00
Marc Zyngier
ac9ef4f24c clocksource/drivers/arm_arch_timer: Move drop _tval from erratum function names
The '_tval' name in the erratum handling function names doesn't
make much sense anymore (and they were using CVAL the first place).

Drop the _tval tag.

Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-6-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-17 21:47:10 +02:00
Marc Zyngier
a38b71b083 clocksource/drivers/arm_arch_timer: Move system register timer programming over to CVAL
In order to cope better with high frequency counters, move the
programming of the timers from the countdown timer (TVAL) over
to the comparator (CVAL).

The programming model is slightly different, as we now need to
read the current counter value to have an absolute deadline
instead of a relative one.

There is a small overhead to this change, which we will address
in the following patches.

Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-5-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-17 21:47:05 +02:00
Marc Zyngier
1e8d929231 clocksource/drivers/arm_arch_timer: Extend write side of timer register accessors to u64
The various accessors for the timer sysreg and MMIO registers are
currently hardwired to 32bit. However, we are about to introduce
the use of the CVAL registers, which require a 64bit access.

Upgrade the write side of the accessors to take a 64bit value
(the read side is left untouched as we don't plan to ever read
back any of these registers).

No functional change expected.

Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-4-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-17 21:46:59 +02:00
Marc Zyngier
d72689988d clocksource/drivers/arm_arch_timer: Drop CNT*_TVAL read accessors
The arch timer driver never reads the various TVAL registers, only
writes to them. It is thus pointless to provide accessors
for them and to implement errata workarounds.

Drop these read-side accessors, and add a couple of BUG() statements
for the time being. These statements will be removed further down
the line.

Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-3-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-17 21:46:50 +02:00
Marc Zyngier
4775bc63f8 clocksource/arm_arch_timer: Add build-time guards for unhandled register accesses
As we are about to change the registers that are used by the driver,
start by adding build-time checks to ensure that we always handle
all registers and access modes.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211017124225.3018098-2-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2021-10-17 21:45:48 +02:00
Sumit Garg
de56379f21 arm64: ftrace: use function_nocfi for _mcount as well
Commit 800618f955 ("arm64: ftrace: use function_nocfi for ftrace_call")
only fixed address of ftrace_call but address of _mcount needs to be
fixed as well. Use function_nocfi() to get the actual address of _mcount
function as with CONFIG_CFI_CLANG, the compiler replaces function pointers
with jump table addresses which breaks dynamic ftrace as the address of
_mcount is replaced with the address of _mcount.cfi_jt.

With mainline, this won't be a problem since by default
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y with Clang >= 10 as it supports
-fpatchable-function-entry and CFI requires Clang 12 but for consistency
we should add function_nocfi() for _mcount as well.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20211011125059.3378646-1-sumit.garg@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-12 09:24:49 +01:00
Anders Roxell
1dfde0892b arm64: asm: setup.h: export common variables
When building the kernel with sparse enabled 'C=1' the following
warnings can be seen:

arch/arm64/kernel/setup.c:58:13: warning: symbol '__fdt_pointer' was not declared. Should it be static?
arch/arm64/kernel/setup.c:84:25: warning: symbol 'boot_args' was not declared. Should it be static?

Rework so the variables are exported, since these two variable are
created and used in setup.c, also used in head.S.

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Link: https://lore.kernel.org/r/20211007195601.677474-1-anders.roxell@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-12 09:22:33 +01:00
Mark Brown
0ba1ce1e86 selftests: arm64: Add coverage of ptrace flags for SVE VL inheritance
Add a test that covers enabling and disabling of SVE vector length
inheritance via the ptrace interface.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20211005123537.976795-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-07 09:20:52 +01:00
Pasha Tatashin
6091dd9eaf arm64: trans_pgd: remove trans_pgd_map_page()
The intend of trans_pgd_map_page() was to map contiguous range of VA
memory to the memory that is getting relocated during kexec. However,
since we are now using linear map instead of contiguous range this
function is not needed

Suggested-by: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-16-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01 13:31:01 +01:00
Pasha Tatashin
7a2512fa64 arm64: kexec: remove cpu-reset.h
This header contains only cpu_soft_restart() which is never used directly
anymore. So, remove this header, and rename the helper to be
cpu_soft_restart().

Suggested-by: James Morse <james.morse@arm.com>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-15-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01 13:31:00 +01:00
Pasha Tatashin
939f1b9564 arm64: kexec: remove the pre-kexec PoC maintenance
Now that kexec does its relocations with the MMU enabled, we no longer
need to clean the relocation data to the PoC.

Suggested-by: James Morse <james.morse@arm.com>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-14-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01 13:31:00 +01:00
Pasha Tatashin
efc2d0f20a arm64: kexec: keep MMU enabled during kexec relocation
Now, that we have linear map page tables configured, keep MMU enabled
to allow faster relocation of segments to final destination.

Cavium ThunderX2:
Kernel Image size: 38M Iniramfs size: 46M Total relocation size: 84M
MMU-disabled:
relocation	7.489539915s
MMU-enabled:
relocation	0.03946095s

Broadcom Stingray:
The performance data: for a moderate size kernel + initramfs: 25M the
relocation was taking 0.382s, with enabled MMU it now takes
0.019s only or x20 improvement.

The time is proportional to the size of relocation, therefore if initramfs
is larger, 100M it could take over a second.

Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Tested-by: Pingfan Liu <piliu@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-13-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01 13:31:00 +01:00
Pasha Tatashin
3744b5280e arm64: kexec: install a copy of the linear-map
To perform the kexec relocation with the MMU enabled, we need a copy
of the linear map.

Create one, and install it from the relocation code. This has to be done
from the assembly code as it will be idmapped with TTBR0. The kernel
runs in TTRB1, so can't use the break-before-make sequence on the mapping
it is executing from.

The makes no difference yet as the relocation code runs with the MMU
disabled.

Suggested-by: James Morse <james.morse@arm.com>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-12-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01 13:31:00 +01:00
Pasha Tatashin
19a046f07c arm64: kexec: use ld script for relocation function
Currently, relocation code declares start and end variables
which are used to compute its size.

The better way to do this is to use ld script, and put relocation
function in its own section.

Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-11-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01 13:31:00 +01:00
Pasha Tatashin
ba959fe96a arm64: kexec: relocate in EL1 mode
Since we are going to keep MMU enabled during relocation, we need to
keep EL1 mode throughout the relocation.

Keep EL1 enabled, and switch EL2 only before entering the new world.

Suggested-by: James Morse <james.morse@arm.com>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-10-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01 13:31:00 +01:00
Pasha Tatashin
08eae0ef61 arm64: kexec: configure EL2 vectors for kexec
If we have a EL2 mode without VHE, the EL2 vectors are needed in order
to switch to EL2 and jump to new world with hypervisor privileges.

In preparation to MMU enabled relocation, configure our EL2 table now.

Kexec uses #HVC_SOFT_RESTART to branch to the new world, so extend
el1_sync vector that is provided by trans_pgd_copy_el2_vectors() to
support this case.

Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210930143113.1502553-9-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01 13:31:00 +01:00