Commit Graph

2266 Commits

Author SHA1 Message Date
Will Deacon
26a25c841d arm64: perf: Treat EXCLUDE_EL* bit definitions as unsigned
Although the upper 32 bits of the PMEVTYPER<n>_EL0 registers are RES0,
we should treat the EXCLUDE_EL* bit definitions as unsigned so that we
avoid accidentally sign-extending the privilege filtering bit (bit 31)
into the upper half of the register.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-13 15:34:44 +00:00
Will Deacon
b47f515bdc Merge branch 'for-next/perf' into aarch64/for-next/core
Merge in arm64 perf and PMU driver updates, including support for the
system/uncore PMU in the ThunderX2 platform.
2018-12-12 19:00:25 +00:00
Ard Biesheuvel
0a1213fa74 arm64: enable per-task stack canaries
This enables the use of per-task stack canary values if GCC has
support for emitting the stack canary reference relative to the
value of sp_el0, which holds the task struct pointer in the arm64
kernel.

The $(eval) extends KBUILD_CFLAGS at the moment the make rule is
applied, which means asm-offsets.o (which we rely on for the offset
value) is built without the arguments, and everything built afterwards
has the options set.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-12 18:45:31 +00:00
Will Deacon
6e4ede698d arm64: percpu: Fix LSE implementation of value-returning pcpu atomics
Commit 959bf2fd03 ("arm64: percpu: Rewrite per-cpu ops to allow use of
LSE atomics") introduced alternative code sequences for the arm64 percpu
atomics, so that the LSE instructions can be patched in at runtime if
they are supported by the CPU.

Unfortunately, when patching in the LSE sequence for a value-returning
pcpu atomic, the argument registers are the wrong way round. The
implementation of this_cpu_add_return() therefore ends up adding
uninitialised stack to the percpu variable and returning garbage.

As it turns out, there aren't very many users of the value-returning
percpu atomics in mainline and we only spotted this due to a failure in
the kprobes selftests. In this case, when attempting to single-step over
the out-of-line instruction slot, the debug monitors would not be
enabled because calling this_cpu_inc_return() on the kernel debug
monitor refcount would fail to detect the transition from 0. We would
consequently execute past the slot and take an undefined instruction
exception from the kernel, resulting in a BUG:

 | kernel BUG at arch/arm64/kernel/traps.c:421!
 | PREEMPT SMP
 | pc : do_undefinstr+0x268/0x278
 | lr : do_undefinstr+0x124/0x278
 | Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____))
 | Call trace:
 |  do_undefinstr+0x268/0x278
 |  el1_undef+0x10/0x78
 |  0xffff00000803c004
 |  init_kprobes+0x150/0x180
 |  do_one_initcall+0x74/0x178
 |  kernel_init_freeable+0x188/0x224
 |  kernel_init+0x10/0x100
 |  ret_from_fork+0x10/0x1c

Fix the argument order to get the value-returning pcpu atomics working
correctly when implemented using the LSE instructions.

Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-12 14:43:35 +00:00
Mark Rutland
c3296a1391 arm64: add <asm/asm-prototypes.h>
While we can export symbols from assembly files, CONFIG_MODVERIONS requires C
declarations of anyhting that's exported.

Let's account for this as other architectures do by placing these declarations
in <asm/asm-prototypes.h>, which kbuild will automatically use to generate
modversion information for assembly files.

Since we already define most prototypes in existing headers, we simply need to
include those headers in <asm/asm-prototypes.h>, and don't need to duplicate
these.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-12 14:10:18 +00:00
Will Deacon
9b31cf493f arm64: mm: Introduce MAX_USER_VA_BITS definition
With the introduction of 52-bit virtual addressing for userspace, we are
now in a position where the virtual addressing capability of userspace
may exceed that of the kernel. Consequently, the VA_BITS definition
cannot be used blindly, since it reflects only the size of kernel
virtual addresses.

This patch introduces MAX_USER_VA_BITS which is either VA_BITS or 52
depending on whether 52-bit virtual addressing has been configured at
build time, removing a few places where the 52 is open-coded based on
explicit CONFIG_ guards.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-12 11:51:40 +00:00
Will Deacon
7faa313f05 arm64: preempt: Fix big-endian when checking preempt count in assembly
Commit 3962446922 ("arm64: preempt: Provide our own implementation of
asm/preempt.h") extended the preempt count field in struct thread_info
to 64 bits, so that it consists of a 32-bit count plus a 32-bit flag
indicating whether or not the current task needs rescheduling.

Whilst the asm-offsets definition of TSK_TI_PREEMPT was updated to point
to this new field, the assembly usage was left untouched meaning that a
32-bit load from TSK_TI_PREEMPT on a big-endian machine actually returns
the reschedule flag instead of the count.

Whilst we could fix this by pointing TSK_TI_PREEMPT at the count field,
we're actually better off reworking the two assembly users so that they
operate on the whole 64-bit value in favour of inspecting the thread
flags separately in order to determine whether a reschedule is needed.

Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: "kernelci.org bot" <bot@kernelci.org>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-11 20:07:03 +00:00
Will Deacon
d34664f63b Merge branch 'for-next/kexec' into aarch64/for-next/core
Merge in kexec_file_load() support from Akashi Takahiro.
2018-12-10 18:57:17 +00:00
Will Deacon
bc84a2d106 Merge branch 'kvm/cortex-a76-erratum-1165522' into aarch64/for-next/core
Pull in KVM workaround for A76 erratum #116522.

Conflicts:
	arch/arm64/include/asm/cpucaps.h
2018-12-10 18:53:52 +00:00
Will Deacon
66f16a2451 arm64: smp: Rework early feature mismatched detection
Rather than add additional variables to detect specific early feature
mismatches with secondary CPUs, we can instead dedicate the upper bits
of the CPU boot status word to flag specific mismatches.

This allows us to communicate both granule and VA-size mismatches back
to the primary CPU without the need for additional book-keeping.

Tested-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 18:42:18 +00:00
Will Deacon
68d23da437 arm64: Kconfig: Re-jig CONFIG options for 52-bit VA
Enabling 52-bit VAs for userspace is pretty confusing, since it requires
you to select "48-bit" virtual addressing in the Kconfig.

Rework the logic so that 52-bit user virtual addressing is advertised in
the "Virtual address space size" choice, along with some help text to
describe its interaction with Pointer Authentication. The EXPERT-only
option to force all user mappings to the 52-bit range is then made
available immediately below the VA size selection.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 18:42:18 +00:00
Steve Capper
b9567720a1 arm64: mm: Allow forcing all userspace addresses to 52-bit
On arm64 52-bit VAs are provided to userspace when a hint is supplied to
mmap. This helps maintain compatibility with software that expects at
most 48-bit VAs to be returned.

In order to help identify software that has 48-bit VA assumptions, this
patch allows one to compile a kernel where 52-bit VAs are returned by
default on HW that supports it.

This feature is intended to be for development systems only.

Signed-off-by: Steve Capper <steve.capper@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 18:42:18 +00:00
Steve Capper
67e7fdfcc6 arm64: mm: introduce 52-bit userspace support
On arm64 there is optional support for a 52-bit virtual address space.
To exploit this one has to be running with a 64KB page size and be
running on hardware that supports this.

For an arm64 kernel supporting a 48 bit VA with a 64KB page size,
some changes are needed to support a 52-bit userspace:
 * TCR_EL1.T0SZ needs to be 12 instead of 16,
 * TASK_SIZE needs to reflect the new size.

This patch implements the above when the support for 52-bit VAs is
detected at early boot time.

On arm64 userspace addresses translation is controlled by TTBR0_EL1. As
well as userspace, TTBR0_EL1 controls:
 * The identity mapping,
 * EFI runtime code.

It is possible to run a kernel with an identity mapping that has a
larger VA size than userspace (and for this case __cpu_set_tcr_t0sz()
would set TCR_EL1.T0SZ as appropriate). However, when the conditions for
52-bit userspace are met; it is possible to keep TCR_EL1.T0SZ fixed at
12. Thus in this patch, the TCR_EL1.T0SZ size changing logic is
disabled.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 18:42:17 +00:00
Steve Capper
e842dfb5a2 arm64: mm: Offset TTBR1 to allow 52-bit PTRS_PER_PGD
Enabling 52-bit VAs on arm64 requires that the PGD table expands from 64
entries (for the 48-bit case) to 1024 entries. This quantity,
PTRS_PER_PGD is used as follows to compute which PGD entry corresponds
to a given virtual address, addr:

pgd_index(addr) -> (addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1)

Userspace addresses are prefixed by 0's, so for a 48-bit userspace
address, uva, the following is true:
(uva >> PGDIR_SHIFT) & (1024 - 1) == (uva >> PGDIR_SHIFT) & (64 - 1)

In other words, a 48-bit userspace address will have the same pgd_index
when using PTRS_PER_PGD = 64 and 1024.

Kernel addresses are prefixed by 1's so, given a 48-bit kernel address,
kva, we have the following inequality:
(kva >> PGDIR_SHIFT) & (1024 - 1) != (kva >> PGDIR_SHIFT) & (64 - 1)

In other words a 48-bit kernel virtual address will have a different
pgd_index when using PTRS_PER_PGD = 64 and 1024.

If, however, we note that:
kva = 0xFFFF << 48 + lower (where lower[63:48] == 0b)
and, PGDIR_SHIFT = 42 (as we are dealing with 64KB PAGE_SIZE)

We can consider:
(kva >> PGDIR_SHIFT) & (1024 - 1) - (kva >> PGDIR_SHIFT) & (64 - 1)
 = (0xFFFF << 6) & 0x3FF - (0xFFFF << 6) & 0x3F	// "lower" cancels out
 = 0x3C0

In other words, one can switch PTRS_PER_PGD to the 52-bit value globally
provided that they increment ttbr1_el1 by 0x3C0 * 8 = 0x1E00 bytes when
running with 48-bit kernel VAs (TCR_EL1.T1SZ = 16).

For kernel configuration where 52-bit userspace VAs are possible, this
patch offsets ttbr1_el1 and sets PTRS_PER_PGD corresponding to the
52-bit value.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
[will: added comment to TTBR1_BADDR_4852_OFFSET calculation]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 18:42:17 +00:00
Steve Capper
e5d9915745 arm64: mm: Define arch_get_mmap_end, arch_get_mmap_base
Now that we have DEFAULT_MAP_WINDOW defined, we can arch_get_mmap_end
and arch_get_mmap_base helpers to allow for high addresses in mmap.

Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 18:42:17 +00:00
Steve Capper
363524d2b1 arm64: mm: Introduce DEFAULT_MAP_WINDOW
We wish to introduce a 52-bit virtual address space for userspace but
maintain compatibility with software that assumes the maximum VA space
size is 48 bit.

In order to achieve this, on 52-bit VA systems, we make mmap behave as
if it were running on a 48-bit VA system (unless userspace explicitly
requests a VA where addr[51:48] != 0).

On a system running a 52-bit userspace we need TASK_SIZE to represent
the 52-bit limit as it is used in various places to distinguish between
kernelspace and userspace addresses.

Thus we need a new limit for mmap, stack, ELF loader and EFI (which uses
TTBR0) to represent the non-extended VA space.

This patch introduces DEFAULT_MAP_WINDOW and DEFAULT_MAP_WINDOW_64 and
switches the appropriate logic to use that instead of TASK_SIZE.

Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 18:42:17 +00:00
Qian Cai
6e8830674e arm64: kasan: Increase stack size for KASAN_EXTRA
If the kernel is configured with KASAN_EXTRA, the stack size is
increased significantly due to setting the GCC -fstack-reuse option to
"none" [1]. As a result, it can trigger a stack overrun quite often with
32k stack size compiled using GCC 8. For example, this reproducer

  https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/madvise/madvise06.c

can trigger a "corrupted stack end detected inside scheduler" very
reliably with CONFIG_SCHED_STACK_END_CHECK enabled. There are other
reports at:

  https://lore.kernel.org/lkml/1542144497.12945.29.camel@gmx.us/
  https://lore.kernel.org/lkml/721E7B42-2D55-4866-9C1A-3E8D64F33F9C@gmx.us/

There are just too many functions that could have a large stack with
KASAN_EXTRA due to large local variables that have been called over and
over again without being able to reuse the stacks. Some noticiable ones
are,

size
7536 shrink_inactive_list
7440 shrink_page_list
6560 fscache_stats_show
3920 jbd2_journal_commit_transaction
3216 try_to_unmap_one
3072 migrate_page_move_mapping
3584 migrate_misplaced_transhuge_page
3920 ip_vs_lblcr_schedule
4304 lpfc_nvme_info_show
3888 lpfc_debugfs_nvmestat_data.constprop

There are other 49 functions over 2k in size while compiling kernel with
"-Wframe-larger-than=" on this machine. Hence, it is too much work to
change Makefiles for each object to compile without
-fsanitize-address-use-after-scope individually.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715#c23

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 17:53:12 +00:00
Will Deacon
33309ecda0 arm64: Fix minor issues with the dcache_by_line_op macro
The dcache_by_line_op macro suffers from a couple of small problems:

First, the GAS directives that are currently being used rely on
assembler behavior that is not documented, and probably not guaranteed
to produce the correct behavior going forward. As a result, we end up
with some undefined symbols in cache.o:

$ nm arch/arm64/mm/cache.o
         ...
         U civac
         ...
         U cvac
         U cvap
         U cvau

This is due to the fact that the comparisons used to select the
operation type in the dcache_by_line_op macro are comparing symbols
not strings, and even though it seems that GAS is doing the right
thing here (undefined symbols by the same name are equal to each
other), it seems unwise to rely on this.

Second, when patching in a DC CVAP instruction on CPUs that support it,
the fallback path consists of a DC CVAU instruction which may be
affected by CPU errata that require ARM64_WORKAROUND_CLEAN_CACHE.

Solve these issues by unrolling the various maintenance routines and
using the conditional directives that are documented as operating on
strings. To avoid the complexity of nested alternatives, we move the
DC CVAP patching to __clean_dcache_area_pop, falling back to a branch
to __clean_dcache_area_poc if DCPOP is not supported by the CPU.

Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 15:03:51 +00:00
Marc Zyngier
1e4448c5dd arm64: KVM: Add synchronization on translation regime change for erratum 1165522
In order to ensure that slipping HCR_EL2.TGE is done at the right
time when switching translation regime, let insert the required ISBs
that will be patched in when erratum 1165522 is detected.

Take this opportunity to add the missing include of asm/alternative.h
which was getting there by pure luck.

Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 11:59:07 +00:00
Marc Zyngier
8b2cca9ade arm64: KVM: Force VHE for systems affected by erratum 1165522
In order to easily mitigate ARM erratum 1165522, we need to force
affected CPUs to run in VHE mode if using KVM.

Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 11:59:07 +00:00
Marc Zyngier
793d5d9213 arm64: Add TCR_EPD{0,1} definitions
We are soon going to play with TCR_EL1.EPD{0,1}, so let's add the
relevant definitions.

Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 11:59:06 +00:00
Marc Zyngier
33e5f4e509 KVM: arm64: Rework detection of SVE, !VHE systems
An SVE system is so far the only case where we mandate VHE. As we're
starting to grow this requirements, let's slightly rework the way we
deal with that situation, allowing for easy extension of this check.

Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 11:57:52 +00:00
Mark Rutland
386b3c7bda arm64: add EXPORT_SYMBOL_NOKASAN()
So that we can export symbols directly from assembly files, let's make
use of the generic <asm/export.h>. We have a few symbols that we'll want
to conditionally export for !KASAN kernel builds, so we add a helper for
that in <asm/assembler.h>.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 11:50:11 +00:00
Will Deacon
4230509978 arm64: cmpxchg: Use "K" instead of "L" for ll/sc immediate constraint
The "L" AArch64 machine constraint, which we use for the "old" value in
an LL/SC cmpxchg(), generates an immediate that is suitable for a 64-bit
logical instruction. However, for cmpxchg() operations on types smaller
than 64 bits, this constraint can result in an invalid instruction which
is correctly rejected by GAS, such as EOR W1, W1, #0xffffffff.

Whilst we could special-case the constraint based on the cmpxchg size,
it's far easier to change the constraint to "K" and put up with using
a register for large 64-bit immediates. For out-of-line LL/SC atomics,
this is all moot anyway.

Reported-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-07 17:28:13 +00:00
Will Deacon
959bf2fd03 arm64: percpu: Rewrite per-cpu ops to allow use of LSE atomics
Our percpu code is a bit of an inconsistent mess:

  * It rolls its own xchg(), but reuses cmpxchg_local()
  * It uses various different flavours of preempt_{enable,disable}()
  * It returns values even for the non-returning RmW operations
  * It makes no use of LSE atomics outside of the cmpxchg() ops
  * There are individual macros for different sizes of access, but these
    are all funneled through a switch statement rather than dispatched
    directly to the relevant case

This patch rewrites the per-cpu operations to address these shortcomings.
Whilst the new code is a lot cleaner, the big advantage is that we can
use the non-returning ST- atomic instructions when we have LSE.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-07 17:28:06 +00:00
Will Deacon
b4f9209bfc arm64: Avoid masking "old" for LSE cmpxchg() implementation
The CAS instructions implicitly access only the relevant bits of the "old"
argument, so there is no need for explicit masking via type-casting as
there is in the LL/SC implementation.

Move the casting into the LL/SC code and remove it altogether for the LSE
implementation.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-07 17:28:01 +00:00
Will Deacon
5ef3fe4cec arm64: Avoid redundant type conversions in xchg() and cmpxchg()
Our atomic instructions (either LSE atomics of LDXR/STXR sequences)
natively support byte, half-word, word and double-word memory accesses
so there is no need to mask the data register prior to being stored.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-07 17:27:55 +00:00
Will Deacon
3962446922 arm64: preempt: Provide our own implementation of asm/preempt.h
The asm-generic/preempt.h implementation doesn't make use of the
PREEMPT_NEED_RESCHED flag, since this can interact badly with load/store
architectures which rely on the preempt_count word being unchanged across
an interrupt.

However, since we're a 64-bit architecture and the preempt count is
only 32 bits wide, we can simply pack it next to the resched flag and
load the whole thing in one go, so that a dec-and-test operation doesn't
need to load twice.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-07 12:35:53 +00:00
Jackie Liu
cc9f8349cb arm64: crypto: add NEON accelerated XOR implementation
This is a NEON acceleration method that can improve
performance by approximately 20%. I got the following
data from the centos 7.5 on Huawei's HISI1616 chip:

[ 93.837726] xor: measuring software checksum speed
[ 93.874039]   8regs  : 7123.200 MB/sec
[ 93.914038]   32regs : 7180.300 MB/sec
[ 93.954043]   arm64_neon: 9856.000 MB/sec
[ 93.954047] xor: using function: arm64_neon (9856.000 MB/sec)

I believe this code can bring some optimization for
all arm64 platform. thanks for Ard Biesheuvel's suggestions.

Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06 16:47:06 +00:00
Jackie Liu
21e28547f6 arm64/neon: add workaround for ambiguous C99 stdint.h types
In a way similar to ARM commit 09096f6a0e ("ARM: 7822/1: add workaround
for ambiguous C99 stdint.h types"), this patch redefines the macros that
are used in stdint.h so its definitions of uint64_t and int64_t are
compatible with those of the kernel.

This patch comes from: https://patchwork.kernel.org/patch/3540001/
Wrote by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

We mark this file as a private file and don't have to override asm/types.h

Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06 16:47:05 +00:00
Will Deacon
bd4fb6d270 arm64: Add support for SB barrier and patch in over DSB; ISB sequences
We currently use a DSB; ISB sequence to inhibit speculation in set_fs().
Whilst this works for current CPUs, future CPUs may implement a new SB
barrier instruction which acts as an architected speculation barrier.

On CPUs that support it, patch in an SB; NOP sequence over the DSB; ISB
sequence and advertise the presence of the new instruction to userspace.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06 16:47:04 +00:00
Suzuki K Poulose
0b587c84e4 arm64: capabilities: Batch cpu_enable callbacks
We use a stop_machine call for each available capability to
enable it on all the CPUs available at boot time. Instead
we could batch the cpu_enable callbacks to a single stop_machine()
call to save us some time.

Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06 15:12:26 +00:00
AKASHI Takahiro
f3b70e5094 arm64: kexec_file: allow for loading Image-format kernel
This patch provides kexec_file_ops for "Image"-format kernel. In this
implementation, a binary is always loaded with a fixed offset identified
in text_offset field of its header.

Regarding signature verification for trusted boot, this patch doesn't
contains CONFIG_KEXEC_VERIFY_SIG support, which is to be added later
in this series, but file-attribute-based verification is still a viable
option by enabling IMA security subsystem.

You can sign(label) a to-be-kexec'ed kernel image on target file system
with:
    $ evmctl ima_sign --key /path/to/private_key.pem Image

On live system, you must have IMA enforced with, at least, the following
security policy:
    "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig"

See more details about IMA here:
    https://sourceforge.net/p/linux-ima/wiki/Home/

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06 14:38:52 +00:00
AKASHI Takahiro
52b2a8af74 arm64: kexec_file: load initrd and device-tree
load_other_segments() is expected to allocate and place all the necessary
memory segments other than kernel, including initrd and device-tree
blob (and elf core header for crash).
While most of the code was borrowed from kexec-tools' counterpart,
users may not be allowed to specify dtb explicitly, instead, the dtb
presented by the original boot loader is reused.

arch_kimage_kernel_post_load_cleanup() is responsible for freeing arm64-
specific data allocated in load_other_segments().

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06 14:38:52 +00:00
AKASHI Takahiro
bdd2c9d1c3 arm64: cpufeature: add MMFR0 helper functions
Those helper functions for MMFR0 register will be used later by kexec_file
loader.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06 14:38:51 +00:00
AKASHI Takahiro
f56063c51f arm64: add image head flag definitions
Those image head's flags will be used later by kexec_file loader.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06 14:38:51 +00:00
Suzuki K Poulose
f58cdf7e3c arm64: capabilities: Merge duplicate Cavium erratum entries
Merge duplicate entries for a single capability using the midr
range list for Cavium errata 30115 and 27456.

Cc: Andrew Pinski <apinski@cavium.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06 11:47:44 +00:00
Suzuki K Poulose
c9460dcb06 arm64: capabilities: Merge entries for ARM64_WORKAROUND_CLEAN_CACHE
We have two entries for ARM64_WORKAROUND_CLEAN_CACHE capability :

1) ARM Errata 826319, 827319, 824069, 819472 on A53 r0p[012]
2) ARM Errata 819472 on A53 r0p[01]

Both have the same work around. Merge these entries to avoid
duplicate entries for a single capability. Add a new Kconfig
entry to control the "capability" entry to make it easier
to handle combinations of the CONFIGs.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06 11:47:44 +00:00
Mark Rutland
5c176aff5b arm64: ftrace: enable graph FP test
The core frace code has an optional sanity check on the frame pointer
passed by ftrace_graph_caller and return_to_handler. This is cheap,
useful, and enabled unconditionally on x86, sparc, and riscv.

Let's do the same on arm64, so that we can catch any problems early.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Torsten Duwe <duwe@suse.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-30 13:29:04 +00:00
Will Deacon
1b57ec8c75 arm64: io: Ensure value passed to __iormb() is held in a 64-bit register
As of commit 6460d32014 ("arm64: io: Ensure calls to delay routines
are ordered against prior readX()"), MMIO reads smaller than 64 bits
fail to compile under clang because we end up mixing 32-bit and 64-bit
register operands for the same data processing instruction:

./include/asm-generic/io.h:695:9: warning: value size does not match register size specified by the constraint and modifier [-Wasm-operand-widths]
        return readb(addr);
               ^
./arch/arm64/include/asm/io.h:147:58: note: expanded from macro 'readb'
                                                                       ^
./include/asm-generic/io.h:695:9: note: use constraint modifier "w"
./arch/arm64/include/asm/io.h:147:50: note: expanded from macro 'readb'
                                                               ^
./arch/arm64/include/asm/io.h:118:24: note: expanded from macro '__iormb'
        asm volatile("eor       %0, %1, %1\n"                           \
                                    ^

Fix the build by casting the macro argument to 'unsigned long' when used
as an input to the inline asm.

Reported-by: Nick Desaulniers <nick.desaulniers@gmail.com>
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-29 16:36:18 +00:00
Will Deacon
3d65b6bbc0 arm64: tlbi: Set MAX_TLBI_OPS to PTRS_PER_PTE
In order to reduce the possibility of soft lock-ups, we bound the
maximum number of TLBI operations performed by a single call to
flush_tlb_range() to an arbitrary constant of 1024.

Whilst this does the job of avoiding lock-ups, we can actually be a bit
smarter by defining this as PTRS_PER_PTE. Due to the structure of our
page tables, using PTRS_PER_PTE means that an outer loop calling
flush_tlb_range() for entire table entries will end up performing just a
single TLBI operation for each entry. As an example, mremap()ing a 1GB
range mapped using 4k pages now requires only 512 TLBI operations when
moving the page tables as opposed to 262144 operations (512*512) when
using the current threshold of 1024.

Cc: Joel Fernandes <joel@joelfernandes.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-27 19:01:21 +00:00
Ard Biesheuvel
bdb85cd1d2 arm64/module: switch to ADRP/ADD sequences for PLT entries
Now that we have switched to the small code model entirely, and
reduced the extended KASLR range to 4 GB, we can be sure that the
targets of relative branches that are out of range are in range
for a ADRP/ADD pair, which is one instruction shorter than our
current MOVN/MOVK/MOVK sequence, and is more idiomatic and so it
is more likely to be implemented efficiently by micro-architectures.

So switch over the ordinary PLT code and the special handling of
the Cortex-A53 ADRP errata, as well as the ftrace trampline
handling.

Reviewed-by: Torsten Duwe <duwe@lst.de>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: Added a couple of comments in the plt equality check]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-27 19:00:45 +00:00
Ard Biesheuvel
7aaf7b2fd2 arm64/insn: add support for emitting ADR/ADRP instructions
Add support for emitting ADR and ADRP instructions so we can switch
over our PLT generation code in a subsequent patch.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-27 18:47:33 +00:00
Jeremy Linton
9eb1c92b47 arm64: acpi: Prepare for longer MADTs
The BAD_MADT_GICC_ENTRY check is a little too strict because
it rejects MADT entries that don't match the currently known
lengths. We should remove this restriction to avoid problems
if the table length changes. Future code which might depend on
additional fields should be written to validate those fields
before using them, rather than trying to globally check
known MADT version lengths.

Link: https://lkml.kernel.org/r/20181012192937.3819951-1-jeremy.linton@arm.com
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
[lorenzo.pieralisi@arm.com: added MADT macro comments]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Al Stone <ahs3@redhat.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-27 18:00:14 +00:00
Will Deacon
6460d32014 arm64: io: Ensure calls to delay routines are ordered against prior readX()
A relatively standard idiom for ensuring that a pair of MMIO writes to a
device arrive at that device with a specified minimum delay between them
is as follows:

	writel_relaxed(42, dev_base + CTL1);
	readl(dev_base + CTL1);
	udelay(10);
	writel_relaxed(42, dev_base + CTL2);

the intention being that the read-back from the device will push the
prior write to CTL1, and the udelay will hold up the write to CTL1 until
at least 10us have elapsed.

Unfortunately, on arm64 where the underlying delay loop is implemented
as a read of the architected counter, the CPU does not guarantee
ordering from the readl() to the delay loop and therefore the delay loop
could in theory be speculated and not provide the desired interval
between the two writes.

Fix this in a similar manner to PowerPC by introducing a dummy control
dependency on the output of readX() which, combined with the ISB in the
read of the architected counter, guarantees that a subsequent delay loop
can not be executed until the readX() has returned its result.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-27 12:18:07 +00:00
Alex Van Brunt
3403e56b41 arm64: mm: Don't wait for completion of TLB invalidation when page aging
When transitioning a PTE from young to old as part of page aging, we
can avoid waiting for the TLB invalidation to complete and therefore
drop the subsequent DSB instruction. Whilst this opens up a race with
page reclaim, where a PTE in active use via a stale, young TLB entry
does not update the underlying descriptor, the worst thing that happens
is that the page is reclaimed and then immediately faulted back in.

Given that we have a DSB in our context-switch path, the window for a
spurious reclaim is fairly limited and eliding the barrier claims to
boost NVMe/SSD accesses by over 10% on some platforms.

A similar optimisation was made for x86 in commit b13b1d2d86 ("x86/mm:
In the PTE swapout page reclaim case clear the accessed bit instead of
flushing the TLB").

Signed-off-by: Alex Van Brunt <avanbrunt@nvidia.com>
Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
[will: rewrote patch]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-26 16:59:46 +00:00
Will Deacon
4b47e573a4 arm64: perf: Move event definitions into perf_event.h
The PMU event numbers are split between perf_event.h and perf_event.c,
which makes it difficult to spot any gaps in the numbers which may be
allocated in the future.

This patch sorts the events numerically, adds some missing events and
moves the definitions into perf_event.h.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-21 13:16:34 +00:00
Jessica Yu
c8ebf64eab arm64/module: use plt section indices for relocations
Instead of saving a pointer to the .plt and .init.plt sections to apply
plt-based relocations, save and use their section indices instead.

The mod->arch.{core,init}.plt pointers were problematic for livepatch
because they pointed within temporary section headers (provided by the
module loader via info->sechdrs) that would be freed after module load.
Since livepatch modules may need to apply relocations post-module-load
(for example, to patch a module that is loaded later), using section
indices to offset into the section headers (instead of accessing them
through a saved pointer) allows livepatch modules on arm64 to pass in
their own copy of the section headers to apply_relocate_add() to apply
delayed relocations.

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-20 11:38:26 +00:00
Ard Biesheuvel
c55191e96c arm64: mm: apply r/o permissions of VM areas to its linear alias as well
On arm64, we use block mappings and contiguous hints to map the linear
region, to minimize the TLB footprint. However, this means that the
entire region is mapped using read/write permissions, which we cannot
modify at page granularity without having to take intrusive measures to
prevent TLB conflicts.

This means the linear aliases of pages belonging to read-only mappings
(executable or otherwise) in the vmalloc region are also mapped read/write,
and could potentially be abused to modify things like module code, bpf JIT
code or other read-only data.

So let's fix this, by extending the set_memory_ro/rw routines to take
the linear alias into account. The consequence of enabling this is
that we can no longer use block mappings or contiguous hints, so in
cases where the TLB footprint of the linear region is a bottleneck,
performance may be affected.

Therefore, allow this feature to be runtime en/disabled, by setting
rodata=full (or 'on' to disable just this enhancement, or 'off' to
disable read-only mappings for code and r/o data entirely) on the
kernel command line. Also, allow the default value to be set via a
Kconfig option.

Tested-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-20 11:38:26 +00:00
Ard Biesheuvel
26a4676faa arm64: mm: define NET_IP_ALIGN to 0
On arm64, there is no need to add 2 bytes of padding to the start of
each network buffer just to make the IP header appear 32-bit aligned.

Since this might actually adversely affect DMA performance some
platforms, let's override NET_IP_ALIGN to 0 to get rid of this
padding.

Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Tested-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-11-08 17:50:26 +00:00