Commit Graph

3226 Commits

Author SHA1 Message Date
Mark Brown
38d1666760 arm64: Clarify when cpu_enable() is called
Strengthen the wording in the documentation for cpu_enable() to make it
more obvious to readers not already familiar with the code when the core
will call this callback and that this is intentional.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[will: minor tweak to emphasis in the comment]
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-14 17:12:53 +01:00
Mark Rutland
77ad4ce693 arm64: memory: rename VA_START to PAGE_END
Prior to commit:

  14c127c957 ("arm64: mm: Flip kernel VA space")

... VA_START described the start of the TTBR1 address space for a given
VA size described by VA_BITS, where all kernel mappings began.

Since that commit, VA_START described a portion midway through the
address space, where the linear map ends and other kernel mappings
begin.

To avoid confusion, let's rename VA_START to PAGE_END, making it clear
that it's not the start of the TTBR1 address space and implying that
it's related to PAGE_OFFSET. Comments and other mnemonics are updated
accordingly, along with a typo fix in the decription of VMEMMAP_SIZE.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-14 17:06:58 +01:00
Will Deacon
d0b3c32ed9 arm64: memory: Cosmetic cleanups
Cleanup memory.h so that the indentation is consistent, remove pointless
line-wrapping and use consistent parameter names for different versions
of the same macro.

Reviewed-by: Steve Capper <steve.capper@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-14 13:09:45 +01:00
Will Deacon
68933aa973 arm64: memory: Add comments to end of non-trivial #ifdef blocks
Commenting the #endif of a multi-statement #ifdef block with the
condition which guards it is useful and can save having to scroll back
through the file to figure out which set of Kconfig options apply to
a particular piece of code.

Reviewed-by: Steve Capper <steve.capper@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-14 13:09:24 +01:00
Will Deacon
6bbd497f02 arm64: memory: Implement __tag_set() as common function
There's no need for __tag_set() to be a complicated macro when
CONFIG_KASAN_SW_TAGS=y and a simple static inline otherwise. Rewrite
the thing as a common static inline function.

Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-14 13:08:44 +01:00
Will Deacon
a5ac40f53b arm64: memory: Simplify _VA_START and _PAGE_OFFSET definitions
Rather than subtracting from -1 and then adding 1, we can simply
subtract from 0.

Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-14 13:08:40 +01:00
Will Deacon
9ba33dcc6b arm64: memory: Simplify virt_to_page() implementation
Build virt_to_page() on top of virt_to_pfn() so we can avoid the need
for explicit shifting.

Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-14 13:07:49 +01:00
Will Deacon
96628f0fb1 arm64: memory: Rewrite default page_to_virt()/virt_to_page()
The default implementations of page_to_virt() and virt_to_page() are
fairly confusing to read and the former evaluates its 'page' parameter
twice in the macro

Rewrite them so that the computation is expressed as 'base + index' in
both cases and the parameter is always evaluated exactly once.

Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-14 13:05:35 +01:00
Will Deacon
577c2b3528 arm64: memory: Ensure address tag is masked in conversion macros
When converting a linear virtual address to a physical address, pfn or
struct page *, we must make sure that the tag bits are masked before the
calculation otherwise we end up with corrupt pointers when running with
CONFIG_KASAN_SW_TAGS=y:

  | Unable to handle kernel paging request at virtual address 0037fe0007580d08
  | [0037fe0007580d08] address between user and kernel address ranges

Mask out the tag in __virt_to_phys_nodebug() and virt_to_page().

Reported-by: Qian Cai <cai@lca.pw>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 9cb1c5ddd2 ("arm64: mm: Remove bit-masking optimisations for PAGE_OFFSET and VMEMMAP_START")
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-14 13:04:46 +01:00
Will Deacon
68dd8ef321 arm64: memory: Fix virt_addr_valid() using __is_lm_address()
virt_addr_valid() is intended to test whether or not the passed address
is a valid linear map address. Unfortunately, it relies on
_virt_addr_is_linear() which is broken because it assumes the linear
map is at the top of the address space, which it no longer is.

Reimplement virt_addr_valid() using __is_lm_address() and remove
_virt_addr_is_linear() entirely. At the same time, ensure we evaluate
the macro parameter only once and move it within the __ASSEMBLY__ block.

Reported-by: Qian Cai <cai@lca.pw>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 14c127c957 ("arm64: mm: Flip kernel VA space")
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-14 13:00:57 +01:00
Will Deacon
d06fa5a118 Merge tag 'common/for-v5.4-rc1/cpu-topology' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux into for-next/cpu-topology
Pull in generic CPU topology changes from Paul Walmsley (RISC-V).

* tag 'common/for-v5.4-rc1/cpu-topology' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  MAINTAINERS: Add an entry for generic architecture topology
  base: arch_topology: update Kconfig help description
  RISC-V: Parse cpu topology during boot.
  arm: Use common cpu_topology structure and functions.
  cpu-topology: Move cpu topology code to common code.
  dt-binding: cpu-topology: Move cpu-map to a common binding.
  Documentation: DT: arm: add support for sockets defining package boundaries
2019-08-14 10:07:00 +01:00
Nick Desaulniers
80d8381226 arm64: prefer __section from compiler_attributes.h
GCC unescapes escaped string section names while Clang does not. Because
__section uses the `#` stringification operator for the section name, it
doesn't need to be escaped.

This antipattern was found with:
$ grep -e __section\(\" -e __section__\(\" -r

Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-13 18:32:15 +01:00
Linus Torvalds
7f20fd2337 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "Bugfixes (arm and x86) and cleanups"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  selftests: kvm: Adding config fragments
  KVM: selftests: Update gitignore file for latest changes
  kvm: remove unnecessary PageReserved check
  KVM: arm/arm64: vgic: Reevaluate level sensitive interrupts on enable
  KVM: arm: Don't write junk to CP15 registers on reset
  KVM: arm64: Don't write junk to sysregs on reset
  KVM: arm/arm64: Sync ICH_VMCR_EL2 back when about to block
  x86: kvm: remove useless calls to kvm_para_available
  KVM: no need to check return value of debugfs_create functions
  KVM: remove kvm_arch_has_vcpu_debugfs()
  KVM: Fix leak vCPU's VMCS value into other pCPU
  KVM: Check preempted_in_kernel for involuntary preemption
  KVM: LAPIC: Don't need to wakeup vCPU twice afer timer fire
  arm64: KVM: hyp: debug-sr: Mark expected switch fall-through
  KVM: arm64: Update kvm_arm_exception_class and esr_class_str for new EC
  KVM: arm: vgic-v3: Mark expected switch fall-through
  arm64: KVM: regmap: Fix unexpected switch fall-through
  KVM: arm/arm64: Introduce kvm_pmu_vcpu_init() to setup PMU counter index
2019-08-09 15:46:29 -07:00
Will Deacon
9c1cac424c arm64: mm: Really fix sparse warning in untagged_addr()
untagged_addr() can be called with a '__user' pointer parameter and must
therefore use '__force' casts both when passing this parameter through
to sign_extend64() as a 'u64', but also when casting the 's64' return
value back to the '__user' pointer type.

Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 15:39:37 +01:00
Will Deacon
d2d73d2fef arm64: mm: Simplify definition of virt_addr_valid()
_virt_addr_valid() is defined as the same value in two places and rolls
its own version of virt_to_pfn() in both cases.

Consolidate these definitions by inlining a simplified version directly
into virt_addr_valid().

Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 15:39:11 +01:00
Steve Capper
2c624fe687 arm64: mm: Remove vabits_user
Previous patches have enabled 52-bit kernel + user VAs and there is no
longer any scenario where user VA != kernel VA size.

This patch removes the, now redundant, vabits_user variable and replaces
usage with vabits_actual where appropriate.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:17:27 +01:00
Steve Capper
b6d00d47e8 arm64: mm: Introduce 52-bit Kernel VAs
Most of the machinery is now in place to enable 52-bit kernel VAs that
are detectable at boot time.

This patch adds a Kconfig option for 52-bit user and kernel addresses
and plumbs in the requisite CONFIG_ macros as well as sets TCR.T1SZ,
physvirt_offset and vmemmap at early boot.

To simplify things this patch also removes the 52-bit user/48-bit kernel
kconfig option.

Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:17:26 +01:00
Steve Capper
ce3aaed873 arm64: mm: Modify calculation of VMEMMAP_SIZE
In a later patch we will need to have a slightly larger VMEMMAP region
to accommodate boot time selection between 48/52-bit kernel VAs.

This patch modifies the formula for computing VMEMMAP_SIZE to depend
explicitly on the PAGE_OFFSET and start of kernel addressable memory.
(This allows for a slightly larger direct linear map in future).

Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:17:26 +01:00
Steve Capper
c8b6d2ccf9 arm64: mm: Separate out vmemmap
vmemmap is a preprocessor definition that depends on a variable,
memstart_addr. In a later patch we will need to expand the size of
the VMEMMAP region and optionally modify vmemmap depending upon
whether or not hardware support is available for 52-bit virtual
addresses.

This patch changes vmemmap to be a variable. As the old definition
depended on a variable load, this should not affect performance
noticeably.

Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:17:25 +01:00
Steve Capper
c812026c54 arm64: mm: Logic to make offset_ttbr1 conditional
When running with a 52-bit userspace VA and a 48-bit kernel VA we offset
ttbr1_el1 to allow the kernel pagetables with a 52-bit PTRS_PER_PGD to
be used for both userspace and kernel.

Moving on to a 52-bit kernel VA we no longer require this offset to
ttbr1_el1 should we be running on a system with HW support for 52-bit
VAs.

This patch introduces conditional logic to offset_ttbr1 to query
SYS_ID_AA64MMFR2_EL1 whenever 52-bit VAs are selected. If there is HW
support for 52-bit VAs then the ttbr1 offset is skipped.

We choose to read a system register rather than vabits_actual because
offset_ttbr1 can be called in places where the kernel data is not
actually mapped.

Calls to offset_ttbr1 appear to be made from rarely called code paths so
this extra logic is not expected to adversely affect performance.

Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:17:24 +01:00
Steve Capper
5383cc6efe arm64: mm: Introduce vabits_actual
In order to support 52-bit kernel addresses detectable at boot time, one
needs to know the actual VA_BITS detected. A new variable vabits_actual
is introduced in this commit and employed for the KVM hypervisor layout,
KASAN, fault handling and phys-to/from-virt translation where there
would normally be compile time constants.

In order to maintain performance in phys_to_virt, another variable
physvirt_offset is introduced.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:17:21 +01:00
Steve Capper
90ec95cda9 arm64: mm: Introduce VA_BITS_MIN
In order to support 52-bit kernel addresses detectable at boot time, the
kernel needs to know the most conservative VA_BITS possible should it
need to fall back to this quantity due to lack of hardware support.

A new compile time constant VA_BITS_MIN is introduced in this patch and
it is employed in the KASAN end address, KASLR, and EFI stub.

For Arm, if 52-bit VA support is unavailable the fallback is to 48-bits.

In other words: VA_BITS_MIN = min (48, VA_BITS)

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:17:16 +01:00
Steve Capper
6bd1d0be0e arm64: kasan: Switch to using KASAN_SHADOW_OFFSET
KASAN_SHADOW_OFFSET is a constant that is supplied to gcc as a command
line argument and affects the codegen of the inline address sanetiser.

Essentially, for an example memory access:
    *ptr1 = val;
The compiler will insert logic similar to the below:
    shadowValue = *(ptr1 >> KASAN_SHADOW_SCALE_SHIFT + KASAN_SHADOW_OFFSET)
    if (somethingWrong(shadowValue))
        flagAnError();

This code sequence is inserted into many places, thus
KASAN_SHADOW_OFFSET is essentially baked into many places in the kernel
text.

If we want to run a single kernel binary with multiple address spaces,
then we need to do this with KASAN_SHADOW_OFFSET fixed.

Thankfully, due to the way the KASAN_SHADOW_OFFSET is used to provide
shadow addresses we know that the end of the shadow region is constant
w.r.t. VA space size:
    KASAN_SHADOW_END = ~0 >> KASAN_SHADOW_SCALE_SHIFT + KASAN_SHADOW_OFFSET

This means that if we increase the size of the VA space, the start of
the KASAN region expands into lower addresses whilst the end of the
KASAN region is fixed.

Currently the arm64 code computes KASAN_SHADOW_OFFSET at build time via
build scripts with the VA size used as a parameter. (There are build
time checks in the C code too to ensure that expected values are being
derived). It is sufficient, and indeed is a simplification, to remove
the build scripts (and build time checks) entirely and instead provide
KASAN_SHADOW_OFFSET values.

This patch removes the logic to compute the KASAN_SHADOW_OFFSET in the
arm64 Makefile, and instead we adopt the approach used by x86 to supply
offset values in kConfig. To help debug/develop future VA space changes,
the Makefile logic has been preserved in a script file in the arm64
Documentation folder.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:17:11 +01:00
Steve Capper
14c127c957 arm64: mm: Flip kernel VA space
In order to allow for a KASAN shadow that changes size at boot time, one
must fix the KASAN_SHADOW_END for both 48 & 52-bit VAs and "grow" the
start address. Also, it is highly desirable to maintain the same
function addresses in the kernel .text between VA sizes. Both of these
requirements necessitate us to flip the kernel address space halves s.t.
the direct linear map occupies the lower addresses.

This patch puts the direct linear map in the lower addresses of the
kernel VA range and everything else in the higher ranges.

We need to adjust:
 *) KASAN shadow region placement logic,
 *) KASAN_SHADOW_OFFSET computation logic,
 *) virt_to_phys, phys_to_virt checks,
 *) page table dumper.

These are all small changes, that need to take place atomically, so they
are bundled into this commit.

As part of the re-arrangement, a guard region of 2MB (to preserve
alignment for fixed map) is added after the vmemmap. Otherwise the
vmemmap could intersect with IS_ERR pointers.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:16:51 +01:00
Steve Capper
9cb1c5ddd2 arm64: mm: Remove bit-masking optimisations for PAGE_OFFSET and VMEMMAP_START
Currently there are assumptions about the alignment of VMEMMAP_START
and PAGE_OFFSET that won't be valid after this series is applied.

These assumptions are in the form of bitwise operators being used
instead of addition and subtraction when calculating addresses.

This patch replaces these bitwise operators with addition/subtraction.

Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:15:46 +01:00
Jia He
30e235389f arm64: mm: add missing PTE_SPECIAL in pte_mkdevmap on arm64
Without this patch, the MAP_SYNC test case will cause a print_bad_pte
warning on arm64 as follows:

[   25.542693] BUG: Bad page map in process mapdax333 pte:2e8000448800f53 pmd:41ff5f003
[   25.546360] page:ffff7e0010220000 refcount:1 mapcount:-1 mapping:ffff8003e29c7440 index:0x0
[   25.550281] ext4_dax_aops
[   25.550282] name:"__aaabbbcccddd__"
[   25.551553] flags: 0x3ffff0000001002(referenced|reserved)
[   25.555802] raw: 03ffff0000001002 ffff8003dfffa908 0000000000000000 ffff8003e29c7440
[   25.559446] raw: 0000000000000000 0000000000000000 00000001fffffffe 0000000000000000
[   25.563075] page dumped because: bad pte
[   25.564938] addr:0000ffffbe05b000 vm_flags:208000fb anon_vma:0000000000000000 mapping:ffff8003e29c7440 index:0
[   25.574272] file:__aaabbbcccddd__ fault:ext4_dax_fault mmmmap:ext4_file_mmap readpage:0x0
[   25.578799] CPU: 1 PID: 1180 Comm: mapdax333 Not tainted 5.2.0+ #21
[   25.581702] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[   25.585624] Call trace:
[   25.587008]  dump_backtrace+0x0/0x178
[   25.588799]  show_stack+0x24/0x30
[   25.590328]  dump_stack+0xa8/0xcc
[   25.591901]  print_bad_pte+0x18c/0x218
[   25.593628]  unmap_page_range+0x778/0xc00
[   25.595506]  unmap_single_vma+0x94/0xe8
[   25.597304]  unmap_vmas+0x90/0x108
[   25.598901]  unmap_region+0xc0/0x128
[   25.600566]  __do_munmap+0x284/0x3f0
[   25.602245]  __vm_munmap+0x78/0xe0
[   25.603820]  __arm64_sys_munmap+0x34/0x48
[   25.605709]  el0_svc_common.constprop.0+0x78/0x168
[   25.607956]  el0_svc_handler+0x34/0x90
[   25.609698]  el0_svc+0x8/0xc
[...]

The root cause is in _vm_normal_page, without the PTE_SPECIAL bit,
the return value will be incorrectly set to pfn_to_page(pfn) instead
of NULL. Besides, this patch also rewrite the pmd_mkdevmap to avoid
setting PTE_SPECIAL for pmd

The MAP_SYNC test case is as follows(Provided by Yibo Cai)
$#include <stdio.h>
$#include <string.h>
$#include <unistd.h>
$#include <sys/file.h>
$#include <sys/mman.h>

$#ifndef MAP_SYNC
$#define MAP_SYNC 0x80000
$#endif

/* mount -o dax /dev/pmem0 /mnt */
$#define F "/mnt/__aaabbbcccddd__"

int main(void)
{
    int fd;
    char buf[4096];
    void *addr;

    if ((fd = open(F, O_CREAT|O_TRUNC|O_RDWR, 0644)) < 0) {
        perror("open1");
        return 1;
    }

    if (write(fd, buf, 4096) != 4096) {
        perror("lseek");
        return 1;
    }

    addr = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_SYNC, fd, 0);
    if (addr == MAP_FAILED) {
        perror("mmap");
        printf("did you mount with '-o dax'?\n");
        return 1;
    }

    memset(addr, 0x55, 4096);

    if (munmap(addr, 4096) == -1) {
        perror("munmap");
        return 1;
    }

    close(fd);

    return 0;
}

Fixes: 73b20c84d4 ("arm64: mm: implement pte_devmap support")
Reported-by: Yibo Cai <Yibo.Cai@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Robin Murphy <Robin.Murphy@arm.com>
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-08-08 18:38:20 +01:00
Qian Cai
b99286b088 arm64/prefetch: fix a -Wtype-limits warning
The commit d5370f7548 ("arm64: prefetch: add alternative pattern for
CPUs without a prefetcher") introduced MIDR_IS_CPU_MODEL_RANGE() to be
used in has_no_hw_prefetch() with rv_min=0 which generates a compilation
warning from GCC,

In file included from ./arch/arm64/include/asm/cache.h:8,
               from ./include/linux/cache.h:6,
               from ./include/linux/printk.h:9,
               from ./include/linux/kernel.h:15,
               from ./include/linux/cpumask.h:10,
               from arch/arm64/kernel/cpufeature.c:11:
arch/arm64/kernel/cpufeature.c: In function 'has_no_hw_prefetch':
./arch/arm64/include/asm/cputype.h:59:26: warning: comparison of
unsigned expression >= 0 is always true [-Wtype-limits]
_model == (model) && rv >= (rv_min) && rv <= (rv_max);  \
                        ^~
arch/arm64/kernel/cpufeature.c:889:9: note: in expansion of macro
'MIDR_IS_CPU_MODEL_RANGE'
return MIDR_IS_CPU_MODEL_RANGE(midr, MIDR_THUNDERX,
       ^~~~~~~~~~~~~~~~~~~~~~~

Fix it by converting MIDR_IS_CPU_MODEL_RANGE to a static inline
function.

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-07 16:20:57 +01:00
Leo Yan
42d038c4fb arm64: Add support for function error injection
Inspired by the commit 7cd01b08d3 ("powerpc: Add support for function
error injection"), this patch supports function error injection for
Arm64.

This patch mainly support two functions: one is regs_set_return_value()
which is used to overwrite the return value; the another function is
override_function_with_return() which is to override the probed
function returning and jump to its caller.

Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-07 13:53:09 +01:00
Catalin Marinas
63f0c60379 arm64: Introduce prctl() options to control the tagged user addresses ABI
It is not desirable to relax the ABI to allow tagged user addresses into
the kernel indiscriminately. This patch introduces a prctl() interface
for enabling or disabling the tagged ABI with a global sysctl control
for preventing applications from enabling the relaxed ABI (meant for
testing user-space prctl() return error checking without reconfiguring
the kernel). The ABI properties are inherited by threads of the same
application and fork()'ed children but cleared on execve(). A Kconfig
option allows the overall disabling of the relaxed ABI.

The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle
MTE-specific settings like imprecise vs precise exceptions.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-06 18:08:45 +01:00
Andrey Konovalov
2b835e24b5 arm64: untag user pointers in access_ok and __uaccess_mask_ptr
This patch is a part of a series that extends kernel ABI to allow to pass
tagged user pointers (with the top byte set to something else other than
0x00) as syscall arguments.

copy_from_user (and a few other similar functions) are used to copy data
from user memory into the kernel memory or vice versa. Since a user can
provided a tagged pointer to one of the syscalls that use copy_from_user,
we need to correctly handle such pointers.

Do this by untagging user pointers in access_ok and in __uaccess_mask_ptr,
before performing access validity checks.

Note, that this patch only temporarily untags the pointers to perform the
checks, but then passes them as is into the kernel internals.

Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
[will: Add __force to casting in untagged_addr() to kill sparse warning]
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-06 18:08:25 +01:00
Geert Uytterhoeven
66cbdf5d0c arm64: Move TIF_* documentation to individual definitions
Some TIF_* flags are documented in the comment block at the top, some
next to their definitions, some in both places.

Move all documentation to the individual definitions for consistency,
and for easy lookup.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-05 12:35:34 +01:00
Will Deacon
22ec71615d arm64: io: Relax implicit barriers in default I/O accessors
The arm64 implementation of the default I/O accessors requires barrier
instructions to satisfy the memory ordering requirements documented in
memory-barriers.txt [1], which are largely derived from the behaviour of
I/O accesses on x86.

Of particular interest are the requirements that a write to a device
must be ordered against prior writes to memory, and a read from a device
must be ordered against subsequent reads from memory. We satisfy these
requirements using various flavours of DSB: the most expensive barrier
we have, since it implies completion of prior accesses. This was deemed
necessary when we first implemented the accessors, since accesses to
different endpoints could propagate independently and therefore the only
way to enforce order is to rely on completion guarantees [2].

Since then, the Armv8 memory model has been retrospectively strengthened
to require "other-multi-copy atomicity", a property that requires memory
accesses from an observer to become visible to all other observers
simultaneously [3]. In other words, propagation of accesses is limited
to transitioning from locally observed to globally observed. It recently
became apparent that this change also has a subtle impact on our I/O
accessors for shared peripherals, allowing us to use the cheaper DMB
instruction instead.

As a concrete example, consider the following:

	memcpy(dma_buffer, data, bufsz);
	writel(DMA_START, dev->ctrl_reg);

A DMB ST instruction between the final write to the DMA buffer and the
write to the control register will ensure that the writes to the DMA
buffer are observed before the write to the control register by all
observers. Put another way, if an observer can see the write to the
control register, it can also see the writes to memory. This has always
been the case and is not sufficient to provide the ordering required by
Linux, since there is no guarantee that the master interface of the
DMA-capable device has observed either of the accesses. However, in an
other-multi-copy atomic world, we can infer two things:

  1. A write arriving at an endpoint shared between multiple CPUs is
     visible to all CPUs

  2. A write that is visible to all CPUs is also visible to all other
     observers in the shareability domain

Pieced together, this allows us to use DMB OSHST for our default I/O
write accessors and DMB OSHLD for our default I/O read accessors (the
outer-shareability is for handling non-cacheable mappings) for shared
devices. Memory-mapped, DMA-capable peripherals that are private to a
CPU (i.e. inaccessible to other CPUs) still require the DSB, however
these are few and far between and typically require special treatment
anyway which is outside of the scope of the portable driver API (e.g.
GIC, page-table walker, SPE profiler).

Note that our mandatory barriers remain as DSBs, since there are cases
where they are used to flush the store buffer of the CPU, e.g. when
publishing page table updates to the SMMU.

[1] https://git.kernel.org/linus/4614bbdee357
[2] https://www.youtube.com/watch?v=i6DayghhA8Q
[3] https://www.cl.cam.ac.uk/~pes20/armv8-mca/

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-08-05 12:35:23 +01:00
Mark Brown
2f8f180b3c arm64: Remove unused cpucap_multi_entry_cap_cpu_enable()
The function cpucap_multi_entry_cap_cpu_enable() is unused, remove it to
avoid any confusion reading the code and potential for bit rot.

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-05 11:06:34 +01:00
Will Deacon
73961dc118 arm64: sysreg: Remove unused and rotting SCTLR_ELx field definitions
Our SCTLR_ELx field definitions are somewhat over-engineered in that
they carefully define masks describing the RES0/RES1 bits and then use
these to construct further masks representing bits to be set/cleared for
the _EL1 and _EL2 registers.

However, most of the resulting definitions aren't actually used by
anybody and have subsequently started to bit-rot when new fields have
been added by the architecture, resulting in fields being part of the
RES0 mask despite being defined and used elsewhere.

Rather than fix up these masks, simply remove the unused parts entirely
so that we can drop the maintenance burden. We can always add things
back if we need them in the future.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-05 11:06:34 +01:00
Will Deacon
332e5281a4 arm64: esr: Add ESR exception class encoding for trapped ERET
The ESR.EC encoding of 0b011010 (0x1a) describes an exception generated
by an ERET, ERETAA or ERETAB instruction as a result of a nested
virtualisation trap to EL2.

Add an encoding for this EC and a string description so that we identify
it correctly if we take one unexpectedly.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-05 11:06:34 +01:00
Mark Rutland
b907b80d7a arm64: remove pointless __KERNEL__ guards
For a number of years, UAPI headers have been split from kernel-internal
headers. The latter are never exposed to userspace, and always built
with __KERNEL__ defined.

Most headers under arch/arm64 don't have __KERNEL__ guards, but there
are a few stragglers lying around. To make things more consistent, and
to set a good example going forward, let's remove these redundant
__KERNEL__ guards.

In a couple of cases, a trailing #endif lacked a comment describing its
corresponding #if or #ifdef, so these are fixes up at the same time.

Guards in auto-generated crypto code are left as-is, as these guards are
generated by scripting imported from the upstream openssl project
scripts. Guards in UAPI headers are left as-is, as these can be included
by userspace or the kernel.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-05 11:06:33 +01:00
Julien Thierry
c87857945b arm64: Remove unused assembly macro
As of commit 4141c857fd ("arm64: convert
raw syscall invocation to C"), moving syscall handling from assembly to
C, the macro mask_nospec64 is no longer referenced.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-05 11:06:33 +01:00
Linus Torvalds
0432a0a066 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull vdso timer fixes from Thomas Gleixner:
 "A series of commits to deal with the regression caused by the generic
  VDSO implementation.

  The usage of clock_gettime64() for 32bit compat fallback syscalls
  caused seccomp filters to kill innocent processes because they only
  allow clock_gettime().

  Handle the compat syscalls with clock_gettime() as before, which is
  not a functional problem for the VDSO as the legacy compat application
  interface is not y2038 safe anyway. It's just extra fallback code
  which needs to be implemented on every architecture.

  It's opt in for now so that it does not break the compile of already
  converted architectures in linux-next. Once these are fixed, the
  #ifdeffery goes away.

  So much for trying to be smart and reuse code..."

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  arm64: compat: vdso: Use legacy syscalls as fallback
  x86/vdso/32: Use 32bit syscall fallback
  lib/vdso/32: Provide legacy syscall fallbacks
  lib/vdso: Move fallback invocation to the callers
  lib/vdso/32: Remove inconsistent NULL pointer checks
2019-08-03 10:51:29 -07:00
Masami Hiramatsu
b3980e4852 arm64: kprobes: Recover pstate.D in single-step exception handler
kprobes manipulates the interrupted PSTATE for single step, and
doesn't restore it. Thus, if we put a kprobe where the pstate.D
(debug) masked, the mask will be cleared after the kprobe hits.

Moreover, in the most complicated case, this can lead a kernel
crash with below message when a nested kprobe hits.

[  152.118921] Unexpected kernel single-step exception at EL1

When the 1st kprobe hits, do_debug_exception() will be called.
At this point, debug exception (= pstate.D) must be masked (=1).
But if another kprobes hits before single-step of the first kprobe
(e.g. inside user pre_handler), it unmask the debug exception
(pstate.D = 0) and return.
Then, when the 1st kprobe setting up single-step, it saves current
DAIF, mask DAIF, enable single-step, and restore DAIF.
However, since "D" flag in DAIF is cleared by the 2nd kprobe, the
single-step exception happens soon after restoring DAIF.

This has been introduced by commit 7419333fa1 ("arm64: kprobe:
Always clear pstate.D in breakpoint exception handler")

To solve this issue, this stores all DAIF bits and restore it
after single stepping.

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Fixes: 7419333fa1 ("arm64: kprobe: Always clear pstate.D in breakpoint exception handler")
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-02 11:55:50 +01:00
Qian Cai
7732d20a16 arm64/mm: fix variable 'tag' set but not used
When CONFIG_KASAN_SW_TAGS=n, set_tag() is compiled away. GCC throws a
warning,

mm/kasan/common.c: In function '__kasan_kmalloc':
mm/kasan/common.c:464:5: warning: variable 'tag' set but not used
[-Wunused-but-set-variable]
  u8 tag = 0xff;
     ^~~

Fix it by making __tag_set() a static inline function the same as
arch_kasan_set_tag() in mm/kasan/kasan.h for consistency because there
is a macro in arch/arm64/include/asm/kasan.h,

 #define arch_kasan_set_tag(addr, tag) __tag_set(addr, tag)

However, when CONFIG_DEBUG_VIRTUAL=n and CONFIG_SPARSEMEM_VMEMMAP=y,
page_to_virt() will call __tag_set() with incorrect type of a
parameter, so fix that as well. Also, still let page_to_virt() return
"void *" instead of "const void *", so will not need to add a similar
cast in lowmem_page_address().

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-01 15:53:10 +01:00
Qian Cai
7d4e2dcf31 arm64/mm: fix variable 'pud' set but not used
GCC throws a warning,

arch/arm64/mm/mmu.c: In function 'pud_free_pmd_page':
arch/arm64/mm/mmu.c:1033:8: warning: variable 'pud' set but not used
[-Wunused-but-set-variable]
  pud_t pud;
        ^~~

because pud_table() is a macro and compiled away. Fix it by making it a
static inline function and for pud_sect() as well.

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-01 15:00:27 +01:00
Julien Thierry
677379bc91 arm64: Lower priority mask for GIC_PRIO_IRQON
On a system with two security states, if SCR_EL3.FIQ is cleared,
non-secure IRQ priorities get shifted to fit the secure view but
priority masks aren't.

On such system, it turns out that GIC_PRIO_IRQON masks the priority of
normal interrupts, which obviously ends up in a hang.

Increase GIC_PRIO_IRQON value (i.e. lower priority) to make sure
interrupts are not blocked by it.

Cc: Oleg Nesterov <oleg@redhat.com>
Fixes: bd82d4bd21 ("arm64: Fix incorrect irqflag restore for priority masking")
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Julien Thierry <julien.thierry.kdev@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[will: fixed Fixes: tag]
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-01 14:59:48 +01:00
Qian Cai
f1d4836201 arm64/efi: fix variable 'si' set but not used
GCC throws out this warning on arm64.

drivers/firmware/efi/libstub/arm-stub.c: In function 'efi_entry':
drivers/firmware/efi/libstub/arm-stub.c:132:22: warning: variable 'si'
set but not used [-Wunused-but-set-variable]

Fix it by making free_screen_info() a static inline function.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-07-31 18:13:46 +01:00
Will Deacon
147b9635e6 arm64: cpufeature: Fix feature comparison for CTR_EL0.{CWG,ERG}
If CTR_EL0.{CWG,ERG} are 0b0000 then they must be interpreted to have
their architecturally maximum values, which defeats the use of
FTR_HIGHER_SAFE when sanitising CPU ID registers on heterogeneous
machines.

Introduce FTR_HIGHER_OR_ZERO_SAFE so that these fields effectively
saturate at zero.

Fixes: 3c739b5710 ("arm64: Keep track of CPU feature registers")
Cc: <stable@vger.kernel.org> # 4.4.x-
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-07-31 18:10:55 +01:00
Thomas Gleixner
33a58980ff arm64: compat: vdso: Use legacy syscalls as fallback
The generic VDSO implementation uses the Y2038 safe clock_gettime64() and
clock_getres_time64() syscalls as fallback for 32bit VDSO. This breaks
seccomp setups because these syscalls might be not (yet) allowed.

Implement the 32bit variants which use the legacy syscalls and select the
variant in the core library.

The 64bit time variants are not removed because they are required for the
time64 based vdso accessors.

Fixes: 00b26474c2 ("lib/vdso: Provide generic VDSO implementation")
Reported-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reported-by: Paul Bolle <pebolle@tiscali.nl>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lkml.kernel.org/r/20190728131648.971361611@linutronix.de
2019-07-31 00:09:10 +02:00
Zenghui Yu
6701c619fa KVM: arm64: Update kvm_arm_exception_class and esr_class_str for new EC
We've added two ESR exception classes for new ARM hardware extensions:
ESR_ELx_EC_PAC and ESR_ELx_EC_SVE, but failed to update the strings
used in tracing and other debug.

Let's update "kvm_arm_exception_class" for these two EC, which the
new EC will be visible to user-space via kvm_exit trace events
Also update to "esr_class_str" for ESR_ELx_EC_PAC, by which we can
get more readable debug info.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2019-07-26 15:40:38 +01:00
Atish Patra
60c1b220d8 cpu-topology: Move cpu topology code to common code.
Both RISC-V & ARM64 are using cpu-map device tree to describe
their cpu topology. It's better to move the relevant code to
a common place instead of duplicate code.

To: Will Deacon <will.deacon@arm.com>
To: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
[Tested on QDF2400]
Tested-by: Jeffrey Hugo <jhugo@codeaurora.org>
[Tested on Juno and other embedded platforms.]
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-07-22 09:36:06 -07:00
Marc Zyngier
cbdf8a189a arm64: Force SSBS on context switch
On a CPU that doesn't support SSBS, PSTATE[12] is RES0.  In a system
where only some of the CPUs implement SSBS, we end-up losing track of
the SSBS bit across task migration.

To address this issue, let's force the SSBS bit on context switch.

Fixes: 8f04e8e6e2 ("arm64: ssbd: Add support for PSTATE.SSBS rather than trapping to EL3")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[will: inverted logic and added comments]
Signed-off-by: Will Deacon <will@kernel.org>
2019-07-22 15:24:16 +01:00
Anshuman Khandual
5a9060e943 arm64: mm: Drop pte_huge()
This helper is required from generic huge_pte_alloc() which is available
when arch subscribes ARCH_WANT_GENERAL_HUGETLB. arm64 implements it's own
huge_pte_alloc() and does not depend on the generic definition. Drop this
helper which is redundant on arm64.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Steve Capper <Steve.Capper@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-07-22 12:06:38 +01:00
Mark Rutland
592700f094 arm64: stacktrace: Better handle corrupted stacks
The arm64 stacktrace code is careful to only dereference frame records
in valid stack ranges, ensuring that a corrupted frame record won't
result in a faulting access.

However, it's still possible for corrupt frame records to result in
infinite loops in the stacktrace code, which is also undesirable.

This patch ensures that we complete a stacktrace in finite time, by
keeping track of which stacks we have already completed unwinding, and
verifying that if the next frame record is on the same stack, it is at a
higher address.

As this has turned out to be particularly subtle, comments are added to
explain the procedure.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>
Acked-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Tengfei Fan <tengfeif@codeaurora.org>
Signed-off-by: Will Deacon <will@kernel.org>
2019-07-22 11:44:15 +01:00