Commit Graph

3303 Commits

Author SHA1 Message Date
Joerg Roedel
56fbacc9bf Merge branches 'arm/renesas', 'arm/qcom', 'arm/mediatek', 'arm/omap', 'arm/exynos', 'arm/smmu', 'ppc/pamu', 'x86/vt-d', 'x86/amd' and 'core' into next 2020-07-29 14:42:00 +02:00
Marc Zyngier
a394cf6e85 Merge branch 'kvm-arm64/misc-5.9' into kvmarm-master/next-WIP
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-28 16:26:16 +01:00
Marc Zyngier
c9dc95005a Merge branch 'kvm-arm64/target-table-no-more' into kvmarm-master/next-WIP
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-28 16:10:32 +01:00
Marc Zyngier
fc279329a8 Merge branch 'kvm-arm64/ptrauth-nvhe' into kvmarm-master/next-WIP
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-28 16:02:21 +01:00
Marc Zyngier
300dca6853 Merge branch 'kvm-arm64/pre-nv-5.9' into kvmarm-master/next-WIP
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-28 15:48:27 +01:00
David Brazdil
a59a2edbbb KVM: arm64: Substitute RANDOMIZE_BASE for HARDEN_EL2_VECTORS
The HARDEN_EL2_VECTORS config maps vectors at a fixed location on cores which
are susceptible to Spector variant 3a (A57, A72) to prevent defeating hyp
layout randomization by leaking the value of VBAR_EL2.

Since this feature is only applicable when EL2 layout randomization is enabled,
unify both behind the same RANDOMIZE_BASE Kconfig. Majority of code remains
conditional on a capability selected for the affected cores.

Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200721094445.82184-3-dbrazdil@google.com
2020-07-28 10:41:11 +01:00
Marc Zyngier
bf4086b1a1 KVM: arm64: Prevent vcpu_has_ptrauth from generating OOL functions
So far, vcpu_has_ptrauth() is implemented in terms of system_supports_*_auth()
calls, which are declared "inline". In some specific conditions (clang
and SCS), the "inline" very much turns into an "out of line", which
leads to a fireworks when this predicate is evaluated on a non-VHE
system (right at the beginning of __hyp_handle_ptrauth).

Instead, make sure vcpu_has_ptrauth gets expanded inline by directly
using the cpus_have_final_cap() helpers, which are __always_inline,
generate much better code, and are the only thing that make sense when
running at EL2 on a nVHE system.

Fixes: 29eb5a3c57 ("KVM: arm64: Handle PtrAuth traps early")
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Link: https://lore.kernel.org/r/20200722162231.3689767-1-maz@kernel.org
2020-07-28 09:03:57 +01:00
David S. Miller
a57066b1a0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The UDP reuseport conflict was a little bit tricky.

The net-next code, via bpf-next, extracted the reuseport handling
into a helper so that the BPF sk lookup code could invoke it.

At the same time, the logic for reuseport handling of unconnected
sockets changed via commit efc6b6f6c3
which changed the logic to carry on the reuseport result into the
rest of the lookup loop if we do not return immediately.

This requires moving the reuseport_has_conns() logic into the callers.

While we are here, get rid of inline directives as they do not belong
in foo.c files.

The other changes were cases of more straightforward overlapping
modifications.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-25 17:49:04 -07:00
Andrei Vagin
3503d56cc7 arm64/vdso: Add time namespace page
Allocate the time namespace page among VVAR pages.  Provide
__arch_get_timens_vdso_data() helper for VDSO code to get the
code-relative position of VVARs on that special page.

If a task belongs to a time namespace then the VVAR page which contains
the system wide VDSO data is replaced with a namespace specific page
which has the same layout as the VVAR page. That page has vdso_data->seq
set to 1 to enforce the slow path and vdso_data->clock_mode set to
VCLOCK_TIMENS to enforce the time namespace handling path.

The extra check in the case that vdso_data->seq is odd, e.g. a concurrent
update of the VDSO data is in progress, is not really affecting regular
tasks which are not part of a time namespace as the task is spin waiting
for the update to finish and vdso_data->seq to become even again.

If a time namespace task hits that code path, it invokes the corresponding
time getter function which retrieves the real VVAR page, reads host time
and then adds the offset for the requested clock which is stored in the
special VVAR page.

The time-namespace page isn't allocated on !CONFIG_TIME_NAMESPACE, but
vma is the same size, which simplifies criu/vdso migration between
different kernel configs.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20200624083321.144975-4-avagin@gmail.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-24 13:15:20 +01:00
Catalin Marinas
a46cec12f4 arm64: Reserve HWCAP2_MTE as (1 << 18)
While MTE is not supported in the upstream kernel yet, add a comment
that HWCAP2_MTE as (1 << 18) is reserved. Glibc makes use of it for the
resolving (ifunc) of the MTE-safe string routines.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-24 11:55:29 +01:00
Vladimir Murzin
493cf9b723 arm64: s/AMEVTYPE/AMEVTYPER
Activity Monitor Event Type Registers are named as AMEVTYPER{0,1}<n>

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20200721091259.102756-1-vladimir.murzin@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-22 13:59:38 +01:00
Valentin Schneider
25980c7a79 arch_topology, sched/core: Cleanup thermal pressure definition
The following commit:

  14533a16c4 ("thermal/cpu-cooling, sched/core: Move the arch_set_thermal_pressure() API to generic scheduler code")

moved the definition of arch_set_thermal_pressure() to sched/core.c, but
kept its declaration in linux/arch_topology.h. When building e.g. an x86
kernel with CONFIG_SCHED_THERMAL_PRESSURE=y, cpufreq_cooling.c ends up
getting the declaration of arch_set_thermal_pressure() from
include/linux/arch_topology.h, which is somewhat awkward.

On top of this, sched/core.c unconditionally defines
o The thermal_pressure percpu variable
o arch_set_thermal_pressure()

while arch_scale_thermal_pressure() does nothing unless redefined by the
architecture.

arch_*() functions are meant to be defined by architectures, so revert the
aforementioned commit and re-implement it in a way that keeps
arch_set_thermal_pressure() architecture-definable, and doesn't define the
thermal pressure percpu variable for kernels that don't need
it (CONFIG_SCHED_THERMAL_PRESSURE=n).

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200712165917.9168-2-valentin.schneider@arm.com
2020-07-22 10:22:05 +02:00
Shaokun Zhang
55fdc1f44c arm64: perf: Expose some new events via sysfs
Some new PMU events can been detected by PMCEID1_EL0, but it can't
be listed, Let's expose these through sysfs.

Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1595328573-12751-2-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-07-21 12:59:42 +01:00
Will Deacon
5f1f7f6c20 arm64: Reduce the number of header files pulled into vmlinux.lds.S
Although vmlinux.lds.S smells like an assembly file and is compiled
with __ASSEMBLY__ defined, it's actually just fed to the preprocessor to
create our linker script. This means that any assembly macros defined
by headers that it includes will result in a helpful link error:

| aarch64-linux-gnu-ld:./arch/arm64/kernel/vmlinux.lds:1: syntax error

In preparation for an arm64-private asm/rwonce.h implementation, which
will end up pulling assembly macros into linux/compiler.h, reduce the
number of headers we include directly and transitively in vmlinux.lds.S

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-07-21 10:50:37 +01:00
Will Deacon
002dff36ac asm/rwonce: Don't pull <asm/barrier.h> into 'asm-generic/rwonce.h'
Now that 'smp_read_barrier_depends()' has gone the way of the Norwegian
Blue, drop the inclusion of <asm/barrier.h> in 'asm-generic/rwonce.h'.

This requires fixups to some architecture vdso headers which were
previously relying on 'asm/barrier.h' coming in via 'linux/compiler.h'.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-07-21 10:50:36 +01:00
Christoph Hellwig
55db9c0e85 net: remove compat_sys_{get,set}sockopt
Now that the ->compat_{get,set}sockopt proto_ops methods are gone
there is no good reason left to keep the compat syscalls separate.

This fixes the odd use of unsigned int for the compat_setsockopt
optlen and the missing sock_use_custom_sol_socket.

It would also easily allow running the eBPF hooks for the compat
syscalls, but such a large change in behavior does not belong into
a consolidation patch like this one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:16:40 -07:00
Linus Torvalds
a570f41989 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into master
Pull arm64 fixes from Will Deacon:
 "A batch of arm64 fixes.

  Although the diffstat is a bit larger than we'd usually have at this
  stage, a decent amount of it is the addition of comments describing
  our syscall tracing behaviour, and also a sweep across all the modular
  arm64 PMU drivers to make them rebust against unloading and unbinding.

  There are a couple of minor things kicking around at the moment (CPU
  errata and module PLTs for very large modules), but I'm not expecting
  any significant changes now for us in 5.8.

   - Fix kernel text addresses for relocatable images booting using EFI
     and with KASLR disabled so that they match the vmlinux ELF binary.

   - Fix unloading and unbinding of PMU driver modules.

   - Fix generic mmiowb() when writeX() is called from preemptible
     context (reported by the riscv folks).

   - Fix ptrace hardware single-step interactions with signal handlers,
     system calls and reverse debugging.

   - Fix reporting of 64-bit x0 register for 32-bit tasks via
     'perf_regs'.

   - Add comments describing syscall entry/exit tracing ABI"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  drivers/perf: Prevent forced unbinding of PMU drivers
  asm-generic/mmiowb: Allow mmiowb_set_pending() when preemptible()
  arm64: Use test_tsk_thread_flag() for checking TIF_SINGLESTEP
  arm64: ptrace: Use NO_SYSCALL instead of -1 in syscall_trace_enter()
  arm64: syscall: Expand the comment about ptrace and syscall(-1)
  arm64: ptrace: Add a comment describing our syscall entry/exit trap ABI
  arm64: compat: Ensure upper 32 bits of x0 are zero on syscall return
  arm64: ptrace: Override SPSR.SS when single-stepping is enabled
  arm64: ptrace: Consistently use pseudo-singlestep exceptions
  drivers/perf: Fix kernel panic when rmmod PMU modules during perf sampling
  efi/libstub/arm64: Retain 2MB kernel Image alignment if !KASLR
2020-07-17 15:27:52 -07:00
Will Deacon
15956689a0 arm64: compat: Ensure upper 32 bits of x0 are zero on syscall return
Although we zero the upper bits of x0 on entry to the kernel from an
AArch32 task, we do not clear them on the exception return path and can
therefore expose 64-bit sign extended syscall return values to userspace
via interfaces such as the 'perf_regs' ABI, which deal exclusively with
64-bit registers.

Explicitly clear the upper 32 bits of x0 on return from a compat system
call.

Cc: <stable@vger.kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Keno Fischer <keno@juliacomputing.com>
Cc: Luis Machado <luis.machado@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16 11:41:31 +01:00
Will Deacon
3a5a4366ce arm64: ptrace: Override SPSR.SS when single-stepping is enabled
Luis reports that, when reverse debugging with GDB, single-step does not
function as expected on arm64:

  | I've noticed, under very specific conditions, that a PTRACE_SINGLESTEP
  | request by GDB won't execute the underlying instruction. As a consequence,
  | the PC doesn't move, but we return a SIGTRAP just like we would for a
  | regular successful PTRACE_SINGLESTEP request.

The underlying problem is that when the CPU register state is restored
as part of a reverse step, the SPSR.SS bit is cleared and so the hardware
single-step state can transition to the "active-pending" state, causing
an unexpected step exception to be taken immediately if a step operation
is attempted.

In hindsight, we probably shouldn't have exposed SPSR.SS in the pstate
accessible by the GPR regset, but it's a bit late for that now. Instead,
simply prevent userspace from configuring the bit to a value which is
inconsistent with the TIF_SINGLESTEP state for the task being traced.

Cc: <stable@vger.kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Keno Fischer <keno@juliacomputing.com>
Link: https://lore.kernel.org/r/1eed6d69-d53d-9657-1fc9-c089be07f98c@linaro.org
Reported-by: Luis Machado <luis.machado@linaro.org>
Tested-by: Luis Machado <luis.machado@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16 11:41:21 +01:00
Will Deacon
ac2081cdc4 arm64: ptrace: Consistently use pseudo-singlestep exceptions
Although the arm64 single-step state machine can be fast-forwarded in
cases where we wish to generate a SIGTRAP without actually executing an
instruction, this has two major limitations outside of simply skipping
an instruction due to emulation.

1. Stepping out of a ptrace signal stop into a signal handler where
   SIGTRAP is blocked. Fast-forwarding the stepping state machine in
   this case will result in a forced SIGTRAP, with the handler reset to
   SIG_DFL.

2. The hardware implicitly fast-forwards the state machine when executing
   an SVC instruction for issuing a system call. This can interact badly
   with subsequent ptrace stops signalled during the execution of the
   system call (e.g. SYSCALL_EXIT or seccomp traps), as they may corrupt
   the stepping state by updating the PSTATE for the tracee.

Resolve both of these issues by injecting a pseudo-singlestep exception
on entry to a signal handler and also on return to userspace following a
system call.

Cc: <stable@vger.kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Tested-by: Luis Machado <luis.machado@linaro.org>
Reported-by: Keno Fischer <keno@juliacomputing.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16 11:41:07 +01:00
Zhenyu Ye
d1d3aa98b1 arm64: tlb: Use the TLBI RANGE feature in arm64
Add __TLBI_VADDR_RANGE macro and rewrite __flush_tlb_range().

When cpu supports TLBI feature, the minimum range granularity is
decided by 'scale', so we can not flush all pages by one instruction
in some cases.

For example, when the pages = 0xe81a, let's start 'scale' from
maximum, and find right 'num' for each 'scale':

1. scale = 3, we can flush no pages because the minimum range is
   2^(5*3 + 1) = 0x10000.
2. scale = 2, the minimum range is 2^(5*2 + 1) = 0x800, we can
   flush 0xe800 pages this time, the num = 0xe800/0x800 - 1 = 0x1c.
   Remaining pages is 0x1a;
3. scale = 1, the minimum range is 2^(5*1 + 1) = 0x40, no page
   can be flushed.
4. scale = 0, we flush the remaining 0x1a pages, the num =
   0x1a/0x2 - 1 = 0xd.

However, in most scenarios, the pages = 1 when flush_tlb_range() is
called. Start from scale = 3 or other proper value (such as scale =
ilog2(pages)), will incur extra overhead.
So increase 'scale' from 0 to maximum, the flush order is exactly
opposite to the example.

Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com>
Link: https://lore.kernel.org/r/20200715071945.897-4-yezhenyu2@huawei.com
[catalin.marinas@arm.com: removed unnecessary masks in __TLBI_VADDR_RANGE]
[catalin.marinas@arm.com: __TLB_RANGE_NUM subtracts 1]
[catalin.marinas@arm.com: minor adjustments to the comments]
[catalin.marinas@arm.com: introduce system_supports_tlb_range()]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-15 17:07:19 +01:00
Zhenyu Ye
b620ba5454 arm64: tlb: Detect the ARMv8.4 TLBI RANGE feature
ARMv8.4-TLBI provides TLBI invalidation instruction that apply to a
range of input addresses. This patch detect this feature.

Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com>
Link: https://lore.kernel.org/r/20200715071945.897-2-yezhenyu2@huawei.com
[catalin.marinas@arm.com: some renaming for consistency]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-15 15:57:30 +01:00
Anshuman Khandual
abb7962adc arm64/hugetlb: Reserve CMA areas for gigantic pages on 16K and 64K configs
Currently 'hugetlb_cma=' command line argument does not create CMA area on
ARM64_16K_PAGES and ARM64_64K_PAGES based platforms. Instead, it just ends
up with the following warning message. Reason being, hugetlb_cma_reserve()
never gets called for these huge page sizes.

[   64.255669] hugetlb_cma: the option isn't supported by current arch

This enables CMA areas reservation on ARM64_16K_PAGES and ARM64_64K_PAGES
configs by defining an unified arm64_hugetlb_cma_reseve() that is wrapped
in CONFIG_CMA. Call site for arm64_hugetlb_cma_reserve() is also protected
as <asm/hugetlb.h> is conditionally included and hence cannot contain stub
for the inverse config i.e !(CONFIG_HUGETLB_PAGE && CONFIG_CMA).

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Barry Song <song.bao.hua@hisilicon.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/1593578521-24672-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-15 13:38:03 +01:00
Ard Biesheuvel
1583052d11 arm64/acpi: disallow AML memory opregions to access kernel memory
AML uses SystemMemory opregions to allow AML handlers to access MMIO
registers of, e.g., GPIO controllers, or access reserved regions of
memory that are owned by the firmware.

Currently, we also allow AML access to memory that is owned by the
kernel and mapped via the linear region, which does not seem to be
supported by a valid use case, and exposes the kernel's internal
state to AML methods that may be buggy and exploitable.

On arm64, ACPI support requires booting in EFI mode, and so we can cross
reference the requested region against the EFI memory map, rather than
just do a minimal check on the first page. So let's only permit regions
to be remapped by the ACPI core if
- they don't appear in the EFI memory map at all (which is the case for
  most MMIO), or
- they are covered by a single region in the EFI memory map, which is not
  of a type that describes memory that is given to the kernel at boot.

Reported-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Link: https://lore.kernel.org/r/20200626155832.2323789-2-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-14 18:02:03 +01:00
Linus Torvalds
f4c8824cbc Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
 "An unfortunately large collection of arm64 fixes for -rc5.

  Some of this is absolutely trivial, but the alternatives, vDSO and CPU
  errata workaround fixes are significant. At least people are finding
  and fixing these things, I suppose.

   - Fix workaround for CPU erratum #1418040 to disable the compat vDSO

   - Fix Oops when single-stepping with KGDB

   - Fix memory attributes for hypervisor device mappings at EL2

   - Fix memory leak in PSCI and remove useless variable assignment

   - Fix up some comments and asm labels in our entry code

   - Fix broken register table formatting in our generated html docs

   - Fix missing NULL sentinel in CPU errata workaround list

   - Fix patching of branches in alternative instruction sections"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64/alternatives: don't patch up internal branches
  arm64: Add missing sentinel to erratum_1463225
  arm64: Documentation: Fix broken table in generated HTML
  arm64: kgdb: Fix single-step exception handling oops
  arm64: entry: Tidy up block comments and label numbers
  arm64: Rework ARM_ERRATUM_1414080 handling
  arm64: arch_timer: Disable the compat vdso for cores affected by ARM64_WORKAROUND_1418040
  arm64: arch_timer: Allow an workaround descriptor to disable compat vdso
  arm64: Introduce a way to disable the 32bit vdso
  arm64: entry: Fix the typo in the comment of el1_dbg()
  drivers/firmware/psci: Assign @err directly in hotplug_tests()
  drivers/firmware/psci: Fix memory leakage in alloc_init_cpu_groups()
  KVM: arm64: Fix definition of PAGE_HYP_DEVICE
2020-07-10 08:42:17 -07:00
Zhenyu Ye
61c11656b6 arm64: tlb: don't set the ttl value in flush_tlb_page_nosync
flush_tlb_page_nosync() may be called from pmd level, so we
can not set the ttl = 3 here.

The callstack is as follows:

	pmdp_set_access_flags
		ptep_set_access_flags
			flush_tlb_fix_spurious_fault
				flush_tlb_page
					flush_tlb_page_nosync

Fixes: e735b98a5f ("arm64: Add tlbi_user_level TLB invalidation helper")
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com>
Link: https://lore.kernel.org/r/20200710094158.468-1-yezhenyu2@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-10 16:27:49 +01:00
Tianjia Zhang
74cc7e0c35 KVM: arm64: clean up redundant 'kvm_run' parameters
In the current kvm version, 'kvm_run' has been included in the 'kvm_vcpu'
structure. For historical reasons, many kvm-related function parameters
retain the 'kvm_run' and 'kvm_vcpu' parameters at the same time. This
patch does a unified cleanup of these remaining redundant parameters.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200623131418.31473-3-tianjia.zhang@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 04:26:40 -04:00
Sean Christopherson
c1a33aebe9 KVM: arm64: Use common KVM implementation of MMU memory caches
Move to the common MMU memory cache implementation now that the common
code and arm64's existing code are semantically compatible.

No functional change intended.

Cc: Marc Zyngier <maz@kernel.org>
Suggested-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-19-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:43 -04:00
Sean Christopherson
e539451b7e KVM: arm64: Use common code's approach for __GFP_ZERO with memory caches
Add a "gfp_zero" member to arm64's 'struct kvm_mmu_memory_cache' to make
the struct and its usage compatible with the common 'struct
kvm_mmu_memory_cache' in linux/kvm_host.h.  This will minimize code
churn when arm64 moves to the common implementation in a future patch, at
the cost of temporarily having somewhat silly code.

Note, GFP_PGTABLE_USER is equivalent to GFP_KERNEL_ACCOUNT | GFP_ZERO:

  #define GFP_PGTABLE_USER  (GFP_PGTABLE_KERNEL | __GFP_ACCOUNT)
  |
  -> #define GFP_PGTABLE_KERNEL        (GFP_KERNEL | __GFP_ZERO)

  == GFP_KERNEL | __GFP_ACCOUNT | __GFP_ZERO

versus

  #define GFP_KERNEL_ACCOUNT (GFP_KERNEL | __GFP_ACCOUNT)

    with __GFP_ZERO explicitly OR'd in

  == GFP_KERNEL | __GFP_ACCOUNT | __GFP_ZERO

No functional change intended.

Tested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-18-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:43 -04:00
Sean Christopherson
2aa9c199cf KVM: Move x86's version of struct kvm_mmu_memory_cache to common code
Move x86's 'struct kvm_mmu_memory_cache' to common code in anticipation
of moving the entire x86 implementation code to common KVM and reusing
it for arm64 and MIPS.  Add a new architecture specific asm/kvm_types.h
to control the existence and parameters of the struct.  The new header
is needed to avoid a chicken-and-egg problem with asm/kvm_host.h as all
architectures define instances of the struct in their vCPU structs.

Add an asm-generic version of kvm_types.h to avoid having empty files on
PPC and s390 in the long term, and for arm64 and mips in the short term.

Suggested-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-15-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:42 -04:00
Marc Zyngier
c1fbec4ac0 arm64: arch_timer: Allow an workaround descriptor to disable compat vdso
As we are about to disable the vdso for compat tasks in some circumstances,
let's allow a workaround descriptor to express exactly that.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200706163802.1836732-3-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08 21:57:51 +01:00
Marc Zyngier
97884ca8c2 arm64: Introduce a way to disable the 32bit vdso
We have a class of errata (grouped under the ARM64_WORKAROUND_1418040
banner) that force the trapping of counter access from 32bit EL0.

We would normally disable the whole vdso for such defect, except that
it would disable it for 64bit userspace as well, which is a shame.

Instead, add a new vdso_clock_mode, which signals that the vdso
isn't usable for compat tasks.  This gets checked in the new
vdso_clocksource_ok() helper, now provided for the 32bit vdso.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200706163802.1836732-2-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08 21:57:51 +01:00
Will Deacon
68cf617309 KVM: arm64: Fix definition of PAGE_HYP_DEVICE
PAGE_HYP_DEVICE is intended to encode attribute bits for an EL2 stage-1
pte mapping a device. Unfortunately, it includes PROT_DEVICE_nGnRE which
encodes attributes for EL1 stage-1 mappings such as UXN and nG, which are
RES0 for EL2, and DBM which is meaningless as TCR_EL2.HD is not set.

Fix the definition of PAGE_HYP_DEVICE so that it doesn't set RES0 bits
at EL2.

Acked-by: Marc Zyngier <maz@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200708162546.26176-1-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08 21:35:48 +01:00
Marc Zyngier
6de7dd31de KVM: arm64: Don't use has_vhe() for CHOOSE_HYP_SYM()
The recently introduced CHOOSE_HYP_SYM() macro picks one symbol
or another, depending on whether the kernel run as a VHE
hypervisor or not. For that, it uses the has_vhe() helper, which
is itself implemented as a final capability.

Unfortunately, __copy_hyp_vect_bpi now indirectly uses CHOOSE_HYP_SYM
to get the __bp_harden_hyp_vecs symbol, using has_vhe() in the process.
At this stage, the capability isn't final and things explode:

[    0.000000] ACPI: SRAT not present
[    0.000000] percpu: Embedded 34 pages/cpu s101264 r8192 d29808 u139264
[    0.000000] Detected PIPT I-cache on CPU0
[    0.000000] ------------[ cut here ]------------
[    0.000000] kernel BUG at arch/arm64/include/asm/cpufeature.h:459!
[    0.000000] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.8.0-rc4-00080-gd630681366e5 #1388
[    0.000000] pstate: 80000085 (Nzcv daIf -PAN -UAO BTYPE=--)
[    0.000000] pc : check_branch_predictor+0x3a4/0x408
[    0.000000] lr : check_branch_predictor+0x2a4/0x408
[    0.000000] sp : ffff800011693e90
[    0.000000] x29: ffff800011693e90 x28: ffff8000116a1530
[    0.000000] x27: ffff8000112c1008 x26: ffff800010ca6ff8
[    0.000000] x25: ffff8000112c1000 x24: ffff8000116a1320
[    0.000000] x23: 0000000000000000 x22: ffff8000112c1000
[    0.000000] x21: ffff800010177120 x20: ffff8000116ae108
[    0.000000] x19: 0000000000000000 x18: ffff800011965c90
[    0.000000] x17: 0000000000022000 x16: 0000000000000003
[    0.000000] x15: 00000000ffffffff x14: ffff8000118c3a38
[    0.000000] x13: 0000000000000021 x12: 0000000000000022
[    0.000000] x11: d37a6f4de9bd37a7 x10: 000000000000001d
[    0.000000] x9 : 0000000000000000 x8 : ffff800011f8dad8
[    0.000000] x7 : ffff800011965ad0 x6 : 0000000000000003
[    0.000000] x5 : 0000000000000000 x4 : 0000000000000000
[    0.000000] x3 : 0000000000000100 x2 : 0000000000000004
[    0.000000] x1 : ffff8000116ae148 x0 : 0000000000000000
[    0.000000] Call trace:
[    0.000000]  check_branch_predictor+0x3a4/0x408
[    0.000000]  update_cpu_capabilities+0x84/0x138
[    0.000000]  init_cpu_features+0x2c0/0x2d8
[    0.000000]  cpuinfo_store_boot_cpu+0x54/0x64
[    0.000000]  smp_prepare_boot_cpu+0x2c/0x60
[    0.000000]  start_kernel+0x16c/0x574
[    0.000000] Code: 17ffffc7 91010281 14000198 17ffffca (d4210000)

This is addressed using a two-fold process:
- Replace has_vhe() with is_kernel_in_hyp_mode(), which tests
  whether we are running at EL2.
- Make CHOOSE_HYP_SYM() return an *undefined* symbol when
  compiled in the nVHE hypervisor, as we really should never
  use this helper in the nVHE-specific code.

With this in place, we're back to a bootable kernel again.

Fixes: b877e9849d ("KVM: arm64: Build hyp-entry.S separately for VHE/nVHE")
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07 18:01:22 +01:00
Catalin Marinas
34e36d81a0 arm64: Shift the __tlbi_level() indentation left
This is for consistency with the other __tlbi macros in this file. No
functional change.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-07 11:26:14 +01:00
Zhenyu Ye
a7ac1cfa4c arm64: tlb: Set the TTL field in flush_*_tlb_range
This patch implement flush_{pmd|pud}_tlb_range() in arm64 by
calling __flush_tlb_range() with the corresponding stride and
tlb_level values.

Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com>
Link: https://lore.kernel.org/r/20200625080314.230-7-yezhenyu2@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-07 11:23:47 +01:00
Zhenyu Ye
c4ab2cbc1d arm64: tlb: Set the TTL field in flush_tlb_range
This patch uses the cleared_* in struct mmu_gather to set the
TTL field in flush_tlb_range().

Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200625080314.230-6-yezhenyu2@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-07 11:23:47 +01:00
Zhenyu Ye
e735b98a5f arm64: Add tlbi_user_level TLB invalidation helper
Add a level-hinted parameter to __tlbi_user, which only gets used
if ARMv8.4-TTL gets detected.

ARMv8.4-TTL provides the TTL field in tlbi instruction to indicate
the level of translation table walk holding the leaf entry for the
address that is being invalidated.

This patch set the default level value of flush_tlb_range() to 0,
which will be updated in future patches.  And set the ttl value of
flush_tlb_page_nosync() to 3 because it is only called to flush a
single pte page.

Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com>
Link: https://lore.kernel.org/r/20200625080314.230-4-yezhenyu2@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-07 11:23:46 +01:00
Marc Zyngier
41ce82f63c KVM: arm64: timers: Move timer registers to the sys_regs file
Move the timer gsisters to the sysreg file. This will further help when
they are directly changed by a nesting hypervisor in the VNCR page.

This requires moving the initialisation of the timer struct so that some
of the helpers (such as arch_timer_ctx_index) can work correctly at an
early stage.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07 09:28:38 +01:00
Marc Zyngier
710f198218 KVM: arm64: Move SPSR_EL1 to the system register array
SPSR_EL1 being a VNCR-capable register with ARMv8.4-NV, move it to
the sysregs array and update the accessors.

Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07 09:28:38 +01:00
Marc Zyngier
fd85b66789 KVM: arm64: Disintegrate SPSR array
As we're about to move SPSR_EL1 into the VNCR page, we need to
disassociate it from the rest of the 32bit cruft. Let's break
the array into individual fields.

Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07 09:28:38 +01:00
Marc Zyngier
1bded23ea7 KVM: arm64: Move SP_EL1 to the system register array
SP_EL1 being a VNCR-capable register with ARMv8.4-NV, move it to the
system register array and update the accessors.

Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07 09:28:38 +01:00
Marc Zyngier
98909e6d1c KVM: arm64: Move ELR_EL1 to the system register array
As ELR-EL1 is a VNCR-capable register with ARMv8.4-NV, let's move it to
the sys_regs array and repaint the accessors. While we're at it, let's
kill the now useless accessors used only on the fault injection path.

Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07 09:28:38 +01:00
Marc Zyngier
e47c2055c6 KVM: arm64: Make struct kvm_regs userspace-only
struct kvm_regs is used by userspace to indicate which register gets
accessed by the {GET,SET}_ONE_REG API. But as we're about to refactor
the layout of the in-kernel register structures, we need the kernel to
move away from it.

Let's make kvm_regs userspace only, and let the kernel map it to its own
internal representation.

Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07 09:28:38 +01:00
Marc Zyngier
71071acfd3 KVM: arm64: hyp: Use ctxt_sys_reg/__vcpu_sys_reg instead of raw sys_regs access
Switch the hypervisor code to using ctxt_sys_reg/__vcpu_sys_reg instead
of raw sys_regs accesses. No intended functionnal change.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07 09:28:37 +01:00
Marc Zyngier
1b422dd7fc KVM: arm64: Introduce accessor for ctxt->sys_reg
In order to allow the disintegration of the per-vcpu sysreg array,
let's introduce a new helper (ctxt_sys_reg()) that returns the
in-memory copy of a system register, picked from a given context.

__vcpu_sys_reg() is rewritten to use this helper.

Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07 09:28:37 +01:00
Marc Zyngier
efaa5b93af KVM: arm64: Use TTL hint in when invalidating stage-2 translations
Since we often have a precise idea of the level we're dealing with
when invalidating TLBs, we can provide it to as a hint to our
invalidation helper.

Reviewed-by: James Morse <james.morse@arm.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07 09:28:37 +01:00
Christoffer Dall
a0e50aa3f4 KVM: arm64: Factor out stage 2 page table data from struct kvm
As we are about to reuse our stage 2 page table manipulation code for
shadow stage 2 page tables in the context of nested virtualization, we
are going to manage multiple stage 2 page tables for a single VM.

This requires some pretty invasive changes to our data structures,
which moves the vmid and pgd pointers into a separate structure and
change pretty much all of our mmu code to operate on this structure
instead.

The new structure is called struct kvm_s2_mmu.

There is no intended functional change by this patch alone.

Reviewed-by: James Morse <james.morse@arm.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
[Designed data structure layout in collaboration]
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Co-developed-by: Marc Zyngier <maz@kernel.org>
[maz: Moved the last_vcpu_ran down to the S2 MMU structure as well]
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07 09:28:37 +01:00
Marc Zyngier
ae4bffb555 Merge branch 'kvm-arm64/ttl-for-arm64' into HEAD
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07 09:28:24 +01:00
Marc Zyngier
c10bc62ae4 arm64: Add level-hinted TLB invalidation helper
Add a level-hinted TLB invalidation helper that only gets used if
ARMv8.4-TTL gets detected.

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-07-07 09:27:15 +01:00