linux/arch/arm64/Kconfig

1969 lines
64 KiB
Plaintext
Raw Normal View History

# SPDX-License-Identifier: GPL-2.0-only
config ARM64
def_bool y
select ACPI_CCA_REQUIRED if ACPI
ACPI: move arm64 GSI IRQ model to generic GSI IRQ layer The code deployed to implement GSI linux IRQ numbers mapping on arm64 turns out to be generic enough so that it can be moved to ACPI core code along with its respective config option ACPI_GENERIC_GSI selectable on architectures that can reuse the same code. Current ACPI IRQ mapping code is not integrated in the kernel IRQ domain infrastructure, in particular there is no way to look-up the IRQ domain associated with a particular interrupt controller, so this first version of GSI generic code carries out the GSI<->IRQ mapping relying on the IRQ default domain which is supposed to be always set on a specific architecture in case the domain structure passed to irq_create/find_mapping() functions is missing. This patch moves the arm64 acpi functions that implement the gsi mappings: acpi_gsi_to_irq() acpi_register_gsi() acpi_unregister_gsi() to ACPI core code. Since the generic GSI<->domain mapping is based on IRQ domains, it can be extended as soon as a way to map an interrupt controller to an IRQ domain is implemented for ACPI in the IRQ domain layer. x86 and ia64 code for GSI mappings cannot rely on the generic GSI layer at present for legacy reasons, so they do not select the ACPI_GENERIC_GSI config options and keep relying on their arch specific GSI mapping layer. Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Acked-by: Hanjun Guo <hanjun.guo@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-03-24 17:58:51 +00:00
select ACPI_GENERIC_GSI if ACPI
select ACPI_GTDT if ACPI
select ACPI_IORT if ACPI
select ACPI_REDUCED_HARDWARE_ONLY if ACPI
select ACPI_MCFG if (ACPI && PCI)
select ACPI_SPCR_TABLE if ACPI
select ACPI_PPTT if ACPI
select ARCH_HAS_DEBUG_WX
select ARCH_BINFMT_ELF_STATE
select ARCH_HAS_DEBUG_VIRTUAL
mm/debug: add tests validating architecture page table helpers This adds tests which will validate architecture page table helpers and other accessors in their compliance with expected generic MM semantics. This will help various architectures in validating changes to existing page table helpers or addition of new ones. This test covers basic page table entry transformations including but not limited to old, young, dirty, clean, write, write protect etc at various level along with populating intermediate entries with next page table page and validating them. Test page table pages are allocated from system memory with required size and alignments. The mapped pfns at page table levels are derived from a real pfn representing a valid kernel text symbol. This test gets called via late_initcall(). This test gets built and run when CONFIG_DEBUG_VM_PGTABLE is selected. Any architecture, which is willing to subscribe this test will need to select ARCH_HAS_DEBUG_VM_PGTABLE. For now this is limited to arc, arm64, x86, s390 and powerpc platforms where the test is known to build and run successfully Going forward, other architectures too can subscribe the test after fixing any build or runtime problems with their page table helpers. Folks interested in making sure that a given platform's page table helpers conform to expected generic MM semantics should enable the above config which will just trigger this test during boot. Any non conformity here will be reported as an warning which would need to be fixed. This test will help catch any changes to the agreed upon semantics expected from generic MM and enable platforms to accommodate it thereafter. [anshuman.khandual@arm.com: v17] Link: http://lkml.kernel.org/r/1587436495-22033-3-git-send-email-anshuman.khandual@arm.com [anshuman.khandual@arm.com: v18] Link: http://lkml.kernel.org/r/1588564865-31160-3-git-send-email-anshuman.khandual@arm.com Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390] Tested-by: Christophe Leroy <christophe.leroy@c-s.fr> [ppc32] Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Link: http://lkml.kernel.org/r/1583919272-24178-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04 23:47:15 +00:00
select ARCH_HAS_DEBUG_VM_PGTABLE
select ARCH_HAS_DMA_PREP_COHERENT
select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
select ARCH_HAS_FAST_MULTIPLIER
include/linux/string.h: add the option of fortified string.h functions This adds support for compiling with a rough equivalent to the glibc _FORTIFY_SOURCE=1 feature, providing compile-time and runtime buffer overflow checks for string.h functions when the compiler determines the size of the source or destination buffer at compile-time. Unlike glibc, it covers buffer reads in addition to writes. GNU C __builtin_*_chk intrinsics are avoided because they would force a much more complex implementation. They aren't designed to detect read overflows and offer no real benefit when using an implementation based on inline checks. Inline checks don't add up to much code size and allow full use of the regular string intrinsics while avoiding the need for a bunch of _chk functions and per-arch assembly to avoid wrapper overhead. This detects various overflows at compile-time in various drivers and some non-x86 core kernel code. There will likely be issues caught in regular use at runtime too. Future improvements left out of initial implementation for simplicity, as it's all quite optional and can be done incrementally: * Some of the fortified string functions (strncpy, strcat), don't yet place a limit on reads from the source based on __builtin_object_size of the source buffer. * Extending coverage to more string functions like strlcat. * It should be possible to optionally use __builtin_object_size(x, 1) for some functions (C strings) to detect intra-object overflows (like glibc's _FORTIFY_SOURCE=2), but for now this takes the conservative approach to avoid likely compatibility issues. * The compile-time checks should be made available via a separate config option which can be enabled by default (or always enabled) once enough time has passed to get the issues it catches fixed. Kees said: "This is great to have. While it was out-of-tree code, it would have blocked at least CVE-2016-3858 from being exploitable (improper size argument to strlcpy()). I've sent a number of fixes for out-of-bounds-reads that this detected upstream already" [arnd@arndb.de: x86: fix fortified memcpy] Link: http://lkml.kernel.org/r/20170627150047.660360-1-arnd@arndb.de [keescook@chromium.org: avoid panic() in favor of BUG()] Link: http://lkml.kernel.org/r/20170626235122.GA25261@beast [keescook@chromium.org: move from -mm, add ARCH_HAS_FORTIFY_SOURCE, tweak Kconfig help] Link: http://lkml.kernel.org/r/20170526095404.20439-1-danielmicay@gmail.com Link: http://lkml.kernel.org/r/1497903987-21002-8-git-send-email-keescook@chromium.org Signed-off-by: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Daniel Axtens <dja@axtens.net> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 21:36:10 +00:00
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_GIGANTIC_PAGE
select ARCH_HAS_KCOV
select ARCH_HAS_KEEPINITRD
select ARCH_HAS_MEMBARRIER_SYNC_CORE
bpf: Restrict bpf_probe_read{, str}() only to archs where they work Given the legacy bpf_probe_read{,str}() BPF helpers are broken on archs with overlapping address ranges, we should really take the next step to disable them from BPF use there. To generally fix the situation, we've recently added new helper variants bpf_probe_read_{user,kernel}() and bpf_probe_read_{user,kernel}_str(). For details on them, see 6ae08ae3dea2 ("bpf: Add probe_read_{user, kernel} and probe_read_{user,kernel}_str helpers"). Given bpf_probe_read{,str}() have been around for ~5 years by now, there are plenty of users at least on x86 still relying on them today, so we cannot remove them entirely w/o breaking the BPF tracing ecosystem. However, their use should be restricted to archs with non-overlapping address ranges where they are working in their current form. Therefore, move this behind a CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE and have x86, arm64, arm select it (other archs supporting it can follow-up on it as well). For the remaining archs, they can workaround easily by relying on the feature probe from bpftool which spills out defines that can be used out of BPF C code to implement the drop-in replacement for old/new kernels via: bpftool feature probe macro Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/bpf/20200515101118.6508-2-daniel@iogearbox.net
2020-05-15 10:11:16 +00:00
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PTE_DEVMAP
mm: introduce ARCH_HAS_PTE_SPECIAL Currently the PTE special supports is turned on in per architecture header files. Most of the time, it is defined in arch/*/include/asm/pgtable.h depending or not on some other per architecture static definition. This patch introduce a new configuration variable to manage this directly in the Kconfig files. It would later replace __HAVE_ARCH_PTE_SPECIAL. Here notes for some architecture where the definition of __HAVE_ARCH_PTE_SPECIAL is not obvious: arm __HAVE_ARCH_PTE_SPECIAL which is currently defined in arch/arm/include/asm/pgtable-3level.h which is included by arch/arm/include/asm/pgtable.h when CONFIG_ARM_LPAE is set. So select ARCH_HAS_PTE_SPECIAL if ARM_LPAE. powerpc __HAVE_ARCH_PTE_SPECIAL is defined in 2 files: - arch/powerpc/include/asm/book3s/64/pgtable.h - arch/powerpc/include/asm/pte-common.h The first one is included if (PPC_BOOK3S & PPC64) while the second is included in all the other cases. So select ARCH_HAS_PTE_SPECIAL all the time. sparc: __HAVE_ARCH_PTE_SPECIAL is defined if defined(__sparc__) && defined(__arch64__) which are defined through the compiler in sparc/Makefile if !SPARC32 which I assume to be if SPARC64. So select ARCH_HAS_PTE_SPECIAL if SPARC64 There is no functional change introduced by this patch. Link: http://lkml.kernel.org/r/1523433816-14460-2-git-send-email-ldufour@linux.vnet.ibm.com Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Suggested-by: Jerome Glisse <jglisse@redhat.com> Reviewed-by: Jerome Glisse <jglisse@redhat.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: David S. Miller <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Albert Ou <albert@sifive.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Christophe LEROY <christophe.leroy@c-s.fr> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-08 00:06:08 +00:00
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_SETUP_DMA_OPS
select ARCH_HAS_SET_DIRECT_MAP
select ARCH_HAS_SET_MEMORY
select ARCH_STACKWALK
select ARCH_HAS_STRICT_KERNEL_RWX
select ARCH_HAS_STRICT_MODULE_RWX
select ARCH_HAS_SYNC_DMA_FOR_DEVICE
select ARCH_HAS_SYNC_DMA_FOR_CPU
arm64: implement syscall wrappers To minimize the risk of userspace-controlled values being used under speculation, this patch adds pt_regs based syscall wrappers for arm64, which pass the minimum set of required userspace values to syscall implementations. For each syscall, a wrapper which takes a pt_regs argument is automatically generated, and this extracts the arguments before calling the "real" syscall implementation. Each syscall has three functions generated: * __do_<compat_>sys_<name> is the "real" syscall implementation, with the expected prototype. * __se_<compat_>sys_<name> is the sign-extension/narrowing wrapper, inherited from common code. This takes a series of long parameters, casting each to the requisite types required by the "real" syscall implementation in __do_<compat_>sys_<name>. This wrapper *may* not be necessary on arm64 given the AAPCS rules on unused register bits, but it seemed safer to keep the wrapper for now. * __arm64_<compat_>_sys_<name> takes a struct pt_regs pointer, and extracts *only* the relevant register values, passing these on to the __se_<compat_>sys_<name> wrapper. The syscall invocation code is updated to handle the calling convention required by __arm64_<compat_>_sys_<name>, and passes a single struct pt_regs pointer. The compiler can fold the syscall implementation and its wrappers, such that the overhead of this approach is minimized. Note that we play games with sys_ni_syscall(). It can't be defined with SYSCALL_DEFINE0() because we must avoid the possibility of error injection. Additionally, there are a couple of locations where we need to call it from C code, and we don't (currently) have a ksys_ni_syscall(). While it has no wrapper, passing in a redundant pt_regs pointer is benign per the AAPCS. When ARCH_HAS_SYSCALL_WRAPPER is selected, no prototype is defines for sys_ni_syscall(). Since we need to treat it differently for in-kernel calls and the syscall tables, the prototype is defined as-required. The wrappers are largely the same as their x86 counterparts, but simplified as we don't have a variety of compat calling conventions that require separate stubs. Unlike x86, we have some zero-argument compat syscalls, and must define COMPAT_SYSCALL_DEFINE0() to ensure that these are also given an __arm64_compat_sys_ prefix. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-11 13:56:56 +00:00
select ARCH_HAS_SYSCALL_WRAPPER
select ARCH_HAS_TEARDOWN_DMA_OPS if IOMMU_SUPPORT
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAVE_ELF_PROT
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_INLINE_READ_LOCK if !PREEMPTION
select ARCH_INLINE_READ_LOCK_BH if !PREEMPTION
select ARCH_INLINE_READ_LOCK_IRQ if !PREEMPTION
select ARCH_INLINE_READ_LOCK_IRQSAVE if !PREEMPTION
select ARCH_INLINE_READ_UNLOCK if !PREEMPTION
select ARCH_INLINE_READ_UNLOCK_BH if !PREEMPTION
select ARCH_INLINE_READ_UNLOCK_IRQ if !PREEMPTION
select ARCH_INLINE_READ_UNLOCK_IRQRESTORE if !PREEMPTION
select ARCH_INLINE_WRITE_LOCK if !PREEMPTION
select ARCH_INLINE_WRITE_LOCK_BH if !PREEMPTION
select ARCH_INLINE_WRITE_LOCK_IRQ if !PREEMPTION
select ARCH_INLINE_WRITE_LOCK_IRQSAVE if !PREEMPTION
select ARCH_INLINE_WRITE_UNLOCK if !PREEMPTION
select ARCH_INLINE_WRITE_UNLOCK_BH if !PREEMPTION
select ARCH_INLINE_WRITE_UNLOCK_IRQ if !PREEMPTION
select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE if !PREEMPTION
select ARCH_INLINE_SPIN_TRYLOCK if !PREEMPTION
select ARCH_INLINE_SPIN_TRYLOCK_BH if !PREEMPTION
select ARCH_INLINE_SPIN_LOCK if !PREEMPTION
select ARCH_INLINE_SPIN_LOCK_BH if !PREEMPTION
select ARCH_INLINE_SPIN_LOCK_IRQ if !PREEMPTION
select ARCH_INLINE_SPIN_LOCK_IRQSAVE if !PREEMPTION
select ARCH_INLINE_SPIN_UNLOCK if !PREEMPTION
select ARCH_INLINE_SPIN_UNLOCK_BH if !PREEMPTION
select ARCH_INLINE_SPIN_UNLOCK_IRQ if !PREEMPTION
select ARCH_INLINE_SPIN_UNLOCK_IRQRESTORE if !PREEMPTION
mm: memblock: make keeping memblock memory opt-in rather than opt-out Most architectures do not need the memblock memory after the page allocator is initialized, but only few enable ARCH_DISCARD_MEMBLOCK in the arch Kconfig. Replacing ARCH_DISCARD_MEMBLOCK with ARCH_KEEP_MEMBLOCK and inverting the logic makes it clear which architectures actually use memblock after system initialization and skips the necessity to add ARCH_DISCARD_MEMBLOCK to the architectures that are still missing that option. Link: http://lkml.kernel.org/r/1556102150-32517-1-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paul.burton@mips.com> Cc: James Hogan <jhogan@kernel.org> Cc: Ley Foon Tan <lftan@altera.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 00:22:59 +00:00
select ARCH_KEEP_MEMBLOCK
select ARCH_USE_CMPXCHG_LOCKREF
select ARCH_USE_GNU_PROPERTY
select ARCH_USE_MEMTEST
select ARCH_USE_QUEUED_RWLOCKS
select ARCH_USE_QUEUED_SPINLOCKS
select ARCH_USE_SYM_ANNOTATIONS
arch, mm: restore dependency of __kernel_map_pages() on DEBUG_PAGEALLOC The design of DEBUG_PAGEALLOC presumes that __kernel_map_pages() must never fail. With this assumption is wouldn't be safe to allow general usage of this function. Moreover, some architectures that implement __kernel_map_pages() have this function guarded by #ifdef DEBUG_PAGEALLOC and some refuse to map/unmap pages when page allocation debugging is disabled at runtime. As all the users of __kernel_map_pages() were converted to use debug_pagealloc_map_pages() it is safe to make it available only when DEBUG_PAGEALLOC is set. Link: https://lkml.kernel.org/r/20201109192128.960-4-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Len Brown <len.brown@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15 03:10:30 +00:00
select ARCH_SUPPORTS_DEBUG_PAGEALLOC
select ARCH_SUPPORTS_MEMORY_FAILURE
select ARCH_SUPPORTS_SHADOW_CALL_STACK if CC_HAVE_SHADOW_CALL_STACK
select ARCH_SUPPORTS_LTO_CLANG if CPU_LITTLE_ENDIAN
select ARCH_SUPPORTS_LTO_CLANG_THIN
select ARCH_SUPPORTS_CFI_CLANG
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_SUPPORTS_INT128 if CC_HAS_INT128 && (GCC_VERSION >= 50000 || CC_IS_CLANG)
select ARCH_SUPPORTS_NUMA_BALANCING
select ARCH_WANT_COMPAT_IPC_PARSE_VERSION if COMPAT
bpf, x86, arm64: Enable jit by default when not built as always-on After Spectre 2 fix via 290af86629b2 ("bpf: introduce BPF_JIT_ALWAYS_ON config") most major distros use BPF_JIT_ALWAYS_ON configuration these days which compiles out the BPF interpreter entirely and always enables the JIT. Also given recent fix in e1608f3fa857 ("bpf: Avoid setting bpf insns pages read-only when prog is jited"), we additionally avoid fragmenting the direct map for the BPF insns pages sitting in the general data heap since they are not used during execution. Latter is only needed when run through the interpreter. Since both x86 and arm64 JITs have seen a lot of exposure over the years, are generally most up to date and maintained, there is more downside in !BPF_JIT_ALWAYS_ON configurations to have the interpreter enabled by default rather than the JIT. Add a ARCH_WANT_DEFAULT_BPF_JIT config which archs can use to set the bpf_jit_{enable,kallsyms} to 1. Back in the days the bpf_jit_kallsyms knob was set to 0 by default since major distros still had /proc/kallsyms addresses exposed to unprivileged user space which is not the case anymore. Hence both knobs are set via BPF_JIT_DEFAULT_ON which is set to 'y' in case of BPF_JIT_ALWAYS_ON or ARCH_WANT_DEFAULT_BPF_JIT. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/f78ad24795c2966efcc2ee19025fa3459f622185.1575903816.git.daniel@iogearbox.net
2019-12-09 15:08:03 +00:00
select ARCH_WANT_DEFAULT_BPF_JIT
select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
select ARCH_WANT_FRAME_POINTERS
select ARCH_WANT_HUGE_PMD_SHARE if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
select ARCH_WANT_LD_ORPHAN_WARN
select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARM_AMBA
select ARM_ARCH_TIMER
select ARM_GIC
select AUDIT_ARCH_COMPAT_GENERIC
PCI/MSI: irqchip: Fix PCI_MSI dependencies The PCI_MSI symbol is used inconsistently throughout the tree, with some drivers using 'select' and others using 'depends on', or using conditional selects. This keeps causing problems; the latest one is a result of ARCH_ALPINE using a 'select' statement to enable its platform-specific MSI driver without enabling MSI: warning: (ARCH_ALPINE) selects ALPINE_MSI which has unmet direct dependencies (PCI && PCI_MSI) drivers/irqchip/irq-alpine-msi.c:104:15: error: variable 'alpine_msix_domain_info' has initializer but incomplete type static struct msi_domain_info alpine_msix_domain_info = { ^~~~~~~~~~~~~~~ drivers/irqchip/irq-alpine-msi.c:105:2: error: unknown field 'flags' specified in initializer .flags = MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS | ^ drivers/irqchip/irq-alpine-msi.c:105:11: error: 'MSI_FLAG_USE_DEF_DOM_OPS' undeclared here (not in a function) .flags = MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS | ^~~~~~~~~~~~~~~~~~~~~~~~ There is little reason to enable PCI support for a platform that uses MSI but then leave MSI disabled at compile time. Select PCI_MSI from irqchips that implement MSI, and make PCI host bridges that use MSI on ARM depend on PCI_MSI_IRQ_DOMAIN. For all three architectures that support PCI_MSI_IRQ_DOMAIN (ARM, ARM64, X86), enable it by default whenever MSI is enabled. [bhelgaas: changelog, omit crypto config change] Suggested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
2016-06-15 20:47:33 +00:00
select ARM_GIC_V2M if PCI
select ARM_GIC_V3
PCI/MSI: irqchip: Fix PCI_MSI dependencies The PCI_MSI symbol is used inconsistently throughout the tree, with some drivers using 'select' and others using 'depends on', or using conditional selects. This keeps causing problems; the latest one is a result of ARCH_ALPINE using a 'select' statement to enable its platform-specific MSI driver without enabling MSI: warning: (ARCH_ALPINE) selects ALPINE_MSI which has unmet direct dependencies (PCI && PCI_MSI) drivers/irqchip/irq-alpine-msi.c:104:15: error: variable 'alpine_msix_domain_info' has initializer but incomplete type static struct msi_domain_info alpine_msix_domain_info = { ^~~~~~~~~~~~~~~ drivers/irqchip/irq-alpine-msi.c:105:2: error: unknown field 'flags' specified in initializer .flags = MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS | ^ drivers/irqchip/irq-alpine-msi.c:105:11: error: 'MSI_FLAG_USE_DEF_DOM_OPS' undeclared here (not in a function) .flags = MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS | ^~~~~~~~~~~~~~~~~~~~~~~~ There is little reason to enable PCI support for a platform that uses MSI but then leave MSI disabled at compile time. Select PCI_MSI from irqchips that implement MSI, and make PCI host bridges that use MSI on ARM depend on PCI_MSI_IRQ_DOMAIN. For all three architectures that support PCI_MSI_IRQ_DOMAIN (ARM, ARM64, X86), enable it by default whenever MSI is enabled. [bhelgaas: changelog, omit crypto config change] Suggested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com>
2016-06-15 20:47:33 +00:00
select ARM_GIC_V3_ITS if PCI
select ARM_PSCI_FW
select BUILDTIME_TABLE_SORT
select CLONE_BACKWARDS
select COMMON_CLK
select CPU_PM if (SUSPEND || CPU_IDLE)
select CRC32
select DCACHE_WORD_ACCESS
select DMA_DIRECT_REMAP
select EDAC_SUPPORT
select FRAME_POINTER
select GENERIC_ALLOCATOR
select GENERIC_ARCH_TOPOLOGY
select GENERIC_CLOCKEVENTS_BROADCAST
select GENERIC_CPU_AUTOPROBE
select GENERIC_CPU_VULNERABILITIES
select GENERIC_EARLY_IOREMAP
ARM64: enable GENERIC_FIND_FIRST_BIT ARM64 doesn't implement find_first_{zero}_bit in arch code and doesn't enable it in a config. It leads to using find_next_bit() which is less efficient: 0000000000000000 <find_first_bit>: 0: aa0003e4 mov x4, x0 4: aa0103e0 mov x0, x1 8: b4000181 cbz x1, 38 <find_first_bit+0x38> c: f9400083 ldr x3, [x4] 10: d2800802 mov x2, #0x40 // #64 14: 91002084 add x4, x4, #0x8 18: b40000c3 cbz x3, 30 <find_first_bit+0x30> 1c: 14000008 b 3c <find_first_bit+0x3c> 20: f8408483 ldr x3, [x4], #8 24: 91010045 add x5, x2, #0x40 28: b50000c3 cbnz x3, 40 <find_first_bit+0x40> 2c: aa0503e2 mov x2, x5 30: eb02001f cmp x0, x2 34: 54ffff68 b.hi 20 <find_first_bit+0x20> // b.pmore 38: d65f03c0 ret 3c: d2800002 mov x2, #0x0 // #0 40: dac00063 rbit x3, x3 44: dac01063 clz x3, x3 48: 8b020062 add x2, x3, x2 4c: eb02001f cmp x0, x2 50: 9a829000 csel x0, x0, x2, ls // ls = plast 54: d65f03c0 ret ... 0000000000000118 <_find_next_bit.constprop.1>: 118: eb02007f cmp x3, x2 11c: 540002e2 b.cs 178 <_find_next_bit.constprop.1+0x60> // b.hs, b.nlast 120: d346fc66 lsr x6, x3, #6 124: f8667805 ldr x5, [x0, x6, lsl #3] 128: b4000061 cbz x1, 134 <_find_next_bit.constprop.1+0x1c> 12c: f8667826 ldr x6, [x1, x6, lsl #3] 130: 8a0600a5 and x5, x5, x6 134: ca0400a6 eor x6, x5, x4 138: 92800005 mov x5, #0xffffffffffffffff // #-1 13c: 9ac320a5 lsl x5, x5, x3 140: 927ae463 and x3, x3, #0xffffffffffffffc0 144: ea0600a5 ands x5, x5, x6 148: 54000120 b.eq 16c <_find_next_bit.constprop.1+0x54> // b.none 14c: 1400000e b 184 <_find_next_bit.constprop.1+0x6c> 150: d346fc66 lsr x6, x3, #6 154: f8667805 ldr x5, [x0, x6, lsl #3] 158: b4000061 cbz x1, 164 <_find_next_bit.constprop.1+0x4c> 15c: f8667826 ldr x6, [x1, x6, lsl #3] 160: 8a0600a5 and x5, x5, x6 164: eb05009f cmp x4, x5 168: 540000c1 b.ne 180 <_find_next_bit.constprop.1+0x68> // b.any 16c: 91010063 add x3, x3, #0x40 170: eb03005f cmp x2, x3 174: 54fffee8 b.hi 150 <_find_next_bit.constprop.1+0x38> // b.pmore 178: aa0203e0 mov x0, x2 17c: d65f03c0 ret 180: ca050085 eor x5, x4, x5 184: dac000a5 rbit x5, x5 188: dac010a5 clz x5, x5 18c: 8b0300a3 add x3, x5, x3 190: eb03005f cmp x2, x3 194: 9a839042 csel x2, x2, x3, ls // ls = plast 198: aa0203e0 mov x0, x2 19c: d65f03c0 ret ... 0000000000000238 <find_next_bit>: 238: a9bf7bfd stp x29, x30, [sp, #-16]! 23c: aa0203e3 mov x3, x2 240: d2800004 mov x4, #0x0 // #0 244: aa0103e2 mov x2, x1 248: 910003fd mov x29, sp 24c: d2800001 mov x1, #0x0 // #0 250: 97ffffb2 bl 118 <_find_next_bit.constprop.1> 254: a8c17bfd ldp x29, x30, [sp], #16 258: d65f03c0 ret Enabling find_{first,next}_bit() would also benefit for_each_{set,clear}_bit(). On A-53 find_first_bit() is almost twice faster than find_next_bit(), according to lib/find_bit_benchmark (thanks to Alexey for testing): GENERIC_FIND_FIRST_BIT=n: [7126084.948181] find_first_bit: 47389224 ns, 16357 iterations [7126085.032315] find_first_bit: 19048193 ns, 655 iterations GENERIC_FIND_FIRST_BIT=y: [ 84.158068] find_first_bit: 27193319 ns, 16406 iterations [ 84.233005] find_first_bit: 11082437 ns, 656 iterations GENERIC_FIND_FIRST_BIT=n bloats the kernel despite that it disables generation of find_{first,next}_bit(): yury:linux$ scripts/bloat-o-meter vmlinux vmlinux.ffb add/remove: 4/1 grow/shrink: 19/251 up/down: 564/-1692 (-1128) ... Overall, GENERIC_FIND_FIRST_BIT=n is harmful both in terms of performance and code size, and it's better to have GENERIC_FIND_FIRST_BIT enabled. Tested-by: Alexey Klimov <aklimov@redhat.com> Signed-off-by: Yury Norov <yury.norov@gmail.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210225135700.1381396-2-yury.norov@gmail.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-02-25 13:56:59 +00:00
select GENERIC_FIND_FIRST_BIT
select GENERIC_IDLE_POLL_SETUP
select GENERIC_IRQ_IPI
select GENERIC_IRQ_PROBE
select GENERIC_IRQ_SHOW
select GENERIC_IRQ_SHOW_LEVEL
select GENERIC_LIB_DEVMEM_IS_ALLOWED
select GENERIC_PCI_IOMAP
arm64: mm: convert mm/dump.c to use walk_page_range() Now walk_page_range() can walk kernel page tables, we can switch the arm64 ptdump code over to using it, simplifying the code. Link: http://lkml.kernel.org/r/20191218162402.45610-22-steven.price@arm.com Signed-off-by: Steven Price <steven.price@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Hogan <jhogan@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: "Liang, Kan" <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Burton <paul.burton@mips.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Zong Li <zong.li@sifive.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-04 01:36:29 +00:00
select GENERIC_PTDUMP
select GENERIC_SCHED_CLOCK
select GENERIC_SMP_IDLE_THREAD
select GENERIC_STRNCPY_FROM_USER
select GENERIC_STRNLEN_USER
select GENERIC_TIME_VSYSCALL
select GENERIC_GETTIMEOFDAY
select GENERIC_VDSO_TIME_NS
select HANDLE_DOMAIN_IRQ
select HARDIRQS_SW_RESEND
arm64: mremap speedup - Enable HAVE_MOVE_PMD HAVE_MOVE_PMD enables remapping pages at the PMD level if both the source and destination addresses are PMD-aligned. HAVE_MOVE_PMD is already enabled on x86. The original patch [1] that introduced this config did not enable it on arm64 at the time because of performance issues with flushing the TLB on every PMD move. These issues have since been addressed in more recent releases with improvements to the arm64 TLB invalidation and core mmu_gather code as Will Deacon mentioned in [2]. >From the data below, it can be inferred that there is approximately 8x improvement in performance when HAVE_MOVE_PMD is enabled on arm64. --------- Test Results ---------- The following results were obtained on an arm64 device running a 5.4 kernel, by remapping a PMD-aligned, 1GB sized region to a PMD-aligned destination. The results from 10 iterations of the test are given below. All times are in nanoseconds. Control HAVE_MOVE_PMD 9220833 1247761 9002552 1219896 9254115 1094792 8725885 1227760 9308646 1043698 9001667 1101771 8793385 1159896 8774636 1143594 9553125 1025833 9374010 1078125 9100885.4 1134312.6 <-- Mean Time in nanoseconds Total mremap time for a 1GB sized PMD-aligned region drops from ~9.1 milliseconds to ~1.1 milliseconds. (~8x speedup). [1] https://lore.kernel.org/r/20181108181201.88826-3-joelaf@google.com [2] https://www.mail-archive.com/linuxppc-dev@lists.ozlabs.org/msg140837.html Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/r/20201014005320.2233162-3-kaleshsingh@google.com Link: https://lore.kernel.org/kvmarm/20181029102840.GC13965@arm.com/ Signed-off-by: Will Deacon <will@kernel.org>
2020-10-14 00:53:07 +00:00
select HAVE_MOVE_PMD
arm64: mremap speedup - enable HAVE_MOVE_PUD HAVE_MOVE_PUD enables remapping pages at the PUD level if both the source and destination addresses are PUD-aligned. With HAVE_MOVE_PUD enabled it can be inferred that there is approximately a 19x improvement in performance on arm64. (See data below). ------- Test Results --------- The following results were obtained using a 5.4 kernel, by remapping a PUD-aligned, 1GB sized region to a PUD-aligned destination. The results from 10 iterations of the test are given below: Total mremap times for 1GB data on arm64. All times are in nanoseconds. Control HAVE_MOVE_PUD 1247761 74271 1219896 46771 1094792 59687 1227760 48385 1043698 76666 1101771 50365 1159896 52500 1143594 75261 1025833 61354 1078125 48697 1134312.6 59395.7 <-- Mean time in nanoseconds A 1GB mremap completion time drops from ~1.1 milliseconds to ~59 microseconds on arm64. (~19x speed up). Link: https://lkml.kernel.org/r/20201014005320.2233162-5-kaleshsingh@google.com Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Geffon <bgeffon@google.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Gavin Shan <gshan@redhat.com> Cc: Hassan Naveed <hnaveed@wavecomp.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jia He <justin.he@arm.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kees Cook <keescook@chromium.org> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Mina Almasry <almasrymina@google.com> Cc: Minchan Kim <minchan@google.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Ram Pai <linuxram@us.ibm.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Sandipan Das <sandipan@linux.ibm.com> Cc: SeongJae Park <sjpark@amazon.de> Cc: Shuah Khan <shuah@kernel.org> Cc: Steven Price <steven.price@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15 03:07:35 +00:00
select HAVE_MOVE_PUD
select HAVE_PCI
select HAVE_ACPI_APEI if (ACPI && EFI)
select HAVE_ALIGNED_STRUCT_PAGE if SLUB
select HAVE_ARCH_AUDITSYSCALL
select HAVE_ARCH_BITREVERSE
select HAVE_ARCH_COMPILER_H
select HAVE_ARCH_HUGE_VMAP
select HAVE_ARCH_JUMP_LABEL
arm64/kernel: jump_label: Switch to relative references On a randomly chosen distro kernel build for arm64, vmlinux.o shows the following sections, containing jump label entries, and the associated RELA relocation records, respectively: ... [38088] __jump_table PROGBITS 0000000000000000 00e19f30 000000000002ea10 0000000000000000 WA 0 0 8 [38089] .rela__jump_table RELA 0000000000000000 01fd8bb0 000000000008be30 0000000000000018 I 38178 38088 8 ... In other words, we have 190 KB worth of 'struct jump_entry' instances, and 573 KB worth of RELA entries to relocate each entry's code, target and key members. This means the RELA section occupies 10% of the .init segment, and the two sections combined represent 5% of vmlinux's entire memory footprint. So let's switch from 64-bit absolute references to 32-bit relative references for the code and target field, and a 64-bit relative reference for the 'key' field (which may reside in another module or the core kernel, which may be more than 4 GB way on arm64 when running with KASLR enable): this reduces the size of the __jump_table by 33%, and gets rid of the RELA section entirely. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-s390@vger.kernel.org Cc: Arnd Bergmann <arnd@arndb.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jessica Yu <jeyu@kernel.org> Link: https://lkml.kernel.org/r/20180919065144.25010-4-ard.biesheuvel@linaro.org
2018-09-19 06:51:38 +00:00
select HAVE_ARCH_JUMP_LABEL_RELATIVE
arm64/mm/kasan: don't use vmemmap_populate() to initialize shadow The kasan shadow is currently mapped using vmemmap_populate() since that provides a semi-convenient way to map pages into init_top_pgt. However, since that no longer zeroes the mapped pages, it is not suitable for kasan, which requires zeroed shadow memory. Add kasan_populate_shadow() interface and use it instead of vmemmap_populate(). Besides, this allows us to take advantage of gigantic pages and use them to populate the shadow, which should save us some memory wasted on page tables and reduce TLB pressure. Link: http://lkml.kernel.org/r/20171103185147.2688-3-pasha.tatashin@oracle.com Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Bob Picco <bob.picco@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Alexander Potapenko <glider@google.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-16 01:36:40 +00:00
select HAVE_ARCH_KASAN if !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
select HAVE_ARCH_KASAN_VMALLOC if HAVE_ARCH_KASAN
select HAVE_ARCH_KASAN_SW_TAGS if HAVE_ARCH_KASAN
select HAVE_ARCH_KASAN_HW_TAGS if (HAVE_ARCH_KASAN && ARM64_MTE)
arm64, kfence: enable KFENCE for ARM64 Add architecture specific implementation details for KFENCE and enable KFENCE for the arm64 architecture. In particular, this implements the required interface in <asm/kfence.h>. KFENCE requires that attributes for pages from its memory pool can individually be set. Therefore, force the entire linear map to be mapped at page granularity. Doing so may result in extra memory allocated for page tables in case rodata=full is not set; however, currently CONFIG_RODATA_FULL_DEFAULT_ENABLED=y is the default, and the common case is therefore not affected by this change. [elver@google.com: add missing copyright and description header] Link: https://lkml.kernel.org/r/20210118092159.145934-3-elver@google.com Link: https://lkml.kernel.org/r/20201103175841.3495947-4-elver@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: Marco Elver <elver@google.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Co-developed-by: Alexander Potapenko <glider@google.com> Reviewed-by: Jann Horn <jannh@google.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christopher Lameter <cl@linux.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hillf Danton <hdanton@sina.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joern Engel <joern@purestorage.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: SeongJae Park <sjpark@amazon.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26 01:19:03 +00:00
select HAVE_ARCH_KFENCE
select HAVE_ARCH_KGDB
arm64: mm: support ARCH_MMAP_RND_BITS arm64: arch_mmap_rnd() uses STACK_RND_MASK to generate the random offset for the mmap base address. This value represents a compromise between increased ASLR effectiveness and avoiding address-space fragmentation. Replace it with a Kconfig option, which is sensibly bounded, so that platform developers may choose where to place this compromise. Keep default values as new minimums. Signed-off-by: Daniel Cashman <dcashman@google.com> Cc: Russell King <linux@arm.linux.org.uk> Acked-by: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Don Zickus <dzickus@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Heinrich Schuchardt <xypron.glpk@gmx.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Rientjes <rientjes@google.com> Cc: Mark Salyzyn <salyzyn@android.com> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Nick Kralevich <nnk@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Hector Marco-Gisbert <hecmargi@upv.es> Cc: Borislav Petkov <bp@suse.de> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14 23:20:01 +00:00
select HAVE_ARCH_MMAP_RND_BITS
select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
select HAVE_ARCH_PFN_VALID
arch: enable relative relocations for arm64, power and x86 Patch series "add support for relative references in special sections", v10. This adds support for emitting special sections such as initcall arrays, PCI fixups and tracepoints as relative references rather than absolute references. This reduces the size by 50% on 64-bit architectures, but more importantly, it removes the need for carrying relocation metadata for these sections in relocatable kernels (e.g., for KASLR) that needs to be fixed up at boot time. On arm64, this reduces the vmlinux footprint of such a reference by 8x (8 byte absolute reference + 24 byte RELA entry vs 4 byte relative reference) Patch #3 was sent out before as a single patch. This series supersedes the previous submission. This version makes relative ksymtab entries dependent on the new Kconfig symbol HAVE_ARCH_PREL32_RELOCATIONS rather than trying to infer from kbuild test robot replies for which architectures it should be blacklisted. Patch #1 introduces the new Kconfig symbol HAVE_ARCH_PREL32_RELOCATIONS, and sets it for the main architectures that are expected to benefit the most from this feature, i.e., 64-bit architectures or ones that use runtime relocations. Patch #2 add support for #define'ing __DISABLE_EXPORTS to get rid of ksymtab/kcrctab sections in decompressor and EFI stub objects when rebuilding existing C files to run in a different context. Patches #4 - #6 implement relative references for initcalls, PCI fixups and tracepoints, respectively, all of which produce sections with order ~1000 entries on an arm64 defconfig kernel with tracing enabled. This means we save about 28 KB of vmlinux space for each of these patches. [From the v7 series blurb, which included the jump_label patches as well]: For the arm64 kernel, all patches combined reduce the memory footprint of vmlinux by about 1.3 MB (using a config copied from Ubuntu that has KASLR enabled), of which ~1 MB is the size reduction of the RELA section in .init, and the remaining 300 KB is reduction of .text/.data. This patch (of 6): Before updating certain subsystems to use place relative 32-bit relocations in special sections, to save space and reduce the number of absolute relocations that need to be processed at runtime by relocatable kernels, introduce the Kconfig symbol and define it for some architectures that should be able to support and benefit from it. Link: http://lkml.kernel.org/r/20180704083651.24360-2-ard.biesheuvel@linaro.org Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Will Deacon <will.deacon@arm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Petr Mladek <pmladek@suse.com> Cc: James Morris <jmorris@namei.org> Cc: Nicolas Pitre <nico@linaro.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>, Cc: James Morris <james.morris@microsoft.com> Cc: Jessica Yu <jeyu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 04:56:00 +00:00
select HAVE_ARCH_PREL32_RELOCATIONS
select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_STACKLEAK
select HAVE_ARCH_THREAD_STRUCT_WHITELIST
select HAVE_ARCH_TRACEHOOK
select HAVE_ARCH_TRANSPARENT_HUGEPAGE
select HAVE_ARCH_VMAP_STACK
select HAVE_ARM_SMCCC
select HAVE_ASM_MODVERSIONS
select HAVE_EBPF_JIT
select HAVE_C_RECORDMCOUNT
select HAVE_CMPXCHG_DOUBLE
select HAVE_CMPXCHG_LOCAL
select HAVE_CONTEXT_TRACKING
select HAVE_DEBUG_BUGVERBOSE
select HAVE_DEBUG_KMEMLEAK
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
arm64: implement ftrace with regs This patch implements FTRACE_WITH_REGS for arm64, which allows a traced function's arguments (and some other registers) to be captured into a struct pt_regs, allowing these to be inspected and/or modified. This is a building block for live-patching, where a function's arguments may be forwarded to another function. This is also necessary to enable ftrace and in-kernel pointer authentication at the same time, as it allows the LR value to be captured and adjusted prior to signing. Using GCC's -fpatchable-function-entry=N option, we can have the compiler insert a configurable number of NOPs between the function entry point and the usual prologue. This also ensures functions are AAPCS compliant (e.g. disabling inter-procedural register allocation). For example, with -fpatchable-function-entry=2, GCC 8.1.0 compiles the following: | unsigned long bar(void); | | unsigned long foo(void) | { | return bar() + 1; | } ... to: | <foo>: | nop | nop | stp x29, x30, [sp, #-16]! | mov x29, sp | bl 0 <bar> | add x0, x0, #0x1 | ldp x29, x30, [sp], #16 | ret This patch builds the kernel with -fpatchable-function-entry=2, prefixing each function with two NOPs. To trace a function, we replace these NOPs with a sequence that saves the LR into a GPR, then calls an ftrace entry assembly function which saves this and other relevant registers: | mov x9, x30 | bl <ftrace-entry> Since patchable functions are AAPCS compliant (and the kernel does not use x18 as a platform register), x9-x18 can be safely clobbered in the patched sequence and the ftrace entry code. There are now two ftrace entry functions, ftrace_regs_entry (which saves all GPRs), and ftrace_entry (which saves the bare minimum). A PLT is allocated for each within modules. Signed-off-by: Torsten Duwe <duwe@suse.de> [Mark: rework asm, comments, PLTs, initialization, commit message] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Torsten Duwe <duwe@suse.de> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Torsten Duwe <duwe@suse.de> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Julien Thierry <jthierry@redhat.com> Cc: Will Deacon <will@kernel.org>
2019-02-08 15:10:19 +00:00
select HAVE_DYNAMIC_FTRACE_WITH_REGS \
if $(cc-option,-fpatchable-function-entry=2)
select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
if DYNAMIC_FTRACE_WITH_REGS
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_FAST_GUP
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_TRACER
select HAVE_FUNCTION_ERROR_INJECTION
select HAVE_FUNCTION_GRAPH_TRACER
GCC plugin infrastructure This patch allows to build the whole kernel with GCC plugins. It was ported from grsecurity/PaX. The infrastructure supports building out-of-tree modules and building in a separate directory. Cross-compilation is supported too. Currently the x86, arm, arm64 and uml architectures enable plugins. The directory of the gcc plugins is scripts/gcc-plugins. You can use a file or a directory there. The plugins compile with these options: * -fno-rtti: gcc is compiled with this option so the plugins must use it too * -fno-exceptions: this is inherited from gcc too * -fasynchronous-unwind-tables: this is inherited from gcc too * -ggdb: it is useful for debugging a plugin (better backtrace on internal errors) * -Wno-narrowing: to suppress warnings from gcc headers (ipa-utils.h) * -Wno-unused-variable: to suppress warnings from gcc headers (gcc_version variable, plugin-version.h) The infrastructure introduces a new Makefile target called gcc-plugins. It supports all gcc versions from 4.5 to 6.0. The scripts/gcc-plugin.sh script chooses the proper host compiler (gcc-4.7 can be built by either gcc or g++). This script also checks the availability of the included headers in scripts/gcc-plugins/gcc-common.h. The gcc-common.h header contains frequently included headers for GCC plugins and it has a compatibility layer for the supported gcc versions. The gcc-generate-*-pass.h headers automatically generate the registration structures for GIMPLE, SIMPLE_IPA, IPA and RTL passes. Note that 'make clean' keeps the *.so files (only the distclean or mrproper targets clean all) because they are needed for out-of-tree modules. Based on work created by the PaX Team. Signed-off-by: Emese Revfy <re.emese@gmail.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Michal Marek <mmarek@suse.com>
2016-05-23 22:09:38 +00:00
select HAVE_GCC_PLUGINS
select HAVE_HW_BREAKPOINT if PERF_EVENTS
select HAVE_IRQ_TIME_ACCOUNTING
select HAVE_NMI
select HAVE_PATA_PLATFORM
select HAVE_PERF_EVENTS
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_FUNCTION_ARG_ACCESS_API
select HAVE_FUTEX_CMPXCHG if FUTEX
select MMU_GATHER_RCU_TABLE_FREE
select HAVE_RSEQ
select HAVE_STACKPROTECTOR
select HAVE_SYSCALL_TRACEPOINTS
arm64: Kprobes with single stepping support Add support for basic kernel probes(kprobes) and jump probes (jprobes) for ARM64. Kprobes utilizes software breakpoint and single step debug exceptions supported on ARM v8. A software breakpoint is placed at the probe address to trap the kernel execution into the kprobe handler. ARM v8 supports enabling single stepping before the break exception return (ERET), with next PC in exception return address (ELR_EL1). The kprobe handler prepares an executable memory slot for out-of-line execution with a copy of the original instruction being probed, and enables single stepping. The PC is set to the out-of-line slot address before the ERET. With this scheme, the instruction is executed with the exact same register context except for the PC (and DAIF) registers. Debug mask (PSTATE.D) is enabled only when single stepping a recursive kprobe, e.g.: during kprobes reenter so that probed instruction can be single stepped within the kprobe handler -exception- context. The recursion depth of kprobe is always 2, i.e. upon probe re-entry, any further re-entry is prevented by not calling handlers and the case counted as a missed kprobe). Single stepping from the x-o-l slot has a drawback for PC-relative accesses like branching and symbolic literals access as the offset from the new PC (slot address) may not be ensured to fit in the immediate value of the opcode. Such instructions need simulation, so reject probing them. Instructions generating exceptions or cpu mode change are rejected for probing. Exclusive load/store instructions are rejected too. Additionally, the code is checked to see if it is inside an exclusive load/store sequence (code from Pratyush). System instructions are mostly enabled for stepping, except MSR/MRS accesses to "DAIF" flags in PSTATE, which are not safe for probing. This also changes arch/arm64/include/asm/ptrace.h to use include/asm-generic/ptrace.h. Thanks to Steve Capper and Pratyush Anand for several suggested Changes. Signed-off-by: Sandeepa Prabhu <sandeepa.s.prabhu@gmail.com> Signed-off-by: David A. Long <dave.long@linaro.org> Signed-off-by: Pratyush Anand <panand@redhat.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-07-08 16:35:48 +00:00
select HAVE_KPROBES
select HAVE_KRETPROBES
select HAVE_GENERIC_VDSO
select IOMMU_DMA if IOMMU_SUPPORT
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
select KASAN_VMALLOC if KASAN_GENERIC
select MODULES_USE_ELF_RELA
select NEED_DMA_MAP_STATE
select NEED_SG_DMA_LENGTH
select OF
select OF_EARLY_FLATTREE
select PCI_DOMAINS_GENERIC if PCI
select PCI_ECAM if (ACPI && PCI)
select PCI_SYSCALL if PCI
select POWER_RESET
select POWER_SUPPLY
select SPARSE_IRQ
select SWIOTLB
select SYSCTL_EXCEPTION_TRACE
arm64: split thread_info from task stack This patch moves arm64's struct thread_info from the task stack into task_struct. This protects thread_info from corruption in the case of stack overflows, and makes its address harder to determine if stack addresses are leaked, making a number of attacks more difficult. Precise detection and handling of overflow is left for subsequent patches. Largely, this involves changing code to store the task_struct in sp_el0, and acquire the thread_info from the task struct. Core code now implements current_thread_info(), and as noted in <linux/sched.h> this relies on offsetof(task_struct, thread_info) == 0, enforced by core code. This change means that the 'tsk' register used in entry.S now points to a task_struct, rather than a thread_info as it used to. To make this clear, the TI_* field offsets are renamed to TSK_TI_*, with asm-offsets appropriately updated to account for the structural change. Userspace clobbers sp_el0, and we can no longer restore this from the stack. Instead, the current task is cached in a per-cpu variable that we can safely access from early assembly as interrupts are disabled (and we are thus not preemptible). Both secondary entry and idle are updated to stash the sp and task pointer separately. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: James Morse <james.morse@arm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-03 20:23:13 +00:00
select THREAD_INFO_IN_TASK
userfaultfd: add minor fault registration mode Patch series "userfaultfd: add minor fault handling", v9. Overview ======== This series adds a new userfaultfd feature, UFFD_FEATURE_MINOR_HUGETLBFS. When enabled (via the UFFDIO_API ioctl), this feature means that any hugetlbfs VMAs registered with UFFDIO_REGISTER_MODE_MISSING will *also* get events for "minor" faults. By "minor" fault, I mean the following situation: Let there exist two mappings (i.e., VMAs) to the same page(s) (shared memory). One of the mappings is registered with userfaultfd (in minor mode), and the other is not. Via the non-UFFD mapping, the underlying pages have already been allocated & filled with some contents. The UFFD mapping has not yet been faulted in; when it is touched for the first time, this results in what I'm calling a "minor" fault. As a concrete example, when working with hugetlbfs, we have huge_pte_none(), but find_lock_page() finds an existing page. We also add a new ioctl to resolve such faults: UFFDIO_CONTINUE. The idea is, userspace resolves the fault by either a) doing nothing if the contents are already correct, or b) updating the underlying contents using the second, non-UFFD mapping (via memcpy/memset or similar, or something fancier like RDMA, or etc...). In either case, userspace issues UFFDIO_CONTINUE to tell the kernel "I have ensured the page contents are correct, carry on setting up the mapping". Use Case ======== Consider the use case of VM live migration (e.g. under QEMU/KVM): 1. While a VM is still running, we copy the contents of its memory to a target machine. The pages are populated on the target by writing to the non-UFFD mapping, using the setup described above. The VM is still running (and therefore its memory is likely changing), so this may be repeated several times, until we decide the target is "up to date enough". 2. We pause the VM on the source, and start executing on the target machine. During this gap, the VM's user(s) will *see* a pause, so it is desirable to minimize this window. 3. Between the last time any page was copied from the source to the target, and when the VM was paused, the contents of that page may have changed - and therefore the copy we have on the target machine is out of date. Although we can keep track of which pages are out of date, for VMs with large amounts of memory, it is "slow" to transfer this information to the target machine. We want to resume execution before such a transfer would complete. 4. So, the guest begins executing on the target machine. The first time it touches its memory (via the UFFD-registered mapping), userspace wants to intercept this fault. Userspace checks whether or not the page is up to date, and if not, copies the updated page from the source machine, via the non-UFFD mapping. Finally, whether a copy was performed or not, userspace issues a UFFDIO_CONTINUE ioctl to tell the kernel "I have ensured the page contents are correct, carry on setting up the mapping". We don't have to do all of the final updates on-demand. The userfaultfd manager can, in the background, also copy over updated pages once it receives the map of which pages are up-to-date or not. Interaction with Existing APIs ============================== Because this is a feature, a registered VMA could potentially receive both missing and minor faults. I spent some time thinking through how the existing API interacts with the new feature: UFFDIO_CONTINUE cannot be used to resolve non-minor faults, as it does not allocate a new page. If UFFDIO_CONTINUE is used on a non-minor fault: - For non-shared memory or shmem, -EINVAL is returned. - For hugetlb, -EFAULT is returned. UFFDIO_COPY and UFFDIO_ZEROPAGE cannot be used to resolve minor faults. Without modifications, the existing codepath assumes a new page needs to be allocated. This is okay, since userspace must have a second non-UFFD-registered mapping anyway, thus there isn't much reason to want to use these in any case (just memcpy or memset or similar). - If UFFDIO_COPY is used on a minor fault, -EEXIST is returned. - If UFFDIO_ZEROPAGE is used on a minor fault, -EEXIST is returned (or -EINVAL in the case of hugetlb, as UFFDIO_ZEROPAGE is unsupported in any case). - UFFDIO_WRITEPROTECT simply doesn't work with shared memory, and returns -ENOENT in that case (regardless of the kind of fault). Future Work =========== This series only supports hugetlbfs. I have a second series in flight to support shmem as well, extending the functionality. This series is more mature than the shmem support at this point, and the functionality works fully on hugetlbfs, so this series can be merged first and then shmem support will follow. This patch (of 6): This feature allows userspace to intercept "minor" faults. By "minor" faults, I mean the following situation: Let there exist two mappings (i.e., VMAs) to the same page(s). One of the mappings is registered with userfaultfd (in minor mode), and the other is not. Via the non-UFFD mapping, the underlying pages have already been allocated & filled with some contents. The UFFD mapping has not yet been faulted in; when it is touched for the first time, this results in what I'm calling a "minor" fault. As a concrete example, when working with hugetlbfs, we have huge_pte_none(), but find_lock_page() finds an existing page. This commit adds the new registration mode, and sets the relevant flag on the VMAs being registered. In the hugetlb fault path, if we find that we have huge_pte_none(), but find_lock_page() does indeed find an existing page, then we have a "minor" fault, and if the VMA has the userfaultfd registration flag, we call into userfaultfd to handle it. This is implemented as a new registration mode, instead of an API feature. This is because the alternative implementation has significant drawbacks [1]. However, doing it this was requires we allocate a VM_* flag for the new registration mode. On 32-bit systems, there are no unused bits, so this feature is only supported on architectures with CONFIG_ARCH_USES_HIGH_VMA_FLAGS. When attempting to register a VMA in MINOR mode on 32-bit architectures, we return -EINVAL. [1] https://lore.kernel.org/patchwork/patch/1380226/ [peterx@redhat.com: fix minor fault page leak] Link: https://lkml.kernel.org/r/20210322175132.36659-1-peterx@redhat.com Link: https://lkml.kernel.org/r/20210301222728.176417-1-axelrasmussen@google.com Link: https://lkml.kernel.org/r/20210301222728.176417-2-axelrasmussen@google.com Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chinwen Chang <chinwen.chang@mediatek.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michal Koutn" <mkoutny@suse.com> Cc: Michel Lespinasse <walken@google.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Shaohua Li <shli@fb.com> Cc: Shawn Anastasio <shawn@anastas.io> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Steven Price <steven.price@arm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Adam Ruprecht <ruprecht@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Cannon Matthews <cannonmatthews@google.com> Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Oliver Upton <oupton@google.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-05 01:35:36 +00:00
select HAVE_ARCH_USERFAULTFD_MINOR if USERFAULTFD
help
ARM 64-bit (AArch64) Linux support.
config 64BIT
def_bool y
config MMU
def_bool y
config ARM64_PAGE_SHIFT
int
default 16 if ARM64_64K_PAGES
default 14 if ARM64_16K_PAGES
default 12
config ARM64_CONT_PTE_SHIFT
int
default 5 if ARM64_64K_PAGES
default 7 if ARM64_16K_PAGES
default 4
config ARM64_CONT_PMD_SHIFT
int
default 5 if ARM64_64K_PAGES
default 5 if ARM64_16K_PAGES
default 4
arm64: mm: support ARCH_MMAP_RND_BITS arm64: arch_mmap_rnd() uses STACK_RND_MASK to generate the random offset for the mmap base address. This value represents a compromise between increased ASLR effectiveness and avoiding address-space fragmentation. Replace it with a Kconfig option, which is sensibly bounded, so that platform developers may choose where to place this compromise. Keep default values as new minimums. Signed-off-by: Daniel Cashman <dcashman@google.com> Cc: Russell King <linux@arm.linux.org.uk> Acked-by: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Don Zickus <dzickus@redhat.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Heinrich Schuchardt <xypron.glpk@gmx.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Rientjes <rientjes@google.com> Cc: Mark Salyzyn <salyzyn@android.com> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Nick Kralevich <nnk@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Hector Marco-Gisbert <hecmargi@upv.es> Cc: Borislav Petkov <bp@suse.de> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14 23:20:01 +00:00
config ARCH_MMAP_RND_BITS_MIN
default 14 if ARM64_64K_PAGES
default 16 if ARM64_16K_PAGES
default 18
# max bits determined by the following formula:
# VA_BITS - PAGE_SHIFT - 3
config ARCH_MMAP_RND_BITS_MAX
default 19 if ARM64_VA_BITS=36
default 24 if ARM64_VA_BITS=39
default 27 if ARM64_VA_BITS=42
default 30 if ARM64_VA_BITS=47
default 29 if ARM64_VA_BITS=48 && ARM64_64K_PAGES
default 31 if ARM64_VA_BITS=48 && ARM64_16K_PAGES
default 33 if ARM64_VA_BITS=48
default 14 if ARM64_64K_PAGES
default 16 if ARM64_16K_PAGES
default 18
config ARCH_MMAP_RND_COMPAT_BITS_MIN
default 7 if ARM64_64K_PAGES
default 9 if ARM64_16K_PAGES
default 11
config ARCH_MMAP_RND_COMPAT_BITS_MAX
default 16
config NO_IOPORT_MAP
def_bool y if !PCI
config STACKTRACE_SUPPORT
def_bool y
config ILLEGAL_POINTER_VALUE
hex
default 0xdead000000000000
config LOCKDEP_SUPPORT
def_bool y
config TRACE_IRQFLAGS_SUPPORT
def_bool y
config GENERIC_BUG
def_bool y
depends on BUG
config GENERIC_BUG_RELATIVE_POINTERS
def_bool y
depends on GENERIC_BUG
config GENERIC_HWEIGHT
def_bool y
config GENERIC_CSUM
def_bool y
config GENERIC_CALIBRATE_DELAY
def_bool y
config ZONE_DMA
bool "Support DMA zone" if EXPERT
default y
config ZONE_DMA32
bool "Support DMA32 zone" if EXPERT
default y
config ARCH_ENABLE_MEMORY_HOTPLUG
def_bool y
arm64/mm: Enable memory hot remove The arch code for hot-remove must tear down portions of the linear map and vmemmap corresponding to memory being removed. In both cases the page tables mapping these regions must be freed, and when sparse vmemmap is in use the memory backing the vmemmap must also be freed. This patch adds unmap_hotplug_range() and free_empty_tables() helpers which can be used to tear down either region and calls it from vmemmap_free() and ___remove_pgd_mapping(). The free_mapped argument determines whether the backing memory will be freed. It makes two distinct passes over the kernel page table. In the first pass with unmap_hotplug_range() it unmaps, invalidates applicable TLB cache and frees backing memory if required (vmemmap) for each mapped leaf entry. In the second pass with free_empty_tables() it looks for empty page table sections whose page table page can be unmapped, TLB invalidated and freed. While freeing intermediate level page table pages bail out if any of its entries are still valid. This can happen for partially filled kernel page table either from a previously attempted failed memory hot add or while removing an address range which does not span the entire page table page range. The vmemmap region may share levels of table with the vmalloc region. There can be conflicts between hot remove freeing page table pages with a concurrent vmalloc() walking the kernel page table. This conflict can not just be solved by taking the init_mm ptl because of existing locking scheme in vmalloc(). So free_empty_tables() implements a floor and ceiling method which is borrowed from user page table tear with free_pgd_range() which skips freeing page table pages if intermediate address range is not aligned or maximum floor-ceiling might not own the entire page table page. Boot memory on arm64 cannot be removed. Hence this registers a new memory hotplug notifier which prevents boot memory offlining and it's removal. While here update arch_add_memory() to handle __add_pages() failures by just unmapping recently added kernel linear mapping. Now enable memory hot remove on arm64 platforms by default with ARCH_ENABLE_MEMORY_HOTREMOVE. This implementation is overall inspired from kernel page table tear down procedure on X86 architecture and user page table tear down method. [Mike and Catalin added P4D page table level support] Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-04 04:28:43 +00:00
config ARCH_ENABLE_MEMORY_HOTREMOVE
def_bool y
config SMP
def_bool y
config KERNEL_MODE_NEON
def_bool y
config FIX_EARLYCON_MEM
def_bool y
config PGTABLE_LEVELS
int
default 2 if ARM64_16K_PAGES && ARM64_VA_BITS_36
default 2 if ARM64_64K_PAGES && ARM64_VA_BITS_42
default 3 if ARM64_64K_PAGES && (ARM64_VA_BITS_48 || ARM64_VA_BITS_52)
default 3 if ARM64_4K_PAGES && ARM64_VA_BITS_39
default 3 if ARM64_16K_PAGES && ARM64_VA_BITS_47
default 4 if !ARM64_64K_PAGES && ARM64_VA_BITS_48
config ARCH_SUPPORTS_UPROBES
def_bool y
config ARCH_PROC_KCORE_TEXT
def_bool y
config BROKEN_GAS_INST
def_bool !$(as-instr,1:\n.inst 0\n.rept . - 1b\n\nnop\n.endr\n)
arm64: kasan: Switch to using KASAN_SHADOW_OFFSET KASAN_SHADOW_OFFSET is a constant that is supplied to gcc as a command line argument and affects the codegen of the inline address sanetiser. Essentially, for an example memory access: *ptr1 = val; The compiler will insert logic similar to the below: shadowValue = *(ptr1 >> KASAN_SHADOW_SCALE_SHIFT + KASAN_SHADOW_OFFSET) if (somethingWrong(shadowValue)) flagAnError(); This code sequence is inserted into many places, thus KASAN_SHADOW_OFFSET is essentially baked into many places in the kernel text. If we want to run a single kernel binary with multiple address spaces, then we need to do this with KASAN_SHADOW_OFFSET fixed. Thankfully, due to the way the KASAN_SHADOW_OFFSET is used to provide shadow addresses we know that the end of the shadow region is constant w.r.t. VA space size: KASAN_SHADOW_END = ~0 >> KASAN_SHADOW_SCALE_SHIFT + KASAN_SHADOW_OFFSET This means that if we increase the size of the VA space, the start of the KASAN region expands into lower addresses whilst the end of the KASAN region is fixed. Currently the arm64 code computes KASAN_SHADOW_OFFSET at build time via build scripts with the VA size used as a parameter. (There are build time checks in the C code too to ensure that expected values are being derived). It is sufficient, and indeed is a simplification, to remove the build scripts (and build time checks) entirely and instead provide KASAN_SHADOW_OFFSET values. This patch removes the logic to compute the KASAN_SHADOW_OFFSET in the arm64 Makefile, and instead we adopt the approach used by x86 to supply offset values in kConfig. To help debug/develop future VA space changes, the Makefile logic has been preserved in a script file in the arm64 Documentation folder. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-07 15:55:15 +00:00
config KASAN_SHADOW_OFFSET
hex
depends on KASAN_GENERIC || KASAN_SW_TAGS
arm64: mm: extend linear region for 52-bit VA configurations For historical reasons, the arm64 kernel VA space is configured as two equally sized halves, i.e., on a 48-bit VA build, the VA space is split into a 47-bit vmalloc region and a 47-bit linear region. When support for 52-bit virtual addressing was added, this equal split was kept, resulting in a substantial waste of virtual address space in the linear region: 48-bit VA 52-bit VA 0xffff_ffff_ffff_ffff +-------------+ +-------------+ | vmalloc | | vmalloc | 0xffff_8000_0000_0000 +-------------+ _PAGE_END(48) +-------------+ | linear | : : 0xffff_0000_0000_0000 +-------------+ : : : : : : : : : : : : : : : : : currently : : unusable : : : : : : unused : : by : : : : : : : : hardware : : : : : : : 0xfff8_0000_0000_0000 : : _PAGE_END(52) +-------------+ : : | | : : | | : : | | : : | | : : | | : unusable : | | : : | linear | : by : | | : : | region | : hardware : | | : : | | : : | | : : | | : : | | : : | | : : | | 0xfff0_0000_0000_0000 +-------------+ PAGE_OFFSET +-------------+ As illustrated above, the 52-bit VA kernel uses 47 bits for the vmalloc space (as before), to ensure that a single 64k granule kernel image can support any 64k granule capable system, regardless of whether it supports the 52-bit virtual addressing extension. However, due to the fact that the VA space is still split in equal halves, the linear region is only 2^51 bytes in size, wasting almost half of the 52-bit VA space. Let's fix this, by abandoning the equal split, and simply assigning all VA space outside of the vmalloc region to the linear region. The KASAN shadow region is reconfigured so that it ends at the start of the vmalloc region, and grows downwards. That way, the arrangement of the vmalloc space (which contains kernel mappings, modules, BPF region, the vmemmap array etc) is identical between non-KASAN and KASAN builds, which aids debugging. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Steve Capper <steve.capper@arm.com> Link: https://lore.kernel.org/r/20201008153602.9467-3-ardb@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-10-08 15:36:00 +00:00
default 0xdfff800000000000 if (ARM64_VA_BITS_48 || ARM64_VA_BITS_52) && !KASAN_SW_TAGS
default 0xdfffc00000000000 if ARM64_VA_BITS_47 && !KASAN_SW_TAGS
default 0xdffffe0000000000 if ARM64_VA_BITS_42 && !KASAN_SW_TAGS
default 0xdfffffc000000000 if ARM64_VA_BITS_39 && !KASAN_SW_TAGS
default 0xdffffff800000000 if ARM64_VA_BITS_36 && !KASAN_SW_TAGS
default 0xefff800000000000 if (ARM64_VA_BITS_48 || ARM64_VA_BITS_52) && KASAN_SW_TAGS
default 0xefffc00000000000 if ARM64_VA_BITS_47 && KASAN_SW_TAGS
default 0xeffffe0000000000 if ARM64_VA_BITS_42 && KASAN_SW_TAGS
default 0xefffffc000000000 if ARM64_VA_BITS_39 && KASAN_SW_TAGS
default 0xeffffff800000000 if ARM64_VA_BITS_36 && KASAN_SW_TAGS
arm64: kasan: Switch to using KASAN_SHADOW_OFFSET KASAN_SHADOW_OFFSET is a constant that is supplied to gcc as a command line argument and affects the codegen of the inline address sanetiser. Essentially, for an example memory access: *ptr1 = val; The compiler will insert logic similar to the below: shadowValue = *(ptr1 >> KASAN_SHADOW_SCALE_SHIFT + KASAN_SHADOW_OFFSET) if (somethingWrong(shadowValue)) flagAnError(); This code sequence is inserted into many places, thus KASAN_SHADOW_OFFSET is essentially baked into many places in the kernel text. If we want to run a single kernel binary with multiple address spaces, then we need to do this with KASAN_SHADOW_OFFSET fixed. Thankfully, due to the way the KASAN_SHADOW_OFFSET is used to provide shadow addresses we know that the end of the shadow region is constant w.r.t. VA space size: KASAN_SHADOW_END = ~0 >> KASAN_SHADOW_SCALE_SHIFT + KASAN_SHADOW_OFFSET This means that if we increase the size of the VA space, the start of the KASAN region expands into lower addresses whilst the end of the KASAN region is fixed. Currently the arm64 code computes KASAN_SHADOW_OFFSET at build time via build scripts with the VA size used as a parameter. (There are build time checks in the C code too to ensure that expected values are being derived). It is sufficient, and indeed is a simplification, to remove the build scripts (and build time checks) entirely and instead provide KASAN_SHADOW_OFFSET values. This patch removes the logic to compute the KASAN_SHADOW_OFFSET in the arm64 Makefile, and instead we adopt the approach used by x86 to supply offset values in kConfig. To help debug/develop future VA space changes, the Makefile logic has been preserved in a script file in the arm64 Documentation folder. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-07 15:55:15 +00:00
default 0xffffffffffffffff
source "arch/arm64/Kconfig.platforms"
menu "Kernel Features"
menu "ARM errata workarounds via the alternatives framework"
config ARM64_WORKAROUND_CLEAN_CACHE
bool
config ARM64_ERRATUM_826319
bool "Cortex-A53: 826319: System might deadlock if a write cannot complete until read data is accepted"
default y
select ARM64_WORKAROUND_CLEAN_CACHE
help
This option adds an alternative code sequence to work around ARM
erratum 826319 on Cortex-A53 parts up to r0p2 with an AMBA 4 ACE or
AXI master interface and an L2 cache.
If a Cortex-A53 uses an AMBA AXI4 ACE interface to other processors
and is unable to accept a certain write via this interface, it will
not progress on read data presented on the read data channel and the
system can deadlock.
The workaround promotes data cache clean instructions to
data cache clean-and-invalidate.
Please note that this does not necessarily enable the workaround,
as it depends on the alternative framework, which will only patch
the kernel if an affected CPU is detected.
If unsure, say Y.
config ARM64_ERRATUM_827319
bool "Cortex-A53: 827319: Data cache clean instructions might cause overlapping transactions to the interconnect"
default y
select ARM64_WORKAROUND_CLEAN_CACHE
help
This option adds an alternative code sequence to work around ARM
erratum 827319 on Cortex-A53 parts up to r0p2 with an AMBA 5 CHI
master interface and an L2 cache.
Under certain conditions this erratum can cause a clean line eviction
to occur at the same time as another transaction to the same address
on the AMBA 5 CHI interface, which can cause data corruption if the
interconnect reorders the two transactions.
The workaround promotes data cache clean instructions to
data cache clean-and-invalidate.
Please note that this does not necessarily enable the workaround,
as it depends on the alternative framework, which will only patch
the kernel if an affected CPU is detected.
If unsure, say Y.
config ARM64_ERRATUM_824069
bool "Cortex-A53: 824069: Cache line might not be marked as clean after a CleanShared snoop"
default y
select ARM64_WORKAROUND_CLEAN_CACHE
help
This option adds an alternative code sequence to work around ARM
erratum 824069 on Cortex-A53 parts up to r0p2 when it is connected
to a coherent interconnect.
If a Cortex-A53 processor is executing a store or prefetch for
write instruction at the same time as a processor in another
cluster is executing a cache maintenance operation to the same
address, then this erratum might cause a clean cache line to be
incorrectly marked as dirty.
The workaround promotes data cache clean instructions to
data cache clean-and-invalidate.
Please note that this option does not necessarily enable the
workaround, as it depends on the alternative framework, which will
only patch the kernel if an affected CPU is detected.
If unsure, say Y.
config ARM64_ERRATUM_819472
bool "Cortex-A53: 819472: Store exclusive instructions might cause data corruption"
default y
select ARM64_WORKAROUND_CLEAN_CACHE
help
This option adds an alternative code sequence to work around ARM
erratum 819472 on Cortex-A53 parts up to r0p1 with an L2 cache
present when it is connected to a coherent interconnect.
If the processor is executing a load and store exclusive sequence at
the same time as a processor in another cluster is executing a cache
maintenance operation to the same address, then this erratum might
cause data corruption.
The workaround promotes data cache clean instructions to
data cache clean-and-invalidate.
Please note that this does not necessarily enable the workaround,
as it depends on the alternative framework, which will only patch
the kernel if an affected CPU is detected.
If unsure, say Y.
config ARM64_ERRATUM_832075
bool "Cortex-A57: 832075: possible deadlock on mixing exclusive memory accesses with device loads"
default y
help
This option adds an alternative code sequence to work around ARM
erratum 832075 on Cortex-A57 parts up to r1p2.
Affected Cortex-A57 parts might deadlock when exclusive load/store
instructions to Write-Back memory are mixed with Device loads.
The workaround is to promote device loads to use Load-Acquire
semantics.
Please note that this does not necessarily enable the workaround,
as it depends on the alternative framework, which will only patch
the kernel if an affected CPU is detected.
If unsure, say Y.
config ARM64_ERRATUM_834220
bool "Cortex-A57: 834220: Stage 2 translation fault might be incorrectly reported in presence of a Stage 1 fault"
depends on KVM
default y
help
This option adds an alternative code sequence to work around ARM
erratum 834220 on Cortex-A57 parts up to r1p2.
Affected Cortex-A57 parts might report a Stage 2 translation
fault as the result of a Stage 1 fault for load crossing a
page boundary when there is a permission or device memory
alignment fault at Stage 1 and a translation fault at Stage 2.
The workaround is to verify that the Stage 1 translation
doesn't generate a fault before handling the Stage 2 fault.
Please note that this does not necessarily enable the workaround,
as it depends on the alternative framework, which will only patch
the kernel if an affected CPU is detected.
If unsure, say Y.
config ARM64_ERRATUM_845719
bool "Cortex-A53: 845719: a load might read incorrect data"
depends on COMPAT
default y
help
This option adds an alternative code sequence to work around ARM
erratum 845719 on Cortex-A53 parts up to r0p4.
When running a compat (AArch32) userspace on an affected Cortex-A53
part, a load at EL0 from a virtual address that matches the bottom 32
bits of the virtual address used by a recent load at (AArch64) EL1
might return incorrect data.
The workaround is to write the contextidr_el1 register on exception
return to a 32-bit task.
Please note that this does not necessarily enable the workaround,
as it depends on the alternative framework, which will only patch
the kernel if an affected CPU is detected.
If unsure, say Y.
config ARM64_ERRATUM_843419
bool "Cortex-A53: 843419: A load or store might access an incorrect address"
default y
arm64/kernel: don't ban ADRP to work around Cortex-A53 erratum #843419 Working around Cortex-A53 erratum #843419 involves special handling of ADRP instructions that end up in the last two instruction slots of a 4k page, or whose output register gets overwritten without having been read. (Note that the latter instruction sequence is never emitted by a properly functioning compiler, which is why it is disregarded by the handling of the same erratum in the bfd.ld linker which we rely on for the core kernel) Normally, this gets taken care of by the linker, which can spot such sequences at final link time, and insert a veneer if the ADRP ends up at a vulnerable offset. However, linux kernel modules are partially linked ELF objects, and so there is no 'final link time' other than the runtime loading of the module, at which time all the static relocations are resolved. For this reason, we have implemented the #843419 workaround for modules by avoiding ADRP instructions altogether, by using the large C model, and by passing -mpc-relative-literal-loads to recent versions of GCC that may emit adrp/ldr pairs to perform literal loads. However, this workaround forces us to keep literal data mixed with the instructions in the executable .text segment, and literal data may inadvertently turn into an exploitable speculative gadget depending on the relative offsets of arbitrary symbols. So let's reimplement this workaround in a way that allows us to switch back to the small C model, and to drop the -mpc-relative-literal-loads GCC switch, by patching affected ADRP instructions at runtime: - ADRP instructions that do not appear at 4k relative offset 0xff8 or 0xffc are ignored - ADRP instructions that are within 1 MB of their target symbol are converted into ADR instructions - remaining ADRP instructions are redirected via a veneer that performs the load using an unaffected movn/movk sequence. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [will: tidied up ADRP -> ADR instruction patching.] [will: use ULL suffix for 64-bit immediate] Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 17:15:33 +00:00
select ARM64_MODULE_PLTS if MODULES
help
This option links the kernel with '--fix-cortex-a53-843419' and
arm64/kernel: don't ban ADRP to work around Cortex-A53 erratum #843419 Working around Cortex-A53 erratum #843419 involves special handling of ADRP instructions that end up in the last two instruction slots of a 4k page, or whose output register gets overwritten without having been read. (Note that the latter instruction sequence is never emitted by a properly functioning compiler, which is why it is disregarded by the handling of the same erratum in the bfd.ld linker which we rely on for the core kernel) Normally, this gets taken care of by the linker, which can spot such sequences at final link time, and insert a veneer if the ADRP ends up at a vulnerable offset. However, linux kernel modules are partially linked ELF objects, and so there is no 'final link time' other than the runtime loading of the module, at which time all the static relocations are resolved. For this reason, we have implemented the #843419 workaround for modules by avoiding ADRP instructions altogether, by using the large C model, and by passing -mpc-relative-literal-loads to recent versions of GCC that may emit adrp/ldr pairs to perform literal loads. However, this workaround forces us to keep literal data mixed with the instructions in the executable .text segment, and literal data may inadvertently turn into an exploitable speculative gadget depending on the relative offsets of arbitrary symbols. So let's reimplement this workaround in a way that allows us to switch back to the small C model, and to drop the -mpc-relative-literal-loads GCC switch, by patching affected ADRP instructions at runtime: - ADRP instructions that do not appear at 4k relative offset 0xff8 or 0xffc are ignored - ADRP instructions that are within 1 MB of their target symbol are converted into ADR instructions - remaining ADRP instructions are redirected via a veneer that performs the load using an unaffected movn/movk sequence. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [will: tidied up ADRP -> ADR instruction patching.] [will: use ULL suffix for 64-bit immediate] Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 17:15:33 +00:00
enables PLT support to replace certain ADRP instructions, which can
cause subsequent memory accesses to use an incorrect address on
Cortex-A53 parts up to r0p4.
If unsure, say Y.
config ARM64_LD_HAS_FIX_ERRATUM_843419
def_bool $(ld-option,--fix-cortex-a53-843419)
config ARM64_ERRATUM_1024718
bool "Cortex-A55: 1024718: Update of DBM/AP bits without break before make might result in incorrect update"
default y
help
This option adds a workaround for ARM Cortex-A55 Erratum 1024718.
Affected Cortex-A55 cores (all revisions) could cause incorrect
update of the hardware dirty bit when the DBM/AP bits are updated
without a break-before-make. The workaround is to disable the usage
of hardware DBM locally on the affected cores. CPUs not affected by
this erratum will continue to use the feature.
If unsure, say Y.
config ARM64_ERRATUM_1418040
bool "Cortex-A76/Neoverse-N1: MRC read following MRRC read of specific Generic Timer in AArch32 might give incorrect result"
default y
depends on COMPAT
help
This option adds a workaround for ARM Cortex-A76/Neoverse-N1
errata 1188873 and 1418040.
Affected Cortex-A76/Neoverse-N1 cores (r0p0 to r3p1) could
cause register corruption when accessing the timer registers
from AArch32 userspace.
If unsure, say Y.
config ARM64_WORKAROUND_SPECULATIVE_AT
bool
config ARM64_ERRATUM_1165522
bool "Cortex-A76: 1165522: Speculative AT instruction using out-of-context translation regime could cause subsequent request to generate an incorrect translation"
default y
select ARM64_WORKAROUND_SPECULATIVE_AT
help
This option adds a workaround for ARM Cortex-A76 erratum 1165522.
Affected Cortex-A76 cores (r0p0, r1p0, r2p0) could end-up with
corrupted TLBs by speculating an AT instruction during a guest
context switch.
If unsure, say Y.
config ARM64_ERRATUM_1319367
bool "Cortex-A57/A72: 1319537: Speculative AT instruction using out-of-context translation regime could cause subsequent request to generate an incorrect translation"
default y
select ARM64_WORKAROUND_SPECULATIVE_AT
help
This option adds work arounds for ARM Cortex-A57 erratum 1319537
and A72 erratum 1319367
Cortex-A57 and A72 cores could end-up with corrupted TLBs by
speculating an AT instruction during a guest context switch.
If unsure, say Y.
config ARM64_ERRATUM_1530923
bool "Cortex-A55: 1530923: Speculative AT instruction using out-of-context translation regime could cause subsequent request to generate an incorrect translation"
default y
select ARM64_WORKAROUND_SPECULATIVE_AT
help
This option adds a workaround for ARM Cortex-A55 erratum 1530923.
Affected Cortex-A55 cores (r0p0, r0p1, r1p0, r2p0) could end-up with
corrupted TLBs by speculating an AT instruction during a guest
context switch.
If unsure, say Y.
config ARM64_WORKAROUND_REPEAT_TLBI
bool
config ARM64_ERRATUM_1286807
bool "Cortex-A76: Modification of the translation table for a virtual address might lead to read-after-read ordering violation"
default y
select ARM64_WORKAROUND_REPEAT_TLBI
help
This option adds a workaround for ARM Cortex-A76 erratum 1286807.
On the affected Cortex-A76 cores (r0p0 to r3p0), if a virtual
address for a cacheable mapping of a location is being
accessed by a core while another core is remapping the virtual
address to a new physical page using the recommended
break-before-make sequence, then under very rare circumstances
TLBI+DSB completes before a read using the translation being
invalidated has been observed by other observers. The
workaround repeats the TLBI+DSB operation.
config ARM64_ERRATUM_1463225
bool "Cortex-A76: Software Step might prevent interrupt recognition"
default y
help
This option adds a workaround for Arm Cortex-A76 erratum 1463225.
On the affected Cortex-A76 cores (r0p0 to r3p1), software stepping
of a system call instruction (SVC) can prevent recognition of
subsequent interrupts when software stepping is disabled in the
exception handler of the system call and either kernel debugging
is enabled or VHE is in use.
Work around the erratum by triggering a dummy step exception
when handling a system call from a task that is being stepped
in a VHE configuration of the kernel.
If unsure, say Y.
config ARM64_ERRATUM_1542419
bool "Neoverse-N1: workaround mis-ordering of instruction fetches"
default y
help
This option adds a workaround for ARM Neoverse-N1 erratum
1542419.
Affected Neoverse-N1 cores could execute a stale instruction when
modified by another CPU. The workaround depends on a firmware
counterpart.
Workaround the issue by hiding the DIC feature from EL0. This
forces user-space to perform cache maintenance.
If unsure, say Y.
config ARM64_ERRATUM_1508412
bool "Cortex-A77: 1508412: workaround deadlock on sequence of NC/Device load and store exclusive or PAR read"
default y
help
This option adds a workaround for Arm Cortex-A77 erratum 1508412.
Affected Cortex-A77 cores (r0p0, r1p0) could deadlock on a sequence
of a store-exclusive or read of PAR_EL1 and a load with device or
non-cacheable memory attributes. The workaround depends on a firmware
counterpart.
KVM guests must also have the workaround implemented or they can
deadlock the system.
Work around the issue by inserting DMB SY barriers around PAR_EL1
register reads and warning KVM users. The DMB barrier is sufficient
to prevent a speculative PAR_EL1 read.
If unsure, say Y.
config CAVIUM_ERRATUM_22375
bool "Cavium erratum 22375, 24313"
default y
help
Enable workaround for errata 22375 and 24313.
This implements two gicv3-its errata workarounds for ThunderX. Both
with a small impact affecting only ITS table allocation.
erratum 22375: only alloc 8MB table size
erratum 24313: ignore memory access type
The fixes are in ITS initialization and basically ignore memory access
type and table size provided by the TYPER and BASER registers.
If unsure, say Y.
config CAVIUM_ERRATUM_23144
bool "Cavium erratum 23144: ITS SYNC hang on dual socket system"
depends on NUMA
default y
help
ITS SYNC command hang for cross node io and collections/cpu mapping.
If unsure, say Y.
config CAVIUM_ERRATUM_23154
bool "Cavium erratum 23154: Access to ICC_IAR1_EL1 is not sync'ed"
default y
help
The gicv3 of ThunderX requires a modified version for
reading the IAR status to ensure data synchronization
(access to icc_iar1_el1 is not sync'ed before and after).
If unsure, say Y.
config CAVIUM_ERRATUM_27456
bool "Cavium erratum 27456: Broadcast TLBI instructions may cause icache corruption"
default y
help
On ThunderX T88 pass 1.x through 2.1 parts, broadcast TLBI
instructions may cause the icache to become corrupted if it
contains data for a non-current ASID. The fix is to
invalidate the icache when changing the mm context.
If unsure, say Y.
config CAVIUM_ERRATUM_30115
bool "Cavium erratum 30115: Guest may disable interrupts in host"
default y
help
On ThunderX T88 pass 1.x through 2.2, T81 pass 1.0 through
1.2, and T83 Pass 1.0, KVM guest execution may disable
interrupts in host. Trapping both GICv3 group-0 and group-1
accesses sidesteps the issue.
If unsure, say Y.
config CAVIUM_TX2_ERRATUM_219
bool "Cavium ThunderX2 erratum 219: PRFM between TTBR change and ISB fails"
default y
help
On Cavium ThunderX2, a load, store or prefetch instruction between a
TTBR update and the corresponding context synchronizing operation can
cause a spurious Data Abort to be delivered to any hardware thread in
the CPU core.
Work around the issue by avoiding the problematic code sequence and
trapping KVM guest TTBRx_EL1 writes to EL2 when SMT is enabled. The
trap handler performs the corresponding register access, skips the
instruction and ensures context synchronization by virtue of the
exception return.
If unsure, say Y.
config FUJITSU_ERRATUM_010001
bool "Fujitsu-A64FX erratum E#010001: Undefined fault may occur wrongly"
default y
help
This option adds a workaround for Fujitsu-A64FX erratum E#010001.
On some variants of the Fujitsu-A64FX cores ver(1.0, 1.1), memory
accesses may cause undefined fault (Data abort, DFSC=0b111111).
This fault occurs under a specific hardware condition when a
load/store instruction performs an address translation using:
case-1 TTBR0_EL1 with TCR_EL1.NFD0 == 1.
case-2 TTBR0_EL2 with TCR_EL2.NFD0 == 1.
case-3 TTBR1_EL1 with TCR_EL1.NFD1 == 1.
case-4 TTBR1_EL2 with TCR_EL2.NFD1 == 1.
The workaround is to ensure these bits are clear in TCR_ELx.
The workaround only affects the Fujitsu-A64FX.
If unsure, say Y.
config HISILICON_ERRATUM_161600802
bool "Hip07 161600802: Erroneous redistributor VLPI base"
default y
help
The HiSilicon Hip07 SoC uses the wrong redistributor base
when issued ITS commands such as VMOVP and VMAPP, and requires
a 128kB offset to be applied to the target address in this commands.
If unsure, say Y.
arm64: Work around Falkor erratum 1003 The Qualcomm Datacenter Technologies Falkor v1 CPU may allocate TLB entries using an incorrect ASID when TTBRx_EL1 is being updated. When the erratum is triggered, page table entries using the new translation table base address (BADDR) will be allocated into the TLB using the old ASID. All circumstances leading to the incorrect ASID being cached in the TLB arise when software writes TTBRx_EL1[ASID] and TTBRx_EL1[BADDR], a memory operation is in the process of performing a translation using the specific TTBRx_EL1 being written, and the memory operation uses a translation table descriptor designated as non-global. EL2 and EL3 code changing the EL1&0 ASID is not subject to this erratum because hardware is prohibited from performing translations from an out-of-context translation regime. Consider the following pseudo code. write new BADDR and ASID values to TTBRx_EL1 Replacing the above sequence with the one below will ensure that no TLB entries with an incorrect ASID are used by software. write reserved value to TTBRx_EL1[ASID] ISB write new value to TTBRx_EL1[BADDR] ISB write new value to TTBRx_EL1[ASID] ISB When the above sequence is used, page table entries using the new BADDR value may still be incorrectly allocated into the TLB using the reserved ASID. Yet this will not reduce functionality, since TLB entries incorrectly tagged with the reserved ASID will never be hit by a later instruction. Based on work by Shanker Donthineni <shankerd@codeaurora.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-02-08 20:08:37 +00:00
config QCOM_FALKOR_ERRATUM_1003
bool "Falkor E1003: Incorrect translation due to ASID change"
default y
help
On Falkor v1, an incorrect ASID may be cached in the TLB when ASID
and BADDR are changed together in TTBRx_EL1. Since we keep the ASID
in TTBR1_EL1, this situation only occurs in the entry trampoline and
then only for entries in the walk cache, since the leaf translation
is unchanged. Work around the erratum by invalidating the walk cache
entries for the trampoline before entering the kernel proper.
arm64: Work around Falkor erratum 1003 The Qualcomm Datacenter Technologies Falkor v1 CPU may allocate TLB entries using an incorrect ASID when TTBRx_EL1 is being updated. When the erratum is triggered, page table entries using the new translation table base address (BADDR) will be allocated into the TLB using the old ASID. All circumstances leading to the incorrect ASID being cached in the TLB arise when software writes TTBRx_EL1[ASID] and TTBRx_EL1[BADDR], a memory operation is in the process of performing a translation using the specific TTBRx_EL1 being written, and the memory operation uses a translation table descriptor designated as non-global. EL2 and EL3 code changing the EL1&0 ASID is not subject to this erratum because hardware is prohibited from performing translations from an out-of-context translation regime. Consider the following pseudo code. write new BADDR and ASID values to TTBRx_EL1 Replacing the above sequence with the one below will ensure that no TLB entries with an incorrect ASID are used by software. write reserved value to TTBRx_EL1[ASID] ISB write new value to TTBRx_EL1[BADDR] ISB write new value to TTBRx_EL1[ASID] ISB When the above sequence is used, page table entries using the new BADDR value may still be incorrectly allocated into the TLB using the reserved ASID. Yet this will not reduce functionality, since TLB entries incorrectly tagged with the reserved ASID will never be hit by a later instruction. Based on work by Shanker Donthineni <shankerd@codeaurora.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-02-08 20:08:37 +00:00
config QCOM_FALKOR_ERRATUM_1009
bool "Falkor E1009: Prematurely complete a DSB after a TLBI"
default y
select ARM64_WORKAROUND_REPEAT_TLBI
help
On Falkor v1, the CPU may prematurely complete a DSB following a
TLBI xxIS invalidate maintenance operation. Repeat the TLBI operation
one more time to fix the issue.
If unsure, say Y.
config QCOM_QDF2400_ERRATUM_0065
bool "QDF2400 E0065: Incorrect GITS_TYPER.ITT_Entry_size"
default y
help
On Qualcomm Datacenter Technologies QDF2400 SoC, ITS hardware reports
ITE size incorrectly. The GITS_TYPER.ITT_Entry_size field should have
been indicated as 16Bytes (0xf), not 8Bytes (0x7).
If unsure, say Y.
arm64: Add software workaround for Falkor erratum 1041 The ARM architecture defines the memory locations that are permitted to be accessed as the result of a speculative instruction fetch from an exception level for which all stages of translation are disabled. Specifically, the core is permitted to speculatively fetch from the 4KB region containing the current program counter 4K and next 4K. When translation is changed from enabled to disabled for the running exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the Falkor core may errantly speculatively access memory locations outside of the 4KB region permitted by the architecture. The errant memory access may lead to one of the following unexpected behaviors. 1) A System Error Interrupt (SEI) being raised by the Falkor core due to the errant memory access attempting to access a region of memory that is protected by a slave-side memory protection unit. 2) Unpredictable device behavior due to a speculative read from device memory. This behavior may only occur if the instruction cache is disabled prior to or coincident with translation being changed from enabled to disabled. The conditions leading to this erratum will not occur when either of the following occur: 1) A higher exception level disables translation of a lower exception level (e.g. EL2 changing SCTLR_EL1[M] from a value of 1 to 0). 2) An exception level disabling its stage-1 translation if its stage-2 translation is enabled (e.g. EL1 changing SCTLR_EL1[M] from a value of 1 to 0 when HCR_EL2[VM] has a value of 1). To avoid the errant behavior, software must execute an ISB immediately prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0. Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-12-11 22:42:32 +00:00
config QCOM_FALKOR_ERRATUM_E1041
bool "Falkor E1041: Speculative instruction fetches might cause errant memory access"
default y
help
Falkor CPU may speculatively fetch instructions from an improper
memory location when MMU translation is changed from SCTLR_ELn[M]=1
to SCTLR_ELn[M]=0. Prefix an ISB instruction to fix the problem.
If unsure, say Y.
config NVIDIA_CARMEL_CNP_ERRATUM
bool "NVIDIA Carmel CNP: CNP on Carmel semantically different than ARM cores"
default y
help
If CNP is enabled on Carmel cores, non-sharable TLBIs on a core will not
invalidate shared TLB entries installed by a different core, as it would
on standard ARM cores.
If unsure, say Y.
config SOCIONEXT_SYNQUACER_PREITS
bool "Socionext Synquacer: Workaround for GICv3 pre-ITS"
default y
help
Socionext Synquacer SoCs implement a separate h/w block to generate
MSI doorbell writes with non-zero values for the device ID.
If unsure, say Y.
endmenu
choice
prompt "Page size"
default ARM64_4K_PAGES
help
Page size (translation granule) configuration.
config ARM64_4K_PAGES
bool "4KB"
help
This feature enables 4KB pages support.
config ARM64_16K_PAGES
bool "16KB"
help
The system will use 16KB pages support. AArch32 emulation
requires applications compiled with 16K (or a multiple of 16K)
aligned segments.
config ARM64_64K_PAGES
bool "64KB"
help
This feature enables 64KB pages support (4KB by default)
allowing only two levels of page tables and faster TLB
look-up. AArch32 emulation requires applications compiled
with 64K aligned segments.
endchoice
choice
prompt "Virtual address space size"
default ARM64_VA_BITS_39 if ARM64_4K_PAGES
default ARM64_VA_BITS_47 if ARM64_16K_PAGES
default ARM64_VA_BITS_42 if ARM64_64K_PAGES
help
Allows choosing one of multiple possible virtual address
space sizes. The level of translation table is determined by
a combination of page size and virtual address space size.
config ARM64_VA_BITS_36
bool "36-bit" if EXPERT
depends on ARM64_16K_PAGES
config ARM64_VA_BITS_39
bool "39-bit"
depends on ARM64_4K_PAGES
config ARM64_VA_BITS_42
bool "42-bit"
depends on ARM64_64K_PAGES
config ARM64_VA_BITS_47
bool "47-bit"
depends on ARM64_16K_PAGES
arm64: mm: Implement 4 levels of translation tables This patch implements 4 levels of translation tables since 3 levels of page tables with 4KB pages cannot support 40-bit physical address space described in [1] due to the following issue. It is a restriction that kernel logical memory map with 4KB + 3 levels (0xffffffc000000000-0xffffffffffffffff) cannot cover RAM region from 544GB to 1024GB in [1]. Specifically, ARM64 kernel fails to create mapping for this region in map_mem function since __phys_to_virt for this region reaches to address overflow. If SoC design follows the document, [1], over 32GB RAM would be placed from 544GB. Even 64GB system is supposed to use the region from 544GB to 576GB for only 32GB RAM. Naturally, it would reach to enable 4 levels of page tables to avoid hacking __virt_to_phys and __phys_to_virt. However, it is recommended 4 levels of page table should be only enabled if memory map is too sparse or there is about 512GB RAM. References ---------- [1]: Principles of ARM Memory Maps, White Paper, Issue C Signed-off-by: Jungseok Lee <jays.lee@samsung.com> Reviewed-by: Sungjinn Chung <sungjinn.chung@samsung.com> Acked-by: Kukjin Kim <kgene.kim@samsung.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Steve Capper <steve.capper@linaro.org> [catalin.marinas@arm.com: MEMBLOCK_INITIAL_LIMIT removed, same as PUD_SIZE] [catalin.marinas@arm.com: early_ioremap_init() updated for 4 levels] [catalin.marinas@arm.com: 48-bit VA depends on BROKEN until KVM is fixed] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Jungseok Lee <jungseoklee85@gmail.com>
2014-05-12 09:40:51 +00:00
config ARM64_VA_BITS_48
bool "48-bit"
config ARM64_VA_BITS_52
bool "52-bit"
depends on ARM64_64K_PAGES && (ARM64_PAN || !ARM64_SW_TTBR0_PAN)
help
Enable 52-bit virtual addressing for userspace when explicitly
requested via a hint to mmap(). The kernel will also use 52-bit
virtual addresses for its own mappings (provided HW support for
this feature is available, otherwise it reverts to 48-bit).
NOTE: Enabling 52-bit virtual addressing in conjunction with
ARMv8.3 Pointer Authentication will result in the PAC being
reduced from 7 bits to 3 bits, which may have a significant
impact on its susceptibility to brute-force attacks.
If unsure, select 48-bit virtual addressing instead.
endchoice
config ARM64_FORCE_52BIT
bool "Force 52-bit virtual addresses for userspace"
depends on ARM64_VA_BITS_52 && EXPERT
help
For systems with 52-bit userspace VAs enabled, the kernel will attempt
to maintain compatibility with older software by providing 48-bit VAs
unless a hint is supplied to mmap.
This configuration option disables the 48-bit compatibility logic, and
forces all userspace addresses to be 52-bit on HW that supports it. One
should only enable this configuration option for stress testing userspace
memory management code. If unsure say N here.
config ARM64_VA_BITS
int
default 36 if ARM64_VA_BITS_36
default 39 if ARM64_VA_BITS_39
default 42 if ARM64_VA_BITS_42
default 47 if ARM64_VA_BITS_47
default 48 if ARM64_VA_BITS_48
default 52 if ARM64_VA_BITS_52
choice
prompt "Physical address space size"
default ARM64_PA_BITS_48
help
Choose the maximum physical address range that the kernel will
support.
config ARM64_PA_BITS_48
bool "48-bit"
config ARM64_PA_BITS_52
bool "52-bit (ARMv8.2)"
depends on ARM64_64K_PAGES
depends on ARM64_PAN || !ARM64_SW_TTBR0_PAN
help
Enable support for a 52-bit physical address space, introduced as
part of the ARMv8.2-LPA extension.
With this enabled, the kernel will also continue to work on CPUs that
do not support ARMv8.2-LPA, but with some added memory overhead (and
minor performance overhead).
endchoice
config ARM64_PA_BITS
int
default 48 if ARM64_PA_BITS_48
default 52 if ARM64_PA_BITS_52
choice
prompt "Endianness"
default CPU_LITTLE_ENDIAN
help
Select the endianness of data accesses performed by the CPU. Userspace
applications will need to be compiled and linked for the endianness
that is selected here.
config CPU_BIG_ENDIAN
bool "Build big-endian kernel"
depends on !LD_IS_LLD || LLD_VERSION >= 130000
help
Say Y if you plan on running a kernel with a big-endian userspace.
config CPU_LITTLE_ENDIAN
bool "Build little-endian kernel"
help
Say Y if you plan on running a kernel with a little-endian userspace.
This is usually the case for distributions targeting arm64.
endchoice
config SCHED_MC
bool "Multi-core scheduler support"
help
Multi-core scheduler support improves the CPU scheduler's decision
making when dealing with multi-core CPU chips at a cost of slightly
increased overhead in some places. If unsure say N here.
config SCHED_SMT
bool "SMT scheduler support"
help
Improves the CPU scheduler's decision making when dealing with
MultiThreading at a cost of slightly increased overhead in some
places. If unsure say N here.
config NR_CPUS
int "Maximum number of CPUs (2-4096)"
range 2 4096
default "256"
config HOTPLUG_CPU
bool "Support for hot-pluggable CPUs"
select GENERIC_IRQ_MIGRATION
help
Say Y here to experiment with turning CPUs off and on. CPUs
can be controlled through /sys/devices/system/cpu.
# Common NUMA Features
config NUMA
bool "NUMA Memory Allocation and Scheduler Support"
select GENERIC_ARCH_NUMA
select ACPI_NUMA if ACPI
select OF_NUMA
help
Enable NUMA (Non-Uniform Memory Access) support.
The kernel will try to allocate memory used by a CPU on the
local memory of the CPU and add some more
NUMA awareness to the kernel.
config NODES_SHIFT
int "Maximum NUMA Nodes (as a power of 2)"
range 1 10
default "4"
depends on NEED_MULTIPLE_NODES
help
Specify the maximum number of NUMA Nodes available on the target
system. Increases memory reserved to accommodate various tables.
config USE_PERCPU_NUMA_NODE_ID
def_bool y
depends on NUMA
config HAVE_SETUP_PER_CPU_AREA
def_bool y
depends on NUMA
config NEED_PER_CPU_EMBED_FIRST_CHUNK
def_bool y
depends on NUMA
config HOLES_IN_ZONE
def_bool y
source "kernel/Kconfig.hz"
config ARCH_SPARSEMEM_ENABLE
def_bool y
select SPARSEMEM_VMEMMAP_ENABLE
config ARCH_SPARSEMEM_DEFAULT
def_bool ARCH_SPARSEMEM_ENABLE
config ARCH_SELECT_MEMORY_MODEL
def_bool ARCH_SPARSEMEM_ENABLE
config ARCH_FLATMEM_ENABLE
def_bool !NUMA
config HW_PERF_EVENTS
def_bool y
depends on ARM_PMU
config SYS_SUPPORTS_HUGETLBFS
def_bool y
config ARCH_HAS_CACHE_LINE_SIZE
def_bool y
config ARCH_HAS_FILTER_PGPROT
def_bool y
config ARCH_ENABLE_SPLIT_PMD_PTLOCK
def_bool y if PGTABLE_LEVELS > 2
# Supported by clang >= 7.0
config CC_HAVE_SHADOW_CALL_STACK
def_bool $(cc-option, -fsanitize=shadow-call-stack -ffixed-x18)
config PARAVIRT
bool "Enable paravirtualization code"
help
This changes the kernel so it can modify itself when it is run
under a hypervisor, potentially improving performance significantly
over full virtualization.
config PARAVIRT_TIME_ACCOUNTING
bool "Paravirtual steal time accounting"
select PARAVIRT
help
Select this option to enable fine granularity task steal time
accounting. Time spent executing other tasks in parallel with
the current vCPU is discounted from the vCPU power. To account for
that, there can be a small performance impact.
If in doubt, say N here.
config KEXEC
depends on PM_SLEEP_SMP
select KEXEC_CORE
bool "kexec system call"
help
kexec is a system call that implements the ability to shutdown your
current kernel, and to start another kernel. It is like a reboot
but it is independent of the system firmware. And like a reboot
you can start any kernel with it, not just Linux.
config KEXEC_FILE
bool "kexec file based system call"
select KEXEC_CORE
select HAVE_IMA_KEXEC if IMA
help
This is new version of kexec system call. This system call is
file based and takes file descriptors as system call argument
for kernel and initramfs as opposed to list of segments as
accepted by previous system call.
config KEXEC_SIG
bool "Verify kernel signature during kexec_file_load() syscall"
depends on KEXEC_FILE
help
Select this option to verify a signature with loaded kernel
image. If configured, any attempt of loading a image without
valid signature will fail.
In addition to that option, you need to enable signature
verification for the corresponding kernel image type being
loaded in order for this to work.
config KEXEC_IMAGE_VERIFY_SIG
bool "Enable Image signature verification support"
default y
depends on KEXEC_SIG
depends on EFI && SIGNED_PE_FILE_VERIFICATION
help
Enable Image signature verification support.
comment "Support for PE file signature verification disabled"
depends on KEXEC_SIG
depends on !EFI || !SIGNED_PE_FILE_VERIFICATION
config CRASH_DUMP
bool "Build kdump crash kernel"
help
Generate crash dump after being started by kexec. This should
be normally only set in special crash dump kernels which are
loaded in the main kernel with kexec-tools into a specially
reserved region and then later executed after a crash by
kdump/kexec.
For more details see Documentation/admin-guide/kdump/kdump.rst
config TRANS_TABLE
def_bool y
depends on HIBERNATION
config XEN_DOM0
def_bool y
depends on XEN
config XEN
bool "Xen guest support on ARM64"
depends on ARM64 && OF
xen/arm,arm64: enable SWIOTLB_XEN Xen on arm and arm64 needs SWIOTLB_XEN: when running on Xen we need to program the hardware with mfns rather than pfns for dma addresses. Remove SWIOTLB_XEN dependency on X86 and PCI and make XEN select SWIOTLB_XEN on arm and arm64. At the moment always rely on swiotlb-xen, but when Xen starts supporting hardware IOMMUs we'll be able to avoid it conditionally on the presence of an IOMMU on the platform. Implement xen_create_contiguous_region on arm and arm64: for the moment we assume that dom0 has been mapped 1:1 (physical addresses == machine addresses) therefore we don't need to call XENMEM_exchange. Simply return the physical address as dma address. Initialize the xen-swiotlb from xen_early_init (before the native dma_ops are initialized), set xen_dma_ops to &xen_swiotlb_dma_ops. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Changes in v8: - assume dom0 is mapped 1:1, no need to call XENMEM_exchange. Changes in v7: - call __set_phys_to_machine_multi from xen_create_contiguous_region and xen_destroy_contiguous_region to update the P2M; - don't call XENMEM_unpin, it has been removed; - call XENMEM_exchange instead of XENMEM_exchange_and_pin; - set nr_exchanged to 0 before calling the hypercall. Changes in v6: - introduce and export xen_dma_ops; - call xen_mm_init from as arch_initcall. Changes in v4: - remove redefinition of DMA_ERROR_CODE; - update the code to use XENMEM_exchange_and_pin and XENMEM_unpin; - add a note about hardware IOMMU in the commit message. Changes in v3: - code style changes; - warn on XENMEM_put_dma_buf failures.
2013-10-10 13:40:44 +00:00
select SWIOTLB_XEN
select PARAVIRT
help
Say Y if you want to run Linux in a Virtual Machine on Xen on ARM64.
config FORCE_MAX_ZONEORDER
int
arm64/mm: Drop THP conditionality from FORCE_MAX_ZONEORDER Currently without THP being enabled, MAX_ORDER via FORCE_MAX_ZONEORDER gets reduced to 11, which falls below HUGETLB_PAGE_ORDER for certain 16K and 64K page size configurations. This is problematic which throws up the following warning during boot as pageblock_order via HUGETLB_PAGE_ORDER order exceeds MAX_ORDER. WARNING: CPU: 7 PID: 127 at mm/vmstat.c:1092 __fragmentation_index+0x58/0x70 Modules linked in: CPU: 7 PID: 127 Comm: kswapd0 Not tainted 5.12.0-rc1-00005-g0221e3101a1 #237 Hardware name: linux,dummy-virt (DT) pstate: 20400005 (nzCv daif +PAN -UAO -TCO BTYPE=--) pc : __fragmentation_index+0x58/0x70 lr : fragmentation_index+0x88/0xa8 sp : ffff800016ccfc00 x29: ffff800016ccfc00 x28: 0000000000000000 x27: ffff800011fd4000 x26: 0000000000000002 x25: ffff800016ccfda0 x24: 0000000000000002 x23: 0000000000000640 x22: ffff0005ffcb5b18 x21: 0000000000000002 x20: 000000000000000d x19: ffff0005ffcb3980 x18: 0000000000000004 x17: 0000000000000001 x16: 0000000000000019 x15: ffff800011ca7fb8 x14: 00000000000002b3 x13: 0000000000000000 x12: 00000000000005e0 x11: 0000000000000003 x10: 0000000000000080 x9 : ffff800011c93948 x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000007000 x5 : 0000000000007944 x4 : 0000000000000032 x3 : 000000000000001c x2 : 000000000000000b x1 : ffff800016ccfc10 x0 : 000000000000000d Call trace: __fragmentation_index+0x58/0x70 compaction_suitable+0x58/0x78 wakeup_kcompactd+0x8c/0xd8 balance_pgdat+0x570/0x5d0 kswapd+0x1e0/0x388 kthread+0x154/0x158 ret_from_fork+0x10/0x30 This solves the problem via keeping FORCE_MAX_ZONEORDER unchanged with or without THP on 16K and 64K page size configurations, making sure that the HUGETLB_PAGE_ORDER (and pageblock_order) would never exceed MAX_ORDER. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/1614597914-28565-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-01 11:25:14 +00:00
default "14" if ARM64_64K_PAGES
default "12" if ARM64_16K_PAGES
default "11"
help
The kernel memory allocator divides physically contiguous memory
blocks into "zones", where each zone is a power of two number of
pages. This option selects the largest power of two that the kernel
keeps in the memory allocator. If you need to allocate very large
blocks of physically contiguous memory, then you may need to
increase this value.
This config option is actually maximum order plus one. For example,
a value of 11 means that the largest free memory block is 2^10 pages.
We make sure that we can allocate upto a HugePage size for each configuration.
Hence we have :
MAX_ORDER = (PMD_SHIFT - PAGE_SHIFT) + 1 => PAGE_SHIFT - 2
However for 4K, we choose a higher default value, 11 as opposed to 10, giving us
4M allocations matching the default size used by generic code.
config UNMAP_KERNEL_AT_EL0
bool "Unmap kernel when running in userspace (aka \"KAISER\")" if EXPERT
default y
help
Speculation attacks against some high-performance processors can
be used to bypass MMU permission checks and leak kernel data to
userspace. This can be defended against by unmapping the kernel
when running in userspace, mapping it back in on exception entry
via a trampoline page in the vector table.
If unsure, say Y.
config RODATA_FULL_DEFAULT_ENABLED
bool "Apply r/o permissions of VM areas also to their linear aliases"
default y
help
Apply read-only attributes of VM areas to the linear alias of
the backing pages as well. This prevents code or read-only data
from being modified (inadvertently or intentionally) via another
mapping of the same memory page. This additional enhancement can
be turned off at runtime by passing rodata=[off|on] (and turned on
with rodata=full if this option is set to 'n')
This requires the linear region to be mapped down to pages,
which may adversely affect performance in some cases.
config ARM64_SW_TTBR0_PAN
bool "Emulate Privileged Access Never using TTBR0_EL1 switching"
help
Enabling this option prevents the kernel from accessing
user-space memory directly by pointing TTBR0_EL1 to a reserved
zeroed area and reserved ASID. The user access routines
restore the valid TTBR0_EL1 temporarily.
config ARM64_TAGGED_ADDR_ABI
bool "Enable the tagged user addresses syscall ABI"
default y
help
When this option is enabled, user applications can opt in to a
relaxed ABI via prctl() allowing tagged addresses to be passed
to system calls as pointer arguments. For details, see
Documentation/arm64/tagged-address-abi.rst.
menuconfig COMPAT
bool "Kernel support for 32-bit EL0"
depends on ARM64_4K_PAGES || EXPERT
select HAVE_UID16
select OLD_SIGSUSPEND3
select COMPAT_OLD_SIGACTION
help
This option enables support for a 32-bit EL0 running under a 64-bit
kernel at EL1. AArch32-specific components such as system calls,
the user helper functions, VFP support and the ptrace interface are
handled appropriately by the kernel.
If you use a page size other than 4KB (i.e, 16KB or 64KB), please be aware
that you will only be able to execute AArch32 binaries that were compiled
with page size aligned segments.
If you want to execute 32-bit userspace applications, say Y.
if COMPAT
config KUSER_HELPERS
bool "Enable kuser helpers page for 32-bit applications"
default y
help
Warning: disabling this option may break 32-bit user programs.
Provide kuser helpers to compat tasks. The kernel provides
helper code to userspace in read only form at a fixed location
to allow userspace to be independent of the CPU type fitted to
the system. This permits binaries to be run on ARMv4 through
to ARMv8 without modification.
See Documentation/arm/kernel_user_helpers.rst for details.
However, the fixed address nature of these helpers can be used
by ROP (return orientated programming) authors when creating
exploits.
If all of the binaries and libraries which run on your platform
are built specifically for your platform, and make no use of
these helpers, then you can turn this option off to hinder
such exploits. However, in that case, if a binary or library
relying on those helpers is run, it will not function correctly.
Say N here only if you are absolutely certain that you do not
need these helpers; otherwise, the safe option is to say Y.
config COMPAT_VDSO
bool "Enable vDSO for 32-bit applications"
depends on !CPU_BIG_ENDIAN && "$(CROSS_COMPILE_COMPAT)" != ""
select GENERIC_COMPAT_VDSO
default y
help
Place in the process address space of 32-bit applications an
ELF shared object providing fast implementations of gettimeofday
and clock_gettime.
You must have a 32-bit build of glibc 2.22 or later for programs
to seamlessly take advantage of this.
config THUMB2_COMPAT_VDSO
bool "Compile the 32-bit vDSO for Thumb-2 mode" if EXPERT
depends on COMPAT_VDSO
default y
help
Compile the compat vDSO with '-mthumb -fomit-frame-pointer' if y,
otherwise with '-marm'.
menuconfig ARMV8_DEPRECATED
bool "Emulate deprecated/obsolete ARMv8 instructions"
depends on SYSCTL
help
Legacy software support may require certain instructions
that have been deprecated or obsoleted in the architecture.
Enable this config to enable selective emulation of these
features.
If unsure, say Y
if ARMV8_DEPRECATED
config SWP_EMULATION
bool "Emulate SWP/SWPB instructions"
help
ARMv8 obsoletes the use of A32 SWP/SWPB instructions such that
they are always undefined. Say Y here to enable software
emulation of these instructions for userspace using LDXR/STXR.
This feature can be controlled at runtime with the abi.swp
sysctl which is disabled by default.
In some older versions of glibc [<=2.8] SWP is used during futex
trylock() operations with the assumption that the code will not
be preempted. This invalid assumption may be more likely to fail
with SWP emulation enabled, leading to deadlock of the user
application.
NOTE: when accessing uncached shared regions, LDXR/STXR rely
on an external transaction monitoring block called a global
monitor to maintain update atomicity. If your system does not
implement a global monitor, this option can cause programs that
perform SWP operations to uncached memory to deadlock.
If unsure, say Y
config CP15_BARRIER_EMULATION
bool "Emulate CP15 Barrier instructions"
help
The CP15 barrier instructions - CP15ISB, CP15DSB, and
CP15DMB - are deprecated in ARMv8 (and ARMv7). It is
strongly recommended to use the ISB, DSB, and DMB
instructions instead.
Say Y here to enable software emulation of these
instructions for AArch32 userspace code. When this option is
enabled, CP15 barrier usage is traced which can help
identify software that needs updating. This feature can be
controlled at runtime with the abi.cp15_barrier sysctl.
If unsure, say Y
config SETEND_EMULATION
bool "Emulate SETEND instruction"
help
The SETEND instruction alters the data-endianness of the
AArch32 EL0, and is deprecated in ARMv8.
Say Y here to enable software emulation of the instruction
for AArch32 userspace code. This feature can be controlled
at runtime with the abi.setend sysctl.
Note: All the cpus on the system must have mixed endian support at EL0
for this feature to be enabled. If a new CPU - which doesn't support mixed
endian - is hotplugged in after this feature has been enabled, there could
be unexpected results in the applications.
If unsure, say Y
endif
endif
menu "ARMv8.1 architectural features"
config ARM64_HW_AFDBM
bool "Support for hardware updates of the Access and Dirty page flags"
default y
help
The ARMv8.1 architecture extensions introduce support for
hardware updates of the access and dirty information in page
table entries. When enabled in TCR_EL1 (HA and HD bits) on
capable processors, accesses to pages with PTE_AF cleared will
set this bit instead of raising an access flag fault.
Similarly, writes to read-only pages with the DBM bit set will
clear the read-only bit (AP[2]) instead of raising a
permission fault.
Kernels built with this configuration option enabled continue
to work on pre-ARMv8.1 hardware and the performance impact is
minimal. If unsure, say Y.
config ARM64_PAN
bool "Enable support for Privileged Access Never (PAN)"
default y
help
Privileged Access Never (PAN; part of the ARMv8.1 Extensions)
prevents the kernel or hypervisor from accessing user-space (EL0)
memory directly.
Choosing this option will cause any unprotected (not using
copy_to_user et al) memory access to fail with a permission fault.
The feature is detected at runtime, and will remain as a 'nop'
instruction if the cpu does not implement the feature.
config AS_HAS_LDAPR
def_bool $(as-instr,.arch_extension rcpc)
config AS_HAS_LSE_ATOMICS
def_bool $(as-instr,.arch_extension lse)
config ARM64_LSE_ATOMICS
bool
default ARM64_USE_LSE_ATOMICS
depends on AS_HAS_LSE_ATOMICS
config ARM64_USE_LSE_ATOMICS
bool "Atomic instructions"
arm64: lse: Make ARM64_LSE_ATOMICS depend on JUMP_LABEL Support for LSE atomic instructions (CONFIG_ARM64_LSE_ATOMICS) relies on a static key to select between the legacy LL/SC implementation which is available on all arm64 CPUs and the super-duper LSE implementation which is available on CPUs implementing v8.1 and later. Unfortunately, when building a kernel with CONFIG_JUMP_LABEL disabled (e.g. because the toolchain doesn't support 'asm goto'), the static key inside the atomics code tries to use atomics itself. This results in a mess of circular includes and a build failure: In file included from ./arch/arm64/include/asm/lse.h:11, from ./arch/arm64/include/asm/atomic.h:16, from ./include/linux/atomic.h:7, from ./include/asm-generic/bitops/atomic.h:5, from ./arch/arm64/include/asm/bitops.h:26, from ./include/linux/bitops.h:19, from ./include/linux/kernel.h:12, from ./include/asm-generic/bug.h:18, from ./arch/arm64/include/asm/bug.h:26, from ./include/linux/bug.h:5, from ./include/linux/page-flags.h:10, from kernel/bounds.c:10: ./include/linux/jump_label.h: In function ‘static_key_count’: ./include/linux/jump_label.h:254:9: error: implicit declaration of function ‘atomic_read’ [-Werror=implicit-function-declaration] return atomic_read(&key->enabled); ^~~~~~~~~~~ [ ... more of the same ... ] Since LSE atomic instructions are not critical to the operation of the kernel, make them depend on JUMP_LABEL at compile time. Reviewed-by: Andrew Murray <andrew.murray@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-29 10:52:47 +00:00
depends on JUMP_LABEL
default y
help
As part of the Large System Extensions, ARMv8.1 introduces new
atomic instructions that are designed specifically to scale in
very large systems.
Say Y here to make use of these instructions for the in-kernel
atomic routines. This incurs a small overhead on CPUs that do
not support these instructions and requires the kernel to be
built with binutils >= 2.25 in order for the new instructions
to be used.
endmenu
menu "ARMv8.2 architectural features"
config ARM64_PMEM
bool "Enable support for persistent memory"
select ARCH_HAS_PMEM_API
select ARCH_HAS_UACCESS_FLUSHCACHE
help
Say Y to enable support for the persistent memory API based on the
ARMv8.2 DCPoP feature.
The feature is detected at runtime, and the kernel will use DC CVAC
operations if DC CVAP is not supported (following the behaviour of
DC CVAP itself if the system does not define a point of persistence).
config ARM64_RAS_EXTN
bool "Enable support for RAS CPU Extensions"
default y
help
CPUs that support the Reliability, Availability and Serviceability
(RAS) Extensions, part of ARMv8.2 are able to track faults and
errors, classify them and report them to software.
On CPUs with these extensions system software can use additional
barriers to determine if faults are pending and read the
classification from a new set of registers.
Selecting this feature will allow the kernel to use these barriers
and access the new registers if the system supports the extension.
Platform RAS features may additionally depend on firmware support.
arm64: mm: Support Common Not Private translations Common Not Private (CNP) is a feature of ARMv8.2 extension which allows translation table entries to be shared between different PEs in the same inner shareable domain, so the hardware can use this fact to optimise the caching of such entries in the TLB. CNP occupies one bit in TTBRx_ELy and VTTBR_EL2, which advertises to the hardware that the translation table entries pointed to by this TTBR are the same as every PE in the same inner shareable domain for which the equivalent TTBR also has CNP bit set. In case CNP bit is set but TTBR does not point at the same translation table entries for a given ASID and VMID, then the system is mis-configured, so the results of translations are UNPREDICTABLE. For kernel we postpone setting CNP till all cpus are up and rely on cpufeature framework to 1) patch the code which is sensitive to CNP and 2) update TTBR1_EL1 with CNP bit set. TTBR1_EL1 can be reprogrammed as result of hibernation or cpuidle (via __enable_mmu). For these two cases we restore CnP bit via __cpu_suspend_exit(). There are a few cases we need to care of changes in TTBR0_EL1: - a switch to idmap - software emulated PAN we rule out latter via Kconfig options and for the former we make sure that CNP is set for non-zero ASIDs only. Reviewed-by: James Morse <james.morse@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> [catalin.marinas@arm.com: default y for CONFIG_ARM64_CNP] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-07-31 13:08:56 +00:00
config ARM64_CNP
bool "Enable support for Common Not Private (CNP) translations"
default y
depends on ARM64_PAN || !ARM64_SW_TTBR0_PAN
help
Common Not Private (CNP) allows translation table entries to
be shared between different PEs in the same inner shareable
domain, so the hardware can use this fact to optimise the
caching of such entries in the TLB.
Selecting this option allows the CNP feature to be detected
at runtime, and does not affect PEs that do not implement
this feature.
endmenu
menu "ARMv8.3 architectural features"
config ARM64_PTR_AUTH
bool "Enable support for pointer authentication"
default y
arm64: compile the kernel with ptrauth return address signing Compile all functions with two ptrauth instructions: PACIASP in the prologue to sign the return address, and AUTIASP in the epilogue to authenticate the return address (from the stack). If authentication fails, the return will cause an instruction abort to be taken, followed by an oops and killing the task. This should help protect the kernel against attacks using return-oriented programming. As ptrauth protects the return address, it can also serve as a replacement for CONFIG_STACKPROTECTOR, although note that it does not protect other parts of the stack. The new instructions are in the HINT encoding space, so on a system without ptrauth they execute as NOPs. CONFIG_ARM64_PTR_AUTH now not only enables ptrauth for userspace and KVM guests, but also automatically builds the kernel with ptrauth instructions if the compiler supports it. If there is no compiler support, we do not warn that the kernel was built without ptrauth instructions. GCC 7 and 8 support the -msign-return-address option, while GCC 9 deprecates that option and replaces it with -mbranch-protection. Support both options. Clang uses an external assembler hence this patch makes sure that the correct parameters (-march=armv8.3-a) are passed down to help it recognize the ptrauth instructions. Ftrace function tracer works properly with Ptrauth only when patchable-function-entry feature is present and is ensured by the Kconfig dependency. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Vincenzo Frascino <Vincenzo.Frascino@arm.com> # not co-dev parts Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> [Amit: Cover leaf function, comments, Ftrace Kconfig] Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-13 09:05:03 +00:00
depends on (CC_HAS_SIGN_RETURN_ADDRESS || CC_HAS_BRANCH_PROT_PAC_RET) && AS_HAS_PAC
arm64: Depend on newer binutils when building PAC Versions of binutils prior to 2.33.1 don't understand the ELF notes that are added by modern compilers to indicate the PAC and BTI options used to build the code. This causes them to emit large numbers of warnings in the form: aarch64-linux-gnu-nm: warning: .tmp_vmlinux.kallsyms2: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0000000 during the kernel build which is currently causing quite a bit of disruption for automated build testing using clang. In commit 15cd0e675f3f76b (arm64: Kconfig: ptrauth: Add binutils version check to fix mismatch) we added a dependency on binutils to avoid this issue when building with versions of GCC that emit the notes but did not do so for clang as it was believed that the existing check for .cfi_negate_ra_state was already requiring a new enough binutils. This does not appear to be the case for some versions of binutils (eg, the binutils in Debian 10) so instead refactor so we require a new enough GNU binutils in all cases other than when we are using an old GCC version that does not emit notes. Other, more exotic, combinations of tools are possible such as using clang, lld and gas together are possible and may have further problems but rather than adding further version checks it looks like the most robust thing will be to just test that we can build cleanly with the configured tools but that will require more review and discussion so do this for now to address the immediate problem disrupting build testing. Reported-by: KernelCI <bot@kernelci.org> Reported-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://github.com/ClangBuiltLinux/linux/issues/1054 Link: https://lore.kernel.org/r/20200619123550.48098-1-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-06-19 12:35:50 +00:00
# Modern compilers insert a .note.gnu.property section note for PAC
# which is only understood by binutils starting with version 2.33.1.
depends on LD_IS_LLD || LD_VERSION >= 23301 || (CC_IS_GCC && GCC_VERSION < 90100)
depends on !CC_IS_CLANG || AS_HAS_CFI_NEGATE_RA_STATE
arm64: compile the kernel with ptrauth return address signing Compile all functions with two ptrauth instructions: PACIASP in the prologue to sign the return address, and AUTIASP in the epilogue to authenticate the return address (from the stack). If authentication fails, the return will cause an instruction abort to be taken, followed by an oops and killing the task. This should help protect the kernel against attacks using return-oriented programming. As ptrauth protects the return address, it can also serve as a replacement for CONFIG_STACKPROTECTOR, although note that it does not protect other parts of the stack. The new instructions are in the HINT encoding space, so on a system without ptrauth they execute as NOPs. CONFIG_ARM64_PTR_AUTH now not only enables ptrauth for userspace and KVM guests, but also automatically builds the kernel with ptrauth instructions if the compiler supports it. If there is no compiler support, we do not warn that the kernel was built without ptrauth instructions. GCC 7 and 8 support the -msign-return-address option, while GCC 9 deprecates that option and replaces it with -mbranch-protection. Support both options. Clang uses an external assembler hence this patch makes sure that the correct parameters (-march=armv8.3-a) are passed down to help it recognize the ptrauth instructions. Ftrace function tracer works properly with Ptrauth only when patchable-function-entry feature is present and is ensured by the Kconfig dependency. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Vincenzo Frascino <Vincenzo.Frascino@arm.com> # not co-dev parts Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> [Amit: Cover leaf function, comments, Ftrace Kconfig] Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-13 09:05:03 +00:00
depends on (!FUNCTION_GRAPH_TRACER || DYNAMIC_FTRACE_WITH_REGS)
help
Pointer authentication (part of the ARMv8.3 Extensions) provides
instructions for signing and authenticating pointers against secret
keys, which can be used to mitigate Return Oriented Programming (ROP)
and other attacks.
This option enables these instructions at EL0 (i.e. for userspace).
Choosing this option will cause the kernel to initialise secret keys
for each process at exec() time, with these keys being
context-switched along with the process.
arm64: compile the kernel with ptrauth return address signing Compile all functions with two ptrauth instructions: PACIASP in the prologue to sign the return address, and AUTIASP in the epilogue to authenticate the return address (from the stack). If authentication fails, the return will cause an instruction abort to be taken, followed by an oops and killing the task. This should help protect the kernel against attacks using return-oriented programming. As ptrauth protects the return address, it can also serve as a replacement for CONFIG_STACKPROTECTOR, although note that it does not protect other parts of the stack. The new instructions are in the HINT encoding space, so on a system without ptrauth they execute as NOPs. CONFIG_ARM64_PTR_AUTH now not only enables ptrauth for userspace and KVM guests, but also automatically builds the kernel with ptrauth instructions if the compiler supports it. If there is no compiler support, we do not warn that the kernel was built without ptrauth instructions. GCC 7 and 8 support the -msign-return-address option, while GCC 9 deprecates that option and replaces it with -mbranch-protection. Support both options. Clang uses an external assembler hence this patch makes sure that the correct parameters (-march=armv8.3-a) are passed down to help it recognize the ptrauth instructions. Ftrace function tracer works properly with Ptrauth only when patchable-function-entry feature is present and is ensured by the Kconfig dependency. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Vincenzo Frascino <Vincenzo.Frascino@arm.com> # not co-dev parts Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> [Amit: Cover leaf function, comments, Ftrace Kconfig] Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-13 09:05:03 +00:00
If the compiler supports the -mbranch-protection or
-msign-return-address flag (e.g. GCC 7 or later), then this option
will also cause the kernel itself to be compiled with return address
protection. In this case, and if the target hardware is known to
support pointer authentication, then CONFIG_STACKPROTECTOR can be
disabled with minimal loss of protection.
The feature is detected at runtime. If the feature is not present in
KVM: arm/arm64: Context-switch ptrauth registers When pointer authentication is supported, a guest may wish to use it. This patch adds the necessary KVM infrastructure for this to work, with a semi-lazy context switch of the pointer auth state. Pointer authentication feature is only enabled when VHE is built in the kernel and present in the CPU implementation so only VHE code paths are modified. When we schedule a vcpu, we disable guest usage of pointer authentication instructions and accesses to the keys. While these are disabled, we avoid context-switching the keys. When we trap the guest trying to use pointer authentication functionality, we change to eagerly context-switching the keys, and enable the feature. The next time the vcpu is scheduled out/in, we start again. However the host key save is optimized and implemented inside ptrauth instruction/register access trap. Pointer authentication consists of address authentication and generic authentication, and CPUs in a system might have varied support for either. Where support for either feature is not uniform, it is hidden from guests via ID register emulation, as a result of the cpufeature framework in the host. Unfortunately, address authentication and generic authentication cannot be trapped separately, as the architecture provides a single EL2 trap covering both. If we wish to expose one without the other, we cannot prevent a (badly-written) guest from intermittently using a feature which is not uniformly supported (when scheduled on a physical CPU which supports the relevant feature). Hence, this patch expects both type of authentication to be present in a cpu. This switch of key is done from guest enter/exit assembly as preparation for the upcoming in-kernel pointer authentication support. Hence, these key switching routines are not implemented in C code as they may cause pointer authentication key signing error in some situations. Signed-off-by: Mark Rutland <mark.rutland@arm.com> [Only VHE, key switch in full assembly, vcpu_has_ptrauth checks , save host key in ptrauth exception trap] Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: kvmarm@lists.cs.columbia.edu [maz: various fixups] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-23 04:42:35 +00:00
hardware it will not be advertised to userspace/KVM guest nor will it
be enabled.
If the feature is present on the boot CPU but not on a late CPU, then
the late CPU will be parked. Also, if the boot CPU does not have
address auth and the late CPU has then the late CPU will still boot
but with the feature disabled. On such a system, this option should
not be selected.
arm64: compile the kernel with ptrauth return address signing Compile all functions with two ptrauth instructions: PACIASP in the prologue to sign the return address, and AUTIASP in the epilogue to authenticate the return address (from the stack). If authentication fails, the return will cause an instruction abort to be taken, followed by an oops and killing the task. This should help protect the kernel against attacks using return-oriented programming. As ptrauth protects the return address, it can also serve as a replacement for CONFIG_STACKPROTECTOR, although note that it does not protect other parts of the stack. The new instructions are in the HINT encoding space, so on a system without ptrauth they execute as NOPs. CONFIG_ARM64_PTR_AUTH now not only enables ptrauth for userspace and KVM guests, but also automatically builds the kernel with ptrauth instructions if the compiler supports it. If there is no compiler support, we do not warn that the kernel was built without ptrauth instructions. GCC 7 and 8 support the -msign-return-address option, while GCC 9 deprecates that option and replaces it with -mbranch-protection. Support both options. Clang uses an external assembler hence this patch makes sure that the correct parameters (-march=armv8.3-a) are passed down to help it recognize the ptrauth instructions. Ftrace function tracer works properly with Ptrauth only when patchable-function-entry feature is present and is ensured by the Kconfig dependency. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Vincenzo Frascino <Vincenzo.Frascino@arm.com> # not co-dev parts Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> [Amit: Cover leaf function, comments, Ftrace Kconfig] Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-13 09:05:03 +00:00
This feature works with FUNCTION_GRAPH_TRACER option only if
DYNAMIC_FTRACE_WITH_REGS is enabled.
config CC_HAS_BRANCH_PROT_PAC_RET
# GCC 9 or later, clang 8 or later
def_bool $(cc-option,-mbranch-protection=pac-ret+leaf)
config CC_HAS_SIGN_RETURN_ADDRESS
# GCC 7, 8
def_bool $(cc-option,-msign-return-address=all)
config AS_HAS_PAC
2020-06-14 14:43:41 +00:00
def_bool $(cc-option,-Wa$(comma)-march=armv8.3-a)
arm64: compile the kernel with ptrauth return address signing Compile all functions with two ptrauth instructions: PACIASP in the prologue to sign the return address, and AUTIASP in the epilogue to authenticate the return address (from the stack). If authentication fails, the return will cause an instruction abort to be taken, followed by an oops and killing the task. This should help protect the kernel against attacks using return-oriented programming. As ptrauth protects the return address, it can also serve as a replacement for CONFIG_STACKPROTECTOR, although note that it does not protect other parts of the stack. The new instructions are in the HINT encoding space, so on a system without ptrauth they execute as NOPs. CONFIG_ARM64_PTR_AUTH now not only enables ptrauth for userspace and KVM guests, but also automatically builds the kernel with ptrauth instructions if the compiler supports it. If there is no compiler support, we do not warn that the kernel was built without ptrauth instructions. GCC 7 and 8 support the -msign-return-address option, while GCC 9 deprecates that option and replaces it with -mbranch-protection. Support both options. Clang uses an external assembler hence this patch makes sure that the correct parameters (-march=armv8.3-a) are passed down to help it recognize the ptrauth instructions. Ftrace function tracer works properly with Ptrauth only when patchable-function-entry feature is present and is ensured by the Kconfig dependency. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Vincenzo Frascino <Vincenzo.Frascino@arm.com> # not co-dev parts Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> [Amit: Cover leaf function, comments, Ftrace Kconfig] Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-13 09:05:03 +00:00
config AS_HAS_CFI_NEGATE_RA_STATE
def_bool $(as-instr,.cfi_startproc\n.cfi_negate_ra_state\n.cfi_endproc\n)
endmenu
menu "ARMv8.4 architectural features"
config ARM64_AMU_EXTN
bool "Enable support for the Activity Monitors Unit CPU extension"
default y
help
The activity monitors extension is an optional extension introduced
by the ARMv8.4 CPU architecture. This enables support for version 1
of the activity monitors architecture, AMUv1.
To enable the use of this extension on CPUs that implement it, say Y.
Note that for architectural reasons, firmware _must_ implement AMU
support when running on CPUs that present the activity monitors
extension. The required support is present in:
* Version 1.5 and later of the ARM Trusted Firmware
For kernels that have this configuration enabled but boot with broken
firmware, you may need to say N here until the firmware is fixed.
Otherwise you may experience firmware panics or lockups when
accessing the counter registers. Even if you are not observing these
symptoms, the values returned by the register reads might not
correctly reflect reality. Most commonly, the value read will be 0,
indicating that the counter is not enabled.
config AS_HAS_ARMV8_4
def_bool $(cc-option,-Wa$(comma)-march=armv8.4-a)
config ARM64_TLB_RANGE
bool "Enable support for tlbi range feature"
default y
depends on AS_HAS_ARMV8_4
help
ARMv8.4-TLBI provides TLBI invalidation instruction that apply to a
range of input addresses.
The feature introduces new assembly instructions, and they were
support when binutils >= 2.30.
endmenu
menu "ARMv8.5 architectural features"
config AS_HAS_ARMV8_5
def_bool $(cc-option,-Wa$(comma)-march=armv8.5-a)
config ARM64_BTI
bool "Branch Target Identification support"
default y
help
Branch Target Identification (part of the ARMv8.5 Extensions)
provides a mechanism to limit the set of locations to which computed
branch instructions such as BR or BLR can jump.
To make use of BTI on CPUs that support it, say Y.
BTI is intended to provide complementary protection to other control
flow integrity protection mechanisms, such as the Pointer
authentication mechanism provided as part of the ARMv8.3 Extensions.
For this reason, it does not make sense to enable this option without
also enabling support for pointer authentication. Thus, when
enabling this option you should also select ARM64_PTR_AUTH=y.
Userspace binaries must also be specifically compiled to make use of
this mechanism. If you say N here or the hardware does not support
BTI, such binaries can still run, but you get no additional
enforcement of branch destinations.
config ARM64_BTI_KERNEL
bool "Use Branch Target Identification for kernel"
default y
depends on ARM64_BTI
depends on ARM64_PTR_AUTH
depends on CC_HAS_BRANCH_PROT_PAC_RET_BTI
# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94697
depends on !CC_IS_GCC || GCC_VERSION >= 100100
depends on !(CC_IS_CLANG && GCOV_KERNEL)
depends on (!FUNCTION_GRAPH_TRACER || DYNAMIC_FTRACE_WITH_REGS)
help
Build the kernel with Branch Target Identification annotations
and enable enforcement of this for kernel code. When this option
is enabled and the system supports BTI all kernel code including
modular code must have BTI enabled.
config CC_HAS_BRANCH_PROT_PAC_RET_BTI
# GCC 9 or later, clang 8 or later
def_bool $(cc-option,-mbranch-protection=pac-ret+leaf+bti)
config ARM64_E0PD
bool "Enable support for E0PD"
default y
help
E0PD (part of the ARMv8.5 extensions) allows us to ensure
that EL0 accesses made via TTBR1 always fault in constant time,
providing similar benefits to KASLR as those provided by KPTI, but
with lower overhead and without disrupting legitimate access to
kernel memory such as SPE.
This option enables E0PD for TTBR1 where available.
config ARCH_RANDOM
bool "Enable support for random number generation"
default y
help
Random number generation (part of the ARMv8.5 Extensions)
provides a high bandwidth, cryptographically secure
hardware random number generator.
config ARM64_AS_HAS_MTE
# Initial support for MTE went in binutils 2.32.0, checked with
# ".arch armv8.5-a+memtag" below. However, this was incomplete
# as a late addition to the final architecture spec (LDGM/STGM)
# is only supported in the newer 2.32.x and 2.33 binutils
# versions, hence the extra "stgm" instruction check below.
def_bool $(as-instr,.arch armv8.5-a+memtag\nstgm xzr$(comma)[x0])
config ARM64_MTE
bool "Memory Tagging Extension support"
default y
depends on ARM64_AS_HAS_MTE && ARM64_TAGGED_ADDR_ABI
depends on AS_HAS_ARMV8_5
depends on AS_HAS_LSE_ATOMICS
2020-12-22 20:01:35 +00:00
# Required for tag checking in the uaccess routines
depends on ARM64_PAN
select ARCH_USES_HIGH_VMA_FLAGS
help
Memory Tagging (part of the ARMv8.5 Extensions) provides
architectural support for run-time, always-on detection of
various classes of memory error to aid with software debugging
to eliminate vulnerabilities arising from memory-unsafe
languages.
This option enables the support for the Memory Tagging
Extension at EL0 (i.e. for userspace).
Selecting this option allows the feature to be detected at
runtime. Any secondary CPU not implementing this feature will
not be allowed a late bring-up.
Userspace binaries that want to use this feature must
explicitly opt in. The mechanism for the userspace is
described in:
Documentation/arm64/memory-tagging-extension.rst.
endmenu
menu "ARMv8.7 architectural features"
config ARM64_EPAN
bool "Enable support for Enhanced Privileged Access Never (EPAN)"
default y
depends on ARM64_PAN
help
Enhanced Privileged Access Never (EPAN) allows Privileged
Access Never to be used with Execute-only mappings.
The feature is detected at runtime, and will remain disabled
if the cpu does not implement the feature.
endmenu
config ARM64_SVE
bool "ARM Scalable Vector Extension support"
default y
help
The Scalable Vector Extension (SVE) is an extension to the AArch64
execution state which complements and extends the SIMD functionality
of the base architecture to support much larger vectors and to enable
additional vectorisation opportunities.
To enable use of this extension on CPUs that implement it, say Y.
On CPUs that support the SVE2 extensions, this option will enable
those too.
Note that for architectural reasons, firmware _must_ implement SVE
support when running on SVE capable hardware. The required support
is present in:
* version 1.5 and later of the ARM Trusted Firmware
* the AArch64 boot wrapper since commit 5e1261e08abf
("bootwrapper: SVE: Enable SVE for EL2 and below").
For other firmware implementations, consult the firmware documentation
or vendor.
If you need the kernel to boot on SVE-capable hardware with broken
firmware, you may need to say N here until you get your firmware
fixed. Otherwise, you may experience firmware panics or lockups when
booting the kernel. If unsure and you are not observing these
symptoms, you should assume that it is safe to say Y.
config ARM64_MODULE_PLTS
bool "Use PLTs to allow module memory to spill over into vmalloc area"
depends on MODULES
select HAVE_MOD_ARCH_SPECIFIC
help
Allocate PLTs when loading modules so that jumps and calls whose
targets are too far away for their relative offsets to be encoded
in the instructions themselves can be bounced via veneers in the
module's PLT. This allows modules to be allocated in the generic
vmalloc area after the dedicated module memory area has been
exhausted.
When running with address space randomization (KASLR), the module
region itself may be too far away for ordinary relative jumps and
calls, and so in that case, module PLTs are required and cannot be
disabled.
Specific errata workaround(s) might also force module PLTs to be
enabled (ARM64_ERRATUM_843419).
config ARM64_PSEUDO_NMI
bool "Support for NMI-like interrupts"
select ARM_GIC_V3
help
Adds support for mimicking Non-Maskable Interrupts through the use of
GIC interrupt priority. This support requires version 3 or later of
ARM GIC.
This high priority configuration for interrupts needs to be
explicitly enabled by setting the kernel parameter
"irqchip.gicv3_pseudo_nmi" to 1.
If unsure, say N
if ARM64_PSEUDO_NMI
config ARM64_DEBUG_PRIORITY_MASKING
bool "Debug interrupt priority masking"
help
This adds runtime checks to functions enabling/disabling
interrupts when using priority masking. The additional checks verify
the validity of ICC_PMR_EL1 when calling concerned functions.
If unsure, say N
endif
config RELOCATABLE
bool "Build a relocatable kernel image" if EXPERT
select ARCH_HAS_RELR
default y
help
This builds the kernel as a Position Independent Executable (PIE),
which retains all relocation metadata required to relocate the
kernel binary at runtime to a different virtual address than the
address it was linked at.
Since AArch64 uses the RELA relocation format, this requires a
relocation pass at runtime even if the kernel is loaded at the
same address it was linked at.
arm64: add support for kernel ASLR This adds support for KASLR is implemented, based on entropy provided by the bootloader in the /chosen/kaslr-seed DT property. Depending on the size of the address space (VA_BITS) and the page size, the entropy in the virtual displacement is up to 13 bits (16k/2 levels) and up to 25 bits (all 4 levels), with the sidenote that displacements that result in the kernel image straddling a 1GB/32MB/512MB alignment boundary (for 4KB/16KB/64KB granule kernels, respectively) are not allowed, and will be rounded up to an acceptable value. If CONFIG_RANDOMIZE_MODULE_REGION_FULL is enabled, the module region is randomized independently from the core kernel. This makes it less likely that the location of core kernel data structures can be determined by an adversary, but causes all function calls from modules into the core kernel to be resolved via entries in the module PLTs. If CONFIG_RANDOMIZE_MODULE_REGION_FULL is not enabled, the module region is randomized by choosing a page aligned 128 MB region inside the interval [_etext - 128 MB, _stext + 128 MB). This gives between 10 and 14 bits of entropy (depending on page size), independently of the kernel randomization, but still guarantees that modules are within the range of relative branch and jump instructions (with the caveat that, since the module region is shared with other uses of the vmalloc area, modules may need to be loaded further away if the module region is exhausted) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-01-26 13:12:01 +00:00
config RANDOMIZE_BASE
bool "Randomize the address of the kernel image"
select ARM64_MODULE_PLTS if MODULES
arm64: add support for kernel ASLR This adds support for KASLR is implemented, based on entropy provided by the bootloader in the /chosen/kaslr-seed DT property. Depending on the size of the address space (VA_BITS) and the page size, the entropy in the virtual displacement is up to 13 bits (16k/2 levels) and up to 25 bits (all 4 levels), with the sidenote that displacements that result in the kernel image straddling a 1GB/32MB/512MB alignment boundary (for 4KB/16KB/64KB granule kernels, respectively) are not allowed, and will be rounded up to an acceptable value. If CONFIG_RANDOMIZE_MODULE_REGION_FULL is enabled, the module region is randomized independently from the core kernel. This makes it less likely that the location of core kernel data structures can be determined by an adversary, but causes all function calls from modules into the core kernel to be resolved via entries in the module PLTs. If CONFIG_RANDOMIZE_MODULE_REGION_FULL is not enabled, the module region is randomized by choosing a page aligned 128 MB region inside the interval [_etext - 128 MB, _stext + 128 MB). This gives between 10 and 14 bits of entropy (depending on page size), independently of the kernel randomization, but still guarantees that modules are within the range of relative branch and jump instructions (with the caveat that, since the module region is shared with other uses of the vmalloc area, modules may need to be loaded further away if the module region is exhausted) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-01-26 13:12:01 +00:00
select RELOCATABLE
help
Randomizes the virtual address at which the kernel image is
loaded, as a security feature that deters exploit attempts
relying on knowledge of the location of kernel internals.
It is the bootloader's job to provide entropy, by passing a
random u64 value in /chosen/kaslr-seed at kernel entry.
When booting via the UEFI stub, it will invoke the firmware's
EFI_RNG_PROTOCOL implementation (if available) to supply entropy
to the kernel proper. In addition, it will randomise the physical
location of the kernel Image as well.
arm64: add support for kernel ASLR This adds support for KASLR is implemented, based on entropy provided by the bootloader in the /chosen/kaslr-seed DT property. Depending on the size of the address space (VA_BITS) and the page size, the entropy in the virtual displacement is up to 13 bits (16k/2 levels) and up to 25 bits (all 4 levels), with the sidenote that displacements that result in the kernel image straddling a 1GB/32MB/512MB alignment boundary (for 4KB/16KB/64KB granule kernels, respectively) are not allowed, and will be rounded up to an acceptable value. If CONFIG_RANDOMIZE_MODULE_REGION_FULL is enabled, the module region is randomized independently from the core kernel. This makes it less likely that the location of core kernel data structures can be determined by an adversary, but causes all function calls from modules into the core kernel to be resolved via entries in the module PLTs. If CONFIG_RANDOMIZE_MODULE_REGION_FULL is not enabled, the module region is randomized by choosing a page aligned 128 MB region inside the interval [_etext - 128 MB, _stext + 128 MB). This gives between 10 and 14 bits of entropy (depending on page size), independently of the kernel randomization, but still guarantees that modules are within the range of relative branch and jump instructions (with the caveat that, since the module region is shared with other uses of the vmalloc area, modules may need to be loaded further away if the module region is exhausted) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-01-26 13:12:01 +00:00
If unsure, say N.
config RANDOMIZE_MODULE_REGION_FULL
arm64/kernel: kaslr: reduce module randomization range to 4 GB We currently have to rely on the GCC large code model for KASLR for two distinct but related reasons: - if we enable full randomization, modules will be loaded very far away from the core kernel, where they are out of range for ADRP instructions, - even without full randomization, the fact that the 128 MB module region is now no longer fully reserved for kernel modules means that there is a very low likelihood that the normal bottom-up allocation of other vmalloc regions may collide, and use up the range for other things. Large model code is suboptimal, given that each symbol reference involves a literal load that goes through the D-cache, reducing cache utilization. But more importantly, literals are not instructions but part of .text nonetheless, and hence mapped with executable permissions. So let's get rid of our dependency on the large model for KASLR, by: - reducing the full randomization range to 4 GB, thereby ensuring that ADRP references between modules and the kernel are always in range, - reduce the spillover range to 4 GB as well, so that we fallback to a region that is still guaranteed to be in range - move the randomization window of the core kernel to the middle of the VMALLOC space Note that KASAN always uses the module region outside of the vmalloc space, so keep the kernel close to that if KASAN is enabled. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 17:15:32 +00:00
bool "Randomize the module region over a 4 GB range"
depends on RANDOMIZE_BASE
arm64: add support for kernel ASLR This adds support for KASLR is implemented, based on entropy provided by the bootloader in the /chosen/kaslr-seed DT property. Depending on the size of the address space (VA_BITS) and the page size, the entropy in the virtual displacement is up to 13 bits (16k/2 levels) and up to 25 bits (all 4 levels), with the sidenote that displacements that result in the kernel image straddling a 1GB/32MB/512MB alignment boundary (for 4KB/16KB/64KB granule kernels, respectively) are not allowed, and will be rounded up to an acceptable value. If CONFIG_RANDOMIZE_MODULE_REGION_FULL is enabled, the module region is randomized independently from the core kernel. This makes it less likely that the location of core kernel data structures can be determined by an adversary, but causes all function calls from modules into the core kernel to be resolved via entries in the module PLTs. If CONFIG_RANDOMIZE_MODULE_REGION_FULL is not enabled, the module region is randomized by choosing a page aligned 128 MB region inside the interval [_etext - 128 MB, _stext + 128 MB). This gives between 10 and 14 bits of entropy (depending on page size), independently of the kernel randomization, but still guarantees that modules are within the range of relative branch and jump instructions (with the caveat that, since the module region is shared with other uses of the vmalloc area, modules may need to be loaded further away if the module region is exhausted) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-01-26 13:12:01 +00:00
default y
help
arm64/kernel: kaslr: reduce module randomization range to 4 GB We currently have to rely on the GCC large code model for KASLR for two distinct but related reasons: - if we enable full randomization, modules will be loaded very far away from the core kernel, where they are out of range for ADRP instructions, - even without full randomization, the fact that the 128 MB module region is now no longer fully reserved for kernel modules means that there is a very low likelihood that the normal bottom-up allocation of other vmalloc regions may collide, and use up the range for other things. Large model code is suboptimal, given that each symbol reference involves a literal load that goes through the D-cache, reducing cache utilization. But more importantly, literals are not instructions but part of .text nonetheless, and hence mapped with executable permissions. So let's get rid of our dependency on the large model for KASLR, by: - reducing the full randomization range to 4 GB, thereby ensuring that ADRP references between modules and the kernel are always in range, - reduce the spillover range to 4 GB as well, so that we fallback to a region that is still guaranteed to be in range - move the randomization window of the core kernel to the middle of the VMALLOC space Note that KASAN always uses the module region outside of the vmalloc space, so keep the kernel close to that if KASAN is enabled. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-03-06 17:15:32 +00:00
Randomizes the location of the module region inside a 4 GB window
covering the core kernel. This way, it is less likely for modules
arm64: add support for kernel ASLR This adds support for KASLR is implemented, based on entropy provided by the bootloader in the /chosen/kaslr-seed DT property. Depending on the size of the address space (VA_BITS) and the page size, the entropy in the virtual displacement is up to 13 bits (16k/2 levels) and up to 25 bits (all 4 levels), with the sidenote that displacements that result in the kernel image straddling a 1GB/32MB/512MB alignment boundary (for 4KB/16KB/64KB granule kernels, respectively) are not allowed, and will be rounded up to an acceptable value. If CONFIG_RANDOMIZE_MODULE_REGION_FULL is enabled, the module region is randomized independently from the core kernel. This makes it less likely that the location of core kernel data structures can be determined by an adversary, but causes all function calls from modules into the core kernel to be resolved via entries in the module PLTs. If CONFIG_RANDOMIZE_MODULE_REGION_FULL is not enabled, the module region is randomized by choosing a page aligned 128 MB region inside the interval [_etext - 128 MB, _stext + 128 MB). This gives between 10 and 14 bits of entropy (depending on page size), independently of the kernel randomization, but still guarantees that modules are within the range of relative branch and jump instructions (with the caveat that, since the module region is shared with other uses of the vmalloc area, modules may need to be loaded further away if the module region is exhausted) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-01-26 13:12:01 +00:00
to leak information about the location of core kernel data structures
but it does imply that function calls between modules and the core
kernel will need to be resolved via veneers in the module PLT.
When this option is not set, the module region will be randomized over
a limited range that contains the [_stext, _etext] interval of the
core kernel, so branch relocations are always in range.
config CC_HAVE_STACKPROTECTOR_SYSREG
def_bool $(cc-option,-mstack-protector-guard=sysreg -mstack-protector-guard-reg=sp_el0 -mstack-protector-guard-offset=0)
config STACKPROTECTOR_PER_TASK
def_bool y
depends on STACKPROTECTOR && CC_HAVE_STACKPROTECTOR_SYSREG
endmenu
menu "Boot options"
arm64: kernel: implement ACPI parking protocol The SBBR and ACPI specifications allow ACPI based systems that do not implement PSCI (eg systems with no EL3) to boot through the ACPI parking protocol specification[1]. This patch implements the ACPI parking protocol CPU operations, and adds code that eases parsing the parking protocol data structures to the ARM64 SMP initializion carried out at the same time as cpus enumeration. To wake-up the CPUs from the parked state, this patch implements a wakeup IPI for ARM64 (ie arch_send_wakeup_ipi_mask()) that mirrors the ARM one, so that a specific IPI is sent for wake-up purpose in order to distinguish it from other IPI sources. Given the current ACPI MADT parsing API, the patch implements a glue layer that helps passing MADT GICC data structure from SMP initialization code to the parking protocol implementation somewhat overriding the CPU operations interfaces. This to avoid creating a completely trasparent DT/ACPI CPU operations layer that would require creating opaque structure handling for CPUs data (DT represents CPU through DT nodes, ACPI through static MADT table entries), which seems overkill given that ACPI on ARM64 mandates only two booting protocols (PSCI and parking protocol), so there is no need for further protocol additions. Based on the original work by Mark Salter <msalter@redhat.com> [1] https://acpica.org/sites/acpica/files/MP%20Startup%20for%20ARM%20platforms.docx Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Tested-by: Loc Ho <lho@apm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Hanjun Guo <hanjun.guo@linaro.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mark Salter <msalter@redhat.com> Cc: Al Stone <ahs3@redhat.com> [catalin.marinas@arm.com: Added WARN_ONCE(!acpi_parking_protocol_valid() on the IPI] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-01-26 11:10:38 +00:00
config ARM64_ACPI_PARKING_PROTOCOL
bool "Enable support for the ARM64 ACPI parking protocol"
depends on ACPI
help
Enable support for the ARM64 ACPI parking protocol. If disabled
the kernel will not allow booting through the ARM64 ACPI parking
protocol even if the corresponding data is present in the ACPI
MADT table.
config CMDLINE
string "Default kernel command string"
default ""
help
Provide a set of default command-line options at build time by
entering them here. As a minimum, you should specify the the
root device (e.g. root=/dev/nfs).
choice
prompt "Kernel command line type" if CMDLINE != ""
default CMDLINE_FROM_BOOTLOADER
help
Choose how the kernel will handle the provided default kernel
command line string.
config CMDLINE_FROM_BOOTLOADER
bool "Use bootloader kernel arguments if available"
help
Uses the command-line options passed by the boot loader. If
the boot loader doesn't provide any, the default kernel command
string provided in CMDLINE will be used.
config CMDLINE_FORCE
bool "Always use the default kernel command string"
help
Always use the default kernel command string, even if the boot
loader passes other arguments to the kernel.
This is useful if you cannot or don't want to change the
command-line options your boot loader passes to the kernel.
endchoice
config EFI_STUB
bool
config EFI
bool "UEFI runtime support"
depends on OF && !CPU_BIG_ENDIAN
depends on KERNEL_MODE_NEON
select ARCH_SUPPORTS_ACPI
select LIBFDT
select UCS2_STRING
select EFI_PARAMS_FROM_FDT
select EFI_RUNTIME_WRAPPERS
select EFI_STUB
select EFI_GENERIC_STUB
imply IMA_SECURE_AND_OR_TRUSTED_BOOT
default y
help
This option provides support for runtime services provided
by UEFI firmware (such as non-volatile variables, realtime
clock, and platform reset). A UEFI stub is also provided to
allow the kernel to be booted as an EFI application. This
is only useful on systems that have UEFI firmware.
config DMI
bool "Enable support for SMBIOS (DMI) tables"
depends on EFI
default y
help
This enables SMBIOS/DMI feature for systems.
This option is only useful on systems that have UEFI firmware.
However, even with this option, the resultant kernel should
continue to boot on existing non-UEFI platforms.
endmenu
config SYSVIPC_COMPAT
def_bool y
depends on COMPAT && SYSVIPC
config ARCH_ENABLE_HUGEPAGE_MIGRATION
def_bool y
depends on HUGETLB_PAGE && MIGRATION
config ARCH_ENABLE_THP_MIGRATION
def_bool y
depends on TRANSPARENT_HUGEPAGE
menu "Power management options"
source "kernel/power/Kconfig"
config ARCH_HIBERNATION_POSSIBLE
def_bool y
depends on CPU_PM
config ARCH_HIBERNATION_HEADER
def_bool y
depends on HIBERNATION
config ARCH_SUSPEND_POSSIBLE
def_bool y
endmenu
menu "CPU Power Management"
source "drivers/cpuidle/Kconfig"
source "drivers/cpufreq/Kconfig"
endmenu
source "drivers/firmware/Kconfig"
source "drivers/acpi/Kconfig"
source "arch/arm64/kvm/Kconfig"
if CRYPTO
source "arch/arm64/crypto/Kconfig"
endif