diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 0136cdf7e55d..9b3c086d4266 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2279,8 +2279,7 @@ state is kept private from the host. Not valid if the kernel is running in EL2. - Defaults to VHE/nVHE based on hardware support and - the value of CONFIG_ARM64_VHE. + Defaults to VHE/nVHE based on hardware support. kvm-arm.vgic_v3_group0_trap= [KVM,ARM] Trap guest accesses to GICv3 group-0 diff --git a/Documentation/admin-guide/perf/hisi-pmu.rst b/Documentation/admin-guide/perf/hisi-pmu.rst index 404a5c3d9d00..546979360513 100644 --- a/Documentation/admin-guide/perf/hisi-pmu.rst +++ b/Documentation/admin-guide/perf/hisi-pmu.rst @@ -53,6 +53,60 @@ Example usage of perf:: $# perf stat -a -e hisi_sccl3_l3c0/rd_hit_cpipe/ sleep 5 $# perf stat -a -e hisi_sccl3_l3c0/config=0x02/ sleep 5 +For HiSilicon uncore PMU v2 whose identifier is 0x30, the topology is the same +as PMU v1, but some new functions are added to the hardware. + +(a) L3C PMU supports filtering by core/thread within the cluster, which can be +specified as a bitmap:: + + $# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_core=0x3/ sleep 5 + +This will only count the operations from core/thread 0 and 1 in this cluster. + +(b) Tracetag allows the user to choose to count only read, write or atomic +operations via the tt_req parameter in perf. The default value counts all +operations. tt_req is 3 bits: 3'b100 represents read operations, 3'b101 +represents write operations, 3'b110 represents atomic store operations and +3'b111 represents atomic non-store operations; other values are reserved:: + + $# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_req=0x4/ sleep 5 + +This will only count the read operations in this cluster. + +(c) Datasrc allows the user to check where the data comes from. It is 5 bits. 
+Some important codes are as follows: +5'b00001: comes from L3C in this die; +5'b01000: comes from L3C in the cross-die; +5'b01001: comes from L3C which is in another socket; +5'b01110: comes from the local DDR; +5'b01111: comes from the cross-die DDR; +5'b10000: comes from cross-socket DDR; +etc. It is mainly helpful for finding which data source is nearest to the CPU +cores. If datasrc_cfg is used in a multi-chip system, datasrc_skt shall be +configured in the perf command:: + + $# perf stat -a -e hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xE/, + hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xF/ sleep 5 + +(d) Some HiSilicon SoCs encapsulate multiple CPU and IO dies. Each CPU die +contains several Compute Clusters (CCLs). The I/O dies are called Super I/O +clusters (SICL) containing multiple I/O clusters (ICLs). Each CCL/ICL in the +SoC has a unique ID. Each ID is 11 bits, including a 6-bit SCCL-ID and a 5-bit +CCL/ICL-ID. For I/O die, the ICL-ID is followed by: +5'b00000: I/O_MGMT_ICL; +5'b00001: Network_ICL; +5'b00011: HAC_ICL; +5'b10000: PCIe_ICL; + +Users can configure IDs to count data coming from a specific CCL/ICL by setting +srcid_cmd & srcid_msk, and data destined for a specific CCL/ICL by setting +tgtid_cmd & tgtid_msk. A set bit in srcid_msk/tgtid_msk means the PMU will not +check the bit when matching against the srcid_cmd/tgtid_cmd. + +If all of these options are disabled, the PMU works with the default value, which +does not distinguish the filter condition or ID information, and returns +the total counter values in the PMU counters. + The current driver does not support sampling. So "perf record" is unsupported. Also attach to a task is unsupported as the events are all uncore. 
diff --git a/Documentation/arm64/booting.rst b/Documentation/arm64/booting.rst index 7552dbc1cc54..4fcc00add117 100644 --- a/Documentation/arm64/booting.rst +++ b/Documentation/arm64/booting.rst @@ -202,9 +202,10 @@ Before jumping into the kernel, the following conditions must be met: - System registers - All writable architected system registers at the exception level where - the kernel image will be entered must be initialised by software at a - higher exception level to prevent execution in an UNKNOWN state. + All writable architected system registers at or below the exception + level where the kernel image will be entered must be initialised by + software at a higher exception level to prevent execution in an UNKNOWN + state. - SCR_EL3.FIQ must have the same value across all CPUs the kernel is executing on. @@ -270,6 +271,12 @@ Before jumping into the kernel, the following conditions must be met: having 0b1 set for the corresponding bit for each of the auxiliary counters present. + For CPUs with the Fine Grained Traps (FEAT_FGT) extension present: + + - If EL3 is present and the kernel is entered at EL2: + + - SCR_EL3.FGTEn (bit 27) must be initialised to 0b1. + The requirements described above for CPU mode, caches, MMUs, architected timers, coherency and system registers apply to all CPUs. All CPUs must enter the kernel in the same exception level. diff --git a/Documentation/arm64/pointer-authentication.rst b/Documentation/arm64/pointer-authentication.rst index 30b2ab06526b..f127666ea3a8 100644 --- a/Documentation/arm64/pointer-authentication.rst +++ b/Documentation/arm64/pointer-authentication.rst @@ -107,3 +107,37 @@ filter out the Pointer Authentication system key registers from KVM_GET/SET_REG_* ioctls and mask those features from cpufeature ID register. Any attempt to use the Pointer Authentication instructions will result in an UNDEFINED exception being injected into the guest. 
+ + +Enabling and disabling keys +--------------------------- + +The prctl PR_PAC_SET_ENABLED_KEYS allows the user program to control which +PAC keys are enabled in a particular task. It takes two arguments, the +first being a bitmask of PR_PAC_APIAKEY, PR_PAC_APIBKEY, PR_PAC_APDAKEY +and PR_PAC_APDBKEY specifying which keys shall be affected by this prctl, +and the second being a bitmask of the same bits specifying whether the key +should be enabled or disabled. For example:: + + prctl(PR_PAC_SET_ENABLED_KEYS, + PR_PAC_APIAKEY | PR_PAC_APIBKEY | PR_PAC_APDAKEY | PR_PAC_APDBKEY, + PR_PAC_APIBKEY, 0, 0); + +disables all keys except the IB key. + +The main reason why this is useful is to enable a userspace ABI that uses PAC +instructions to sign and authenticate function pointers and other pointers +exposed outside of the function, while still allowing binaries conforming to +the ABI to interoperate with legacy binaries that do not sign or authenticate +pointers. + +The idea is that a dynamic loader or early startup code would issue this +prctl very early after establishing that a process may load legacy binaries, +but before executing any PAC instructions. + +For compatibility with previous kernel versions, processes start up with IA, +IB, DA and DB enabled, and are reset to this state on exec(). Processes created +via fork() and clone() inherit the key enabled state from the calling process. + +It is recommended to avoid disabling the IA key, as this has higher performance +overhead than disabling any of the other keys. diff --git a/Documentation/arm64/tagged-address-abi.rst b/Documentation/arm64/tagged-address-abi.rst index 4a9d9c794ee5..cbc4d4500241 100644 --- a/Documentation/arm64/tagged-address-abi.rst +++ b/Documentation/arm64/tagged-address-abi.rst @@ -40,7 +40,7 @@ space obtained in one of the following ways: during creation and with the same restrictions as for ``mmap()`` above (e.g. data, bss, stack). 
-The AArch64 Tagged Address ABI has two stages of relaxation depending +The AArch64 Tagged Address ABI has two stages of relaxation depending on how the user addresses are used by the kernel: 1. User addresses not accessed by the kernel but used for address space diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst index ddf4239a5890..6f6ab3ed7b79 100644 --- a/Documentation/dev-tools/kasan.rst +++ b/Documentation/dev-tools/kasan.rst @@ -161,6 +161,15 @@ particular KASAN features. - ``kasan=off`` or ``=on`` controls whether KASAN is enabled (default: ``on``). +- ``kasan.mode=sync`` or ``=async`` controls whether KASAN is configured in + synchronous or asynchronous mode of execution (default: ``sync``). + Synchronous mode: a bad access is detected immediately when a tag + check fault occurs. + Asynchronous mode: a bad access detection is delayed. When a tag check + fault occurs, the information is stored in hardware (in the TFSR_EL1 + register for arm64). The kernel periodically checks the hardware and + only reports tag faults during these checks. + - ``kasan.stacktrace=off`` or ``=on`` disables or enables alloc and free stack traces collection (default: ``on``). 
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index f1a032ed2274..406b42c05ee1 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -108,9 +108,9 @@ config ARM64 select GENERIC_CPU_AUTOPROBE select GENERIC_CPU_VULNERABILITIES select GENERIC_EARLY_IOREMAP + select GENERIC_FIND_FIRST_BIT select GENERIC_IDLE_POLL_SETUP select GENERIC_IRQ_IPI - select GENERIC_IRQ_MULTI_HANDLER select GENERIC_IRQ_PROBE select GENERIC_IRQ_SHOW select GENERIC_IRQ_SHOW_LEVEL @@ -138,6 +138,7 @@ config ARM64 select HAVE_ARCH_JUMP_LABEL select HAVE_ARCH_JUMP_LABEL_RELATIVE select HAVE_ARCH_KASAN if !(ARM64_16K_PAGES && ARM64_VA_BITS_48) + select HAVE_ARCH_KASAN_VMALLOC if HAVE_ARCH_KASAN select HAVE_ARCH_KASAN_SW_TAGS if HAVE_ARCH_KASAN select HAVE_ARCH_KASAN_HW_TAGS if (HAVE_ARCH_KASAN && ARM64_MTE) select HAVE_ARCH_KFENCE @@ -195,6 +196,7 @@ config ARM64 select IOMMU_DMA if IOMMU_SUPPORT select IRQ_DOMAIN select IRQ_FORCED_THREADING + select KASAN_VMALLOC if KASAN_GENERIC select MODULES_USE_ELF_RELA select NEED_DMA_MAP_STATE select NEED_SG_DMA_LENGTH @@ -1069,6 +1071,9 @@ config SYS_SUPPORTS_HUGETLBFS config ARCH_HAS_CACHE_LINE_SIZE def_bool y +config ARCH_HAS_FILTER_PGPROT + def_bool y + config ARCH_ENABLE_SPLIT_PMD_PTLOCK def_bool y if PGTABLE_LEVELS > 2 @@ -1430,19 +1435,6 @@ config ARM64_USE_LSE_ATOMICS built with binutils >= 2.25 in order for the new instructions to be used. -config ARM64_VHE - bool "Enable support for Virtualization Host Extensions (VHE)" - default y - help - Virtualization Host Extensions (VHE) allow the kernel to run - directly at EL2 (instead of EL1) on processors that support - it. This leads to better performance for KVM, as they reduce - the cost of the world switch. - - Selecting this option allows the VHE feature to be detected - at runtime, and does not affect processors that do not - implement this feature. 
- endmenu menu "ARMv8.2 architectural features" @@ -1696,10 +1688,23 @@ config ARM64_MTE endmenu +menu "ARMv8.7 architectural features" + +config ARM64_EPAN + bool "Enable support for Enhanced Privileged Access Never (EPAN)" + default y + depends on ARM64_PAN + help + Enhanced Privileged Access Never (EPAN) allows Privileged + Access Never to be used with Execute-only mappings. + + The feature is detected at runtime, and will remain disabled + if the cpu does not implement the feature. +endmenu + config ARM64_SVE bool "ARM Scalable Vector Extension support" default y - depends on !KVM || ARM64_VHE help The Scalable Vector Extension (SVE) is an extension to the AArch64 execution state which complements and extends the SIMD functionality @@ -1728,12 +1733,6 @@ config ARM64_SVE booting the kernel. If unsure and you are not observing these symptoms, you should assume that it is safe to say Y. - CPUs that support SVE are architecturally required to support the - Virtualization Host Extensions (VHE), so the kernel makes no - provision for supporting SVE alongside KVM without VHE enabled. - Thus, you will need to enable CONFIG_ARM64_VHE if you want to support - KVM in the same kernel image. 
- config ARM64_MODULE_PLTS bool "Use PLTs to allow module memory to spill over into vmalloc area" depends on MODULES diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig index d612f633b771..8793a9cb9d4b 100644 --- a/arch/arm64/configs/defconfig +++ b/arch/arm64/configs/defconfig @@ -1156,6 +1156,7 @@ CONFIG_CRYPTO_DEV_HISI_TRNG=m CONFIG_CMA_SIZE_MBYTES=32 CONFIG_PRINTK_TIME=y CONFIG_DEBUG_INFO=y +CONFIG_DEBUG_INFO_REDUCED=y CONFIG_MAGIC_SYSRQ=y CONFIG_DEBUG_FS=y CONFIG_DEBUG_KERNEL=y diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S index 247011356d11..b495de22bb38 100644 --- a/arch/arm64/crypto/aes-modes.S +++ b/arch/arm64/crypto/aes-modes.S @@ -701,7 +701,7 @@ AES_FUNC_START(aes_mac_update) cbz w5, .Lmacout encrypt_block v0, w2, x1, x7, w8 st1 {v0.16b}, [x4] /* return dg */ - cond_yield .Lmacout, x7 + cond_yield .Lmacout, x7, x8 b .Lmacloop4x .Lmac1x: add w3, w3, #4 diff --git a/arch/arm64/crypto/sha1-ce-core.S b/arch/arm64/crypto/sha1-ce-core.S index 8c02bbc2684e..889ca0f8972b 100644 --- a/arch/arm64/crypto/sha1-ce-core.S +++ b/arch/arm64/crypto/sha1-ce-core.S @@ -121,7 +121,7 @@ CPU_LE( rev32 v11.16b, v11.16b ) add dgav.4s, dgav.4s, dg0v.4s cbz w2, 2f - cond_yield 3f, x5 + cond_yield 3f, x5, x6 b 0b /* diff --git a/arch/arm64/crypto/sha2-ce-core.S b/arch/arm64/crypto/sha2-ce-core.S index 6cdea7d56059..491179922f49 100644 --- a/arch/arm64/crypto/sha2-ce-core.S +++ b/arch/arm64/crypto/sha2-ce-core.S @@ -129,7 +129,7 @@ CPU_LE( rev32 v19.16b, v19.16b ) /* handled all input blocks? 
*/ cbz w2, 2f - cond_yield 3f, x5 + cond_yield 3f, x5, x6 b 0b /* diff --git a/arch/arm64/crypto/sha3-ce-core.S b/arch/arm64/crypto/sha3-ce-core.S index 6f5208414fe3..9c77313f5a60 100644 --- a/arch/arm64/crypto/sha3-ce-core.S +++ b/arch/arm64/crypto/sha3-ce-core.S @@ -184,11 +184,11 @@ SYM_FUNC_START(sha3_ce_transform) eor v0.16b, v0.16b, v31.16b cbnz w8, 3b - cond_yield 3f, x8 + cond_yield 4f, x8, x9 cbnz w2, 0b /* save state */ -3: st1 { v0.1d- v3.1d}, [x0], #32 +4: st1 { v0.1d- v3.1d}, [x0], #32 st1 { v4.1d- v7.1d}, [x0], #32 st1 { v8.1d-v11.1d}, [x0], #32 st1 {v12.1d-v15.1d}, [x0], #32 diff --git a/arch/arm64/crypto/sha512-ce-core.S b/arch/arm64/crypto/sha512-ce-core.S index d6e7f6c95fa6..b6a3a36e15f5 100644 --- a/arch/arm64/crypto/sha512-ce-core.S +++ b/arch/arm64/crypto/sha512-ce-core.S @@ -195,7 +195,7 @@ CPU_LE( rev64 v19.16b, v19.16b ) add v10.2d, v10.2d, v2.2d add v11.2d, v11.2d, v3.2d - cond_yield 3f, x4 + cond_yield 3f, x4, x5 /* handled all input blocks? */ cbnz w2, 0b diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h index 880b9054d75c..934b9be582d2 100644 --- a/arch/arm64/include/asm/arch_gicv3.h +++ b/arch/arm64/include/asm/arch_gicv3.h @@ -173,7 +173,7 @@ static inline void gic_pmr_mask_irqs(void) static inline void gic_arch_enable_irqs(void) { - asm volatile ("msr daifclr, #2" : : : "memory"); + asm volatile ("msr daifclr, #3" : : : "memory"); } #endif /* __ASSEMBLY__ */ diff --git a/arch/arm64/include/asm/arch_timer.h b/arch/arm64/include/asm/arch_timer.h index 9f0ec21d6327..88d20f04c64a 100644 --- a/arch/arm64/include/asm/arch_timer.h +++ b/arch/arm64/include/asm/arch_timer.h @@ -165,25 +165,6 @@ static inline void arch_timer_set_cntkctl(u32 cntkctl) isb(); } -/* - * Ensure that reads of the counter are treated the same as memory reads - * for the purposes of ordering by subsequent memory barriers. 
- * - * This insanity brought to you by speculative system register reads, - * out-of-order memory accesses, sequence locks and Thomas Gleixner. - * - * http://lists.infradead.org/pipermail/linux-arm-kernel/2019-February/631195.html - */ -#define arch_counter_enforce_ordering(val) do { \ - u64 tmp, _val = (val); \ - \ - asm volatile( \ - " eor %0, %1, %1\n" \ - " add %0, sp, %0\n" \ - " ldr xzr, [%0]" \ - : "=r" (tmp) : "r" (_val)); \ -} while (0) - static __always_inline u64 __arch_counter_get_cntpct_stable(void) { u64 cnt; @@ -224,8 +205,6 @@ static __always_inline u64 __arch_counter_get_cntvct(void) return cnt; } -#undef arch_counter_enforce_ordering - static inline int arch_timer_arch_init(void) { return 0; diff --git a/arch/arm64/include/asm/asm_pointer_auth.h b/arch/arm64/include/asm/asm_pointer_auth.h index 52dead2a8640..8ca2dc0661ee 100644 --- a/arch/arm64/include/asm/asm_pointer_auth.h +++ b/arch/arm64/include/asm/asm_pointer_auth.h @@ -13,30 +13,12 @@ * so use the base value of ldp as thread.keys_user and offset as * thread.keys_user.ap*. 
 */ - .macro ptrauth_keys_install_user tsk, tmp1, tmp2, tmp3 + .macro __ptrauth_keys_install_user tsk, tmp1, tmp2, tmp3 mov \tmp1, #THREAD_KEYS_USER add \tmp1, \tsk, \tmp1 -alternative_if_not ARM64_HAS_ADDRESS_AUTH - b .Laddr_auth_skip_\@ -alternative_else_nop_endif ldp \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APIA] msr_s SYS_APIAKEYLO_EL1, \tmp2 msr_s SYS_APIAKEYHI_EL1, \tmp3 - ldp \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APIB] - msr_s SYS_APIBKEYLO_EL1, \tmp2 - msr_s SYS_APIBKEYHI_EL1, \tmp3 - ldp \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APDA] - msr_s SYS_APDAKEYLO_EL1, \tmp2 - msr_s SYS_APDAKEYHI_EL1, \tmp3 - ldp \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APDB] - msr_s SYS_APDBKEYLO_EL1, \tmp2 - msr_s SYS_APDBKEYHI_EL1, \tmp3 -.Laddr_auth_skip_\@: -alternative_if ARM64_HAS_GENERIC_AUTH - ldp \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APGA] - msr_s SYS_APGAKEYLO_EL1, \tmp2 - msr_s SYS_APGAKEYHI_EL1, \tmp3 -alternative_else_nop_endif .endm .macro __ptrauth_keys_install_kernel_nosync tsk, tmp1, tmp2, tmp3 diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index ca31594d3d6c..ab569b0b45fc 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -15,6 +15,7 @@ #include #include +#include #include #include #include @@ -23,6 +24,14 @@ #include #include + /* + * Provide a wxN alias for each wN register so that we can paste an xN + * reference after a 'w' to obtain the 32-bit version. + */ + .irp n,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30 + wx\n .req w\n + .endr + .macro save_and_disable_daif, flags mrs \flags, daif msr daifset, #0xf @@ -40,9 +49,9 @@ msr daif, \flags .endm - /* IRQ is the lowest priority flag, unconditionally unmask the rest. */ - .macro enable_da_f - msr daifclr, #(8 | 4 | 1) + /* IRQ/FIQ are the lowest priority flags, unconditionally unmask the rest. 
*/ + .macro enable_da + msr daifclr, #(8 | 4) .endm /* @@ -50,7 +59,7 @@ */ .macro save_and_disable_irq, flags mrs \flags, daif - msr daifset, #2 + msr daifset, #3 .endm .macro restore_irq, flags @@ -692,90 +701,33 @@ USER(\label, ic ivau, \tmp2) // invalidate I line PoU isb .endm -/* - * Check whether to yield to another runnable task from kernel mode NEON code - * (which runs with preemption disabled). - * - * if_will_cond_yield_neon - * // pre-yield patchup code - * do_cond_yield_neon - * // post-yield patchup code - * endif_yield_neon