linux

Author	SHA1	Message	Date
Kees Cook	3364c6ce23	arm64: atomics: lse: Dereference matching size When building with -Warray-bounds, the following warning is generated: In file included from ./arch/arm64/include/asm/lse.h:16, from ./arch/arm64/include/asm/cmpxchg.h:14, from ./arch/arm64/include/asm/atomic.h:16, from ./include/linux/atomic.h:7, from ./include/asm-generic/bitops/atomic.h:5, from ./arch/arm64/include/asm/bitops.h:25, from ./include/linux/bitops.h:33, from ./include/linux/kernel.h:22, from kernel/printk/printk.c:22: ./arch/arm64/include/asm/atomic_lse.h:247:9: warning: array subscript 'long unsigned int[0]' is partly outside array bounds of 'atomic_t[1]' [-Warray-bounds] 247 \| asm volatile( \ \| ^~~ ./arch/arm64/include/asm/atomic_lse.h:266:1: note: in expansion of macro '__CMPXCHG_CASE' 266 \| __CMPXCHG_CASE(w, , acq_, 32, a, "memory") \| ^~~~~~~~~~~~~~ kernel/printk/printk.c:3606:17: note: while referencing 'printk_cpulock_owner' 3606 \| static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1); \| ^~~~~~~~~~~~~~~~~~~~ This is due to the compiler seeing an unsigned long * cast against something (atomic_t) that is int sized. Replace the cast with the matching size cast. This results in no change in binary output. Note that __ll_sc__cmpxchg_case_##name##sz already uses the same constraint: [v] "+Q" ((u##sz )ptr Which is why only the LSE form needs updating and not the LL/SC form, so this change is unlikely to be problematic. Cc: Will Deacon <will@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220112202259.3950286-1-keescook@chromium.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-01-20 09:13:48 +00:00
Xiongfeng Wang	440323b6cf	asm-generic: Add missing brackets for io_stop_wc macro After using io_stop_wc(), drivers report the following compile error when compiled on X86. drivers/net/ethernet/hisilicon/hns3/hns3_enet.c: In function ‘hns3_tx_push_bd’: drivers/net/ethernet/hisilicon/hns3/hns3_enet.c:2058:12: error: expected ‘;’ before ‘(’ token io_stop_wc(); ^ This is because I forgot to add the brackets after the io_stop_wc macro. So let's add the missing brackets. Fixes: `d5624bb29f` ("asm-generic: introduce io_stop_wc() and add implementation for ARM64") Reported-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Link: https://lore.kernel.org/r/20220114105857.126300-1-wangxiongfeng2@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-01-20 09:12:04 +00:00
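The compile error described in the commit above can be reproduced in miniature. The following is only a sketch, not the kernel's actual definition: the counter `stop_wc_calls` and the caller `push_descriptor()` are hypothetical names added so that the effect of the macro is observable in userspace.

```c
/* Hypothetical counter standing in for the real barrier. */
static int stop_wc_calls;

/*
 * Broken form (object-like macro), shown for comparison only:
 *
 *   #define io_stop_wc do { stop_wc_calls++; } while (0)
 *
 * A call site written as "io_stop_wc();" would then expand to
 * "do { ... } while (0)();", which the compiler rejects with
 * "expected ';' before '(' token".
 */

/* Fixed form: a function-like macro that takes no arguments. */
#define io_stop_wc() do { stop_wc_calls++; } while (0)

/* Hypothetical caller that uses the macro like a function call. */
static int push_descriptor(void)
{
	io_stop_wc();
	return stop_wc_calls;
}
```

With the function-like form, `io_stop_wc();` expands to a single `do { } while (0);` statement, so it is safe inside unbraced `if`/`else` bodies as well.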
Catalin Marinas	945409a6ef	Merge branches 'for-next/misc', 'for-next/cache-ops-dzp', 'for-next/stacktrace', 'for-next/xor-neon', 'for-next/kasan', 'for-next/armv8_7-fp', 'for-next/atomics', 'for-next/bti', 'for-next/sve', 'for-next/kselftest' and 'for-next/kcsan', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: (32 commits) arm64: perf: Don't register user access sysctl handler multiple times drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check perf/smmuv3: Fix unused variable warning when CONFIG_OF=n arm64: perf: Support new DT compatibles arm64: perf: Simplify registration boilerplate arm64: perf: Support Denver and Carmel PMUs drivers/perf: hisi: Add driver for HiSilicon PCIe PMU docs: perf: Add description for HiSilicon PCIe PMU driver dt-bindings: perf: Add YAML schemas for Marvell CN10K LLC-TAD pmu bindings drivers: perf: Add LLC-TAD perf counter support perf/smmuv3: Synthesize IIDR from CoreSight ID registers perf/smmuv3: Add devicetree support dt-bindings: Add Arm SMMUv3 PMCG binding perf/arm-cmn: Add debugfs topology info perf/arm-cmn: Add CI-700 Support dt-bindings: perf: arm-cmn: Add CI-700 perf/arm-cmn: Support new IP features perf/arm-cmn: Demarcate CMN-600 specifics perf/arm-cmn: Move group validation data off-stack perf/arm-cmn: Optimise DTC counter accesses ... 
* for-next/misc: : Miscellaneous patches arm64: Use correct method to calculate nomap region boundaries arm64: Drop outdated links in comments arm64: errata: Fix exec handling in erratum `1418040` workaround arm64: Unhash early pointer print plus improve comment asm-generic: introduce io_stop_wc() and add implementation for ARM64 arm64: remove __dma__area() aliases docs/arm64: delete a space from tagged-address-abi arm64/fp: Add comments documenting the usage of state restore functions arm64: mm: Use asid feature macro for cheanup arm64: mm: Rename asid2idx() to ctxid2asid() arm64: kexec: reduce calls to page_address() arm64: extable: remove unused ex_handler_t definition arm64: entry: Use SDEI event constants arm64: Simplify checking for populated DT arm64/kvm: Fix bitrotted comment for SVE handling in handle_exit.c for-next/cache-ops-dzp: : Avoid DC instructions when DCZID_EL0.DZP == 1 arm64: mte: DC {GVA,GZVA} shouldn't be used when DCZID_EL0.DZP == 1 arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 * for-next/stacktrace: : Unify the arm64 unwind code arm64: Make some stacktrace functions private arm64: Make dump_backtrace() use arch_stack_walk() arm64: Make profile_pc() use arch_stack_walk() arm64: Make return_address() use arch_stack_walk() arm64: Make __get_wchan() use arch_stack_walk() arm64: Make perf_callchain_kernel() use arch_stack_walk() arm64: Mark __switch_to() as __sched arm64: Add comment for stack_info::kr_cur arch: Make ARCH_STACKWALK independent of STACKTRACE * for-next/xor-neon: : Use SHA3 instructions to speed up XOR arm64/xor: use EOR3 instructions when available * for-next/kasan: : Log potential KASAN shadow aliases arm64: mm: log potential KASAN shadow alias arm64: mm: use die_kernel_fault() in do_mem_abort() * for-next/armv8_7-fp: : Add HWCAPS for ARMv8.7 FEAT_AFP amd FEAT_RPRES arm64: cpufeature: add HWCAP for FEAT_RPRES arm64: add ID_AA64ISAR2_EL1 sys register arm64: cpufeature: add HWCAP for FEAT_AFP * for-next/atomics: : 
arm64 atomics clean-ups and codegen improvements arm64: atomics: lse: define RETURN ops in terms of FETCH ops arm64: atomics: lse: improve constraints for simple ops arm64: atomics: lse: define ANDs in terms of ANDNOTs arm64: atomics lse: define SUBs in terms of ADDs arm64: atomics: format whitespace consistently * for-next/bti: : BTI clean-ups arm64: Ensure that the 'bti' macro is defined where linkage.h is included arm64: Use BTI C directly and unconditionally arm64: Unconditionally override SYM_FUNC macros arm64: Add macro version of the BTI instruction arm64: ftrace: add missing BTIs arm64: kexec: use __pa_symbol(empty_zero_page) arm64: update PAC description for kernel * for-next/sve: : SVE code clean-ups and refactoring in prepararation of Scalable Matrix Extensions arm64/sve: Minor clarification of ABI documentation arm64/sve: Generalise vector length configuration prctl() for SME arm64/sve: Make sysctl interface for SVE reusable by SME * for-next/kselftest: : arm64 kselftest additions kselftest/arm64: Add pidbench for floating point syscall cases kselftest/arm64: Add a test program to exercise the syscall ABI kselftest/arm64: Allow signal tests to trigger from a function kselftest/arm64: Parameterise ptrace vector length information * for-next/kcsan: : Enable KCSAN for arm64 arm64: Enable KCSAN	2022-01-05 18:14:32 +00:00
Huacai Chen	daa149dd8c	arm64: Use correct method to calculate nomap region boundaries Nomap regions are treated as "reserved". When region boundaries are not page aligned, we usually increase the "reserved" regions rather than decrease them. So, we should use memblock_region_reserved_base_pfn()/memblock_region_reserved_end_pfn() instead of memblock_region_memory_base_pfn()/memblock_region_memory_end_pfn() to calculate boundaries. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://lore.kernel.org/r/20211022070646.41923-1-chenhuacai@loongson.cn Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-01-05 14:46:06 +00:00
Kees Cook	89d30b1150	arm64: Drop outdated links in comments As started by commit `05a5f51ca5` ("Documentation: Replace lkml.org links with lore"), an effort was made to replace lkml.org links with lore to better use a single source that's more likely to stay available long-term. However, it seems these links don't offer much value here, so just remove them entirely. Cc: Joe Perches <joe@perches.com> Suggested-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/lkml/20210211100213.GA29813@willie-the-truck/ Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20211215191835.1420010-1-keescook@chromium.org [catalin.marinas@arm.com: removed the arch/arm changes] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-01-05 11:51:01 +00:00
Will Deacon	3da4390bcd	arm64: perf: Don't register user access sysctl handler multiple times Commit `e201260081` ("arm64: perf: Add userspace counter access disable switch") introduced a new 'perf_user_access' sysctl file to enable and disable direct userspace access to the PMU counters. Sadly, Geert reports that on his big.LITTLE SoC ('Renesas Salvator-XS w/ R-Car H3'), the file is created for each PMU type probed, resulting in a splat during boot: \| hw perfevents: enabled with armv8_cortex_a53 PMU driver, 7 counters available \| sysctl duplicate entry: /kernel//perf_user_access \| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.16.0-rc3-arm64-renesas-00003-ge2012600810c #1420 \| Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT) \| Call trace: \| dump_backtrace+0x0/0x190 \| show_stack+0x14/0x20 \| dump_stack_lvl+0x88/0xb0 \| dump_stack+0x14/0x2c \| __register_sysctl_table+0x384/0x818 \| register_sysctl+0x20/0x28 \| armv8_pmu_init.constprop.0+0x118/0x150 \| armv8_a57_pmu_init+0x1c/0x28 \| arm_pmu_device_probe+0x1b4/0x558 \| armv8_pmu_device_probe+0x18/0x20 \| platform_probe+0x64/0xd0 \| hw perfevents: enabled with armv8_cortex_a57 PMU driver, 7 counters available Introduce a state variable to track creation of the sysctl file and ensure that it is only created once. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Fixes: `e201260081` ("arm64: perf: Add userspace counter access disable switch") Link: https://lore.kernel.org/r/CAMuHMdVcDxR9sGzc5pcnORiotonERBgc6dsXZXMd6wTvLGA9iw@mail.gmail.com Signed-off-by: Will Deacon <will@kernel.org>	2022-01-04 14:57:14 +00:00
Dan Carpenter	2da56881a7	drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check The devm_ioremap() function does not return error pointers. It returns NULL. Fixes: `036a7584be` ("drivers: perf: Add LLC-TAD perf counter support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20211217145907.GA16611@kili Signed-off-by: Will Deacon <will@kernel.org>	2022-01-04 13:58:17 +00:00
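The distinction behind the commit above (a function that signals failure with NULL versus one that returns an encoded error pointer) can be sketched in userspace. Everything here is an illustrative assumption: `is_err()` mimics the idea of the kernel's IS_ERR() range check, `fake_ioremap()` is a hypothetical stand-in for a mapping function that returns NULL on failure, and the value 12 stands in for ENOMEM.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitation of an error-pointer check: true only for
 * addresses in the top MAX_ERRNO bytes of the address space. */
static int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical mapping helper that, like the function named in the
 * commit, reports failure by returning NULL, never an error pointer. */
static void *fake_ioremap(int succeed)
{
	static int fake_register;

	return succeed ? (void *)&fake_register : NULL;
}

/* Buggy pattern: NULL is not an error pointer, so failure is missed. */
static int probe_buggy(int succeed)
{
	void *base = fake_ioremap(succeed);

	if (is_err(base))
		return -1;
	return 0;
}

/* Fixed pattern: test for NULL and return a negative error code. */
static int probe_fixed(int succeed)
{
	void *base = fake_ioremap(succeed);

	if (!base)
		return -12; /* assumed to correspond to -ENOMEM */
	return 0;
}
```

In the buggy variant a failed mapping is reported as success, and the NULL pointer would only be noticed later when it is dereferenced.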
Will Deacon	527a7f5252	perf/smmuv3: Fix unused variable warning when CONFIG_OF=n The kbuild robot reports that building the SMMUv3 PMU driver with CONFIG_OF=n results in a warning for W=1 builds: >> drivers/perf/arm_smmuv3_pmu.c:889:34: warning: unused variable 'smmu_pmu_of_match' [-Wunused-const-variable] static const struct of_device_id smmu_pmu_of_match[] = { ^ Guard the match table with #ifdef CONFIG_OF. Link: https://lore.kernel.org/r/202201041700.01KZEzhb-lkp@intel.com Fixes: `3f7be43561` ("perf/smmuv3: Add devicetree support") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Will Deacon <will@kernel.org>	2022-01-04 13:38:16 +00:00
D Scott Phillips	38e0257e0e	arm64: errata: Fix exec handling in erratum `1418040` workaround The erratum `1418040` workaround enables CNTVCT_EL1 access trapping in EL0 when executing compat threads. The workaround is applied when switching between tasks, but the need for the workaround could also change at an exec(), when a non-compat task execs a compat binary or vice versa. Apply the workaround in arch_setup_new_exec(). This leaves a small window of time between SET_PERSONALITY and arch_setup_new_exec where preemption could occur and confuse the old workaround logic that compares TIF_32BIT between prev and next. Instead, we can just read cntkctl to make sure it's in the state that the next task needs. I measured cntkctl read time to be about the same as a mov from a general-purpose register on N1. Update the workaround logic to examine the current value of cntkctl instead of the previous task's compat state. Fixes: `d49f7d7376` ("arm64: Move handling of erratum `1418040` into C code") Cc: <stable@vger.kernel.org> # 5.9.x Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211220234114.3926-1-scott@os.amperecomputing.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-22 15:18:19 +00:00
Guilherme G. Piccoli	31e833b203	arm64: Unhash early pointer print plus improve comment When facing a really early issue on DT parsing we have currently a message that shows both the physical and virtual address of the FDT. The printk pointer modifier for the virtual address shows a hashed address there unless the user provides "no_hash_pointers" parameter in the command-line. The situation in which this message shows-up is a bit more serious though: the boot process is broken, nothing can be done (even an oops is too much for this early stage) so we have this message as a last resort in order to help debug bootloader issues, for example. Hence, we hereby change that to "%px" in order to make debugging easy, there's not much information leak risk in such early boot failure. Also, we tried to improve a bit the commenting on that function, given that if kernel fails there, it just hangs forever in a cpu_relax() loop. The reason we cannot BUG/panic is that is too early to do so; thanks to Mark Brown for pointing that on IRC and thanks Robin Murphy for the good pointer hash discussion in the mailing-list. Cc: Mark Brown <broonie@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20211221155230.1532850-1-gpiccoli@igalia.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-22 11:06:11 +00:00
Xiongfeng Wang	d5624bb29f	asm-generic: introduce io_stop_wc() and add implementation for ARM64 For memory accesses with write-combining attributes (e.g. those returned by ioremap_wc()), the CPU may wait for prior accesses to be merged with subsequent ones. But in some situations, such a wait is bad for performance. We introduce io_stop_wc() to prevent the merging of write-combining memory accesses before this macro with those after it. We add an implementation for ARM64 using the DGH instruction and provide a NOP implementation for other architectures. Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Suggested-by: Will Deacon <will@kernel.org> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20211221035556.60346-1-wangxiongfeng2@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-22 10:44:53 +00:00
Catalin Marinas	dd73d18e7f	arm64: Ensure that the 'bti' macro is defined where linkage.h is included Not all .S files include asm/assembler.h, however the SYM_FUNC_* definitions invoke the 'bti' macro. Include asm/assembler.h in asm/linkage.h. Fixes: `9be34be87c` ("arm64: Add macro version of the BTI instruction") Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-17 16:20:45 +00:00
Mark Rutland	c2c529b27c	arm64: remove __dma_*_area() aliases The __dma_inv_area() and __dma_clean_area() aliases make cache.S harder to navigate, but don't gain us anything in practice. For clarity, let's remove them along with their redundant comments. The only users are __dma_map_area() and __dma_unmap_area(), which need to be position independent, and can call __pi_dcache_inval_poc() and __pi_dcache_clean_poc() directly. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20211206124715.4101571-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-15 11:19:41 +00:00
Yanteng Si	f2cefc0c2d	docs/arm64: delete a space from tagged-address-abi Commit e71e2ace5721 ("userfaultfd: do not untag user pointers") introduced a warning: linux/Documentation/arm64/tagged-address-abi.rst:52: WARNING: Unexpected indentation. Let's fix it. Signed-off-by: Yanteng Si <siyanteng@loongson.cn> Link: https://lore.kernel.org/r/20211209091922.560979-1-siyanteng@loongson.cn Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 19:01:37 +00:00
Kefeng Wang	dd03762ab6	arm64: Enable KCSAN This patch enables KCSAN for arm64, with updates to build rules to not use KCSAN for several incompatible compilation units. Recent GCC versions (at least GCC 10) make outline-atomics the default option (unlike Clang), which causes linker errors for kernel/kcsan/core.o. Disable the out-of-line atomics with no-outline-atomics to fix the linker errors. Meanwhile, as Mark said [1], some latent issues that are not just a KCSAN problem still need to be fixed, so we make KCSAN depend on EXPERT for now. Tested selftest and kcsan_test (built with GCC 11 and Clang 13), and all passed. [1] https://lkml.kernel.org/r/YadiUPpJ0gADbiHQ@FVFF77S0Q05N Acked-by: Marco Elver <elver@google.com> # kernel/kcsan Tested-by: Joey Gouly <joey.gouly@arm.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Link: https://lore.kernel.org/r/20211211131734.126874-1-wangkefeng.wang@huawei.com [catalin.marinas@arm.com: added comment to justify EXPERT] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 18:54:34 +00:00
Mark Brown	2c94ebedc8	kselftest/arm64: Add pidbench for floating point syscall cases Since it's likely to be useful for performance work with SVE let's have a pidbench that gives us some numbers for consideration. In order to ensure that we test exactly the scenario we want this is written in assembly - if system libraries use SVE this would stop us exercising the case where the process has never used SVE. We exercise three cases: - Never having used SVE. - Having used SVE once. - Using SVE after each syscall. by spinning running getpid() for a fixed number of iterations with the time measured using CNTVCT_EL0 reported on the console. This is obviously a totally unrealistic benchmark which will show the extremes of any performance variation but equally given the potential gotchas with use of FP instructions by system libraries it's good to have some concrete code shared to make it easier to compare notes on results. Testing over multiple SVE vector lengths will need to be done with vlset currently, the test could be extended to iterate over all of them if desired. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211202165107.1075259-1-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 18:41:56 +00:00
Mark Brown	12b792e5e2	arm64/fp: Add comments documenting the usage of state restore functions Add comments to help people figure out when fpsimd_bind_state_to_cpu() and fpsimd_update_current_state() are used. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211207163250.1373542-1-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 18:36:19 +00:00
Mark Brown	b77e995e3b	kselftest/arm64: Add a test program to exercise the syscall ABI Currently we don't have any coverage of the syscall ABI so let's add a very dumb test program which sets up register patterns, does a syscall and then checks that the register state after the syscall matches what we expect. The program is written in an extremely simplistic fashion with the goal of making it easy to verify that it's doing what it thinks it's doing, it is not a model of how one should write actual code. Currently we validate the general purpose, FPSIMD and SVE registers. There are other things that could be covered like FPCR and flags registers, these can be covered incrementally - my main focus at the minute is covering the ABI for the SVE registers. The program repeats the tests for all possible SVE vector lengths in case some vector length specific optimisation causes issues, as well as testing FPSIMD only. It tries two syscalls, getpid() and sched_yield(), in an effort to cover both immediate return to userspace and scheduling another task though there are no guarantees which cases will be hit. A new test directory "abi" is added to hold the test, it doesn't seem to fit well into any of the existing directories. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211210184133.320748-7-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 18:35:10 +00:00
Mark Brown	9331a60485	kselftest/arm64: Allow signal tests to trigger from a function Currently we have the facility to specify custom code to trigger a signal but none of the tests use it and for some reason the framework requires us to also specify a signal to send as a trigger in order to make use of a custom trigger. This doesn't seem to make much sense, instead allow the use of a custom trigger function without specifying a signal to inject. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211210184133.320748-6-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 18:35:10 +00:00
Mark Brown	18edbb6b32	kselftest/arm64: Parameterise ptrace vector length information SME introduces a new mode called streaming mode in which the SVE registers have a different vector length. Since the ptrace interface for this is based on the existing SVE interface prepare for supporting this by moving the regset specific configuration into struct and passing that around, allowing these tests to be reused for streaming mode. As we will also have to verify the interoperation of the SVE and streaming SVE regsets don't just iterate over an array. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211210184133.320748-5-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 18:35:10 +00:00
Mark Brown	aed34d9e52	arm64/sve: Minor clarification of ABI documentation As suggested by Luis for the SME version of this explicitly say that the vector length should be extracted from the return value of a set vector length prctl() with a bitwise and rather than just any old and. Suggested-by: Luis Machado <Luis.Machado@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211210184133.320748-4-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 18:33:44 +00:00
Mark Brown	30c43e73b3	arm64/sve: Generalise vector length configuration prctl() for SME In preparation for adding SME support update the bulk of the implementation for the vector length configuration prctl() calls to be independent of vector type. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211210184133.320748-3-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 18:33:44 +00:00
Mark Brown	97bcbee404	arm64/sve: Make sysctl interface for SVE reusable by SME The vector length configuration for SME is very similar to that for SVE so in order to allow reuse refactor the SVE configuration so that it takes the vector type from the struct ctl_table. Since there's no dedicated space for this we repurpose the extra1 field to store the vector type, this is otherwise unused for integer sysctls. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211210184133.320748-2-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 18:33:44 +00:00
Will Deacon	1609c22a8a	Merge branch 'for-next/perf-cpu' into for-next/perf * for-next/perf-cpu: arm64: perf: Support new DT compatibles arm64: perf: Simplify registration boilerplate arm64: perf: Support Denver and Carmel PMUs	2021-12-14 18:13:25 +00:00
Mark Brown	742a15b1a2	arm64: Use BTI C directly and unconditionally Now we have a macro for BTI C that looks like a regular instruction change all the users of the current BTI_C macro to just emit a BTI C directly and remove the macro. This does mean that we now unconditionally BTI annotate all assembly functions, meaning that they are worse in this respect than code generated by the compiler. The overhead should be minimal for implementations with a reasonable HINT implementation. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20211214152714.2380849-4-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 18:12:58 +00:00
Mark Brown	481ee45ce9	arm64: Unconditionally override SYM_FUNC macros Currently we only override the SYM_FUNC macros when we need to insert BTI C into them, do this unconditionally to make it more likely that we'll notice bugs in our override. Suggested-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20211214152714.2380849-3-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 18:12:58 +00:00
Mark Brown	9be34be87c	arm64: Add macro version of the BTI instruction BTI is only available from v8.5 so we need to encode it using HINT in generic code and for older toolchains. Add an assembler macro based on one written by Mark Rutland which lets us use the mnemonic and update the existing users. Suggested-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20211214152714.2380849-2-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 18:12:58 +00:00
Catalin Marinas	580b536b50	Merge 'arm64/for-next/fixes' into for-next/bti Needed for the arch/arm64/kernel/entry-ftrace.S fix. * commit 'arm64/for-next/fixes^^': arm64: ftrace: add missing BTIs arm64: kexec: use __pa_symbol(empty_zero_page) arm64: update PAC description for kernel	2021-12-14 18:11:52 +00:00
Robin Murphy	893c34b60a	arm64: perf: Support new DT compatibles Wire up the new DT compatibles so we can present appropriate PMU names to userspace for the latest and greatest CPUs. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/62d14ba12d847ec7f1fba7cb0b3b881b437e1cc5.1639490264.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2021-12-14 18:05:11 +00:00
Robin Murphy	6ac9f30bd4	arm64: perf: Simplify registration boilerplate With the trend for per-core events moving to userspace JSON, registering names for PMUv3 implementations is increasingly a pure boilerplate exercise. Let's wrap things a step further so we can generate the basic PMUv3 init function with a macro invocation, and reduce further new addition to just 2 lines each. Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/b79477ea3b97f685d00511d4ecd2f686184dca34.1639490264.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2021-12-14 18:05:11 +00:00
Thierry Reding	d4c4844a9b	arm64: perf: Support Denver and Carmel PMUs Add support for the NVIDIA Denver and Carmel PMUs using the generic PMUv3 event map for now. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Thierry Reding <treding@nvidia.com> [ rm: reorder entries alphabetically ] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/5f0f69d47acca78a9e479501aa4d8b429e23cf11.1639490264.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2021-12-14 18:05:11 +00:00
Will Deacon	8bd09b41b8	Merge branch 'for-next/perf-user-counter-access' into for-next/perf * for-next/perf-user-counter-access: Documentation: arm64: Document PMU counters access from userspace arm64: perf: Enable PMU counter userspace access for perf event arm64: perf: Add userspace counter access disable switch perf: Add a counter for number of user access events in context x86: perf: Move RDPMC event flag to a common definition	2021-12-14 13:42:22 +00:00
Will Deacon	1879a61f4a	Merge branch 'for-next/perf-smmu' into for-next/perf * for-next/perf-smmu: perf/smmuv3: Synthesize IIDR from CoreSight ID registers perf/smmuv3: Add devicetree support dt-bindings: Add Arm SMMUv3 PMCG binding	2021-12-14 13:42:17 +00:00
Will Deacon	8330904fed	Merge branch 'for-next/perf-hisi' into for-next/perf * for-next/perf-hisi: drivers/perf: hisi: Add driver for HiSilicon PCIe PMU docs: perf: Add description for HiSilicon PCIe PMU driver	2021-12-14 13:42:05 +00:00
Will Deacon	e73bc4fd78	Merge branch 'for-next/perf-cn10k' into for-next/perf * for-next/perf-cn10k: dt-bindings: perf: Add YAML schemas for Marvell CN10K LLC-TAD pmu bindings drivers: perf: Add LLC-TAD perf counter support	2021-12-14 13:41:58 +00:00
Will Deacon	fc369f925f	Merge branch 'for-next/perf-cmn' into for-next/perf * for-next/perf-cmn: perf/arm-cmn: Add debugfs topology info perf/arm-cmn: Add CI-700 Support dt-bindings: perf: arm-cmn: Add CI-700 perf/arm-cmn: Support new IP features perf/arm-cmn: Demarcate CMN-600 specifics perf/arm-cmn: Move group validation data off-stack perf/arm-cmn: Optimise DTC counter accesses perf/arm-cmn: Optimise DTM counter reads perf/arm-cmn: Refactor DTM handling perf/arm-cmn: Streamline node iteration perf/arm-cmn: Refactor node ID handling perf/arm-cmn: Drop compile-test restriction perf/arm-cmn: Account for NUMA affinity perf/arm-cmn: Fix CPU hotplug unregistration	2021-12-14 13:41:42 +00:00
Mark Rutland	053f58bab3	arm64: atomics: lse: define RETURN ops in terms of FETCH ops The FEAT_LSE atomic instructions include LD* instructions which return the original value of a memory location, and these can be used to directly implement FETCH operations. Each RETURN op is implemented as a copy of the corresponding FETCH op with a trailing instruction to generate the new value of the memory location. We only directly implement _fetch_add(), for which we have a trailing `add` instruction. As the compiler has no visibility of the `add`, this leads to less than optimal code generation when consuming the result. For example, the compiler cannot constant-fold the addition into later operations, and currently GCC 11.1.0 will compile: return __lse_atomic_sub_return(1, v) == 0; As: mov w1, #0xffffffff ldaddal w1, w2, [x0] add w1, w1, w2 cmp w1, #0x0 cset w0, eq // eq = none ret This patch improves this by replacing the `add` with C addition after the inline assembly block, e.g. ret += i; This allows the compiler to manipulate `i`. This permits the compiler to merge the `add` and `cmp` for the above, e.g. mov w1, #0xffffffff ldaddal w1, w1, [x0] cmp w1, #0x1 cset w0, eq // eq = none ret With this change the assembly for each RETURN op is identical to the corresponding FETCH op (including barriers and clobbers) so I've removed the inline assembly and rewritten each RETURN op in terms of the corresponding FETCH op, e.g. \| static inline int __lse_atomic_add_return(int i, atomic_t *v) \| { \| return __lse_atomic_fetch_add(i, v) + i; \| } The new construction does not adversely affect the common case, and before and after this patch GCC 11.1.0 can compile: __lse_atomic_add_return(i, v) As: ldaddal w0, w2, [x1] add w0, w0, w2 ... while having the freedom to do better elsewhere. This is intended as an optimization and cleanup. There should be no functional change as a result of this patch. 
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211210151410.2782645-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 13:00:33 +00:00
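The RETURN-from-FETCH construction described in the commit above can be sketched in portable C11. This is only an illustration under stated assumptions: `atomic_int` and `atomic_fetch_add()` from `<stdatomic.h>` stand in for the kernel's `atomic_t` and the LSE inline assembly, and every `sketch_*` name is hypothetical rather than a kernel function.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for __lse_atomic_fetch_add(): returns the OLD value. */
static inline int sketch_fetch_add(int i, atomic_int *v)
{
	return atomic_fetch_add(v, i);
}

/*
 * RETURN op built on the FETCH op: the new value is old + i.
 * Because the addition is plain C, the compiler can see it and fold it
 * into later operations. Signed overflow of old + i is not handled here.
 */
static inline int sketch_add_return(int i, atomic_int *v)
{
	return sketch_fetch_add(i, v) + i;
}

/* Shape of the commit's example: "did subtracting 1 bring the counter to zero?" */
static inline int sketch_dec_and_test(atomic_int *v)
{
	return sketch_add_return(-1, v) == 0;
}
```

In `sketch_dec_and_test()` the compiler is free to rewrite `old - 1 == 0` as `old == 1`, which is the `add`/`cmp` merge the commit message shows in its second assembly listing.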
Mark Rutland	8a578a759a	arm64: atomics: lse: improve constraints for simple ops We have overly conservative assembly constraints for the basic FEAT_LSE atomic instructions, and using more accurate and permissive constraints will allow for better code generation. The FEAT_LSE basic atomic instructions have come in two forms: LD{op}{order}{size} <Rs>, <Rt>, [<Rn>] ST{op}{order}{size} <Rs>, [<Rn>] The ST* forms are aliases of the LD* forms where: ST{op}{order}{size} <Rs>, [<Rn>] Is: LD{op}{order}{size} <Rs>, XZR, [<Rn>] For either form, both <Rs> and <Rn> are read but not written back to, and <Rt> is written with the original value of the memory location. Where (<Rt> == <Rs>) or (<Rt> == <Rn>), <Rt> is written after the other register value(s) are consumed. There are no UNPREDICTABLE or CONSTRAINED UNPREDICTABLE behaviours when any pair of <Rs>, <Rt>, or <Rn> are the same register. Our current inline assembly always uses <Rs> == <Rt>, treating this register as both an input and an output (using a '+r' constraint). This forces the compiler to do some unnecessary register shuffling and/or redundant value generation. For example, the compiler cannot reuse the <Rs> value, and currently GCC 11.1.0 will compile: __lse_atomic_add(1, a); __lse_atomic_add(1, b); __lse_atomic_add(1, c); As: mov w3, #0x1 mov w4, w3 stadd w4, [x0] mov w0, w3 stadd w0, [x1] stadd w3, [x2] We can improve this with more accurate constraints, separating <Rs> and <Rt>, where <Rs> is an input-only register ('r'), and <Rt> is an output-only value ('=r'). As <Rt> is written back after <Rs> is consumed, it does not need to be earlyclobber ('=&r'), leaving the compiler free to use the same register for both <Rs> and <Rt> where this is desirable. At the same time, the redundant 'r' constraint for `v` is removed, as the `+Q` constraint is sufficient. 
With this change, the above example becomes: mov w3, #0x1 stadd w3, [x0] stadd w3, [x1] stadd w3, [x2] I've made this change for the non-value-returning and FETCH ops. The RETURN ops have a multi-instruction sequence for which we cannot use the same constraints, and a subsequent patch will rewrite the RETURN ops in terms of the FETCH ops, relying on the ability for the compiler to reuse the <Rs> value. This is intended as an optimization. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211210151410.2782645-5-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 13:00:23 +00:00
Mark Rutland	5e9e43c987	arm64: atomics: lse: define ANDs in terms of ANDNOTs The FEAT_LSE atomic instructions include atomic bit-clear instructions (`ldclr` and `stclr`) which can be used to directly implement ANDNOT operations. Each AND op is implemented as a copy of the corresponding ANDNOT op with a leading `mvn` instruction to apply a bitwise NOT to the `i` argument. As the compiler has no visibility of the `mvn`, this leads to less than optimal code generation when generating `i` into a register. For example, __lse_atomic_fetch_and(0xf, v) can be compiled to: mov w1, #0xf mvn w1, w1 ldclral w1, w1, [x2] This patch improves this by replacing the `mvn` with NOT in C before the inline assembly block, e.g. i = ~i; This allows the compiler to generate `i` into a register more optimally, e.g. mov w1, #0xfffffff0 ldclral w1, w1, [x2] With this change the assembly for each AND op is identical to the corresponding ANDNOT op (including barriers and clobbers), so I've removed the inline assembly and rewritten each AND op in terms of the corresponding ANDNOT op, e.g. \| static inline void __lse_atomic_and(int i, atomic_t *v) \| { \| return __lse_atomic_andnot(~i, v); \| } This is intended as an optimization and cleanup. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211210151410.2782645-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 13:00:23 +00:00
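The AND-from-ANDNOT identity in the commit above can be checked with a small portable C11 sketch. It assumes `atomic_fetch_and()` from `<stdatomic.h>` as a stand-in for the `ldclr`-based inline assembly; the `sketch_*` names are hypothetical and not kernel functions.

```c
#include <stdatomic.h>

/*
 * Hypothetical stand-in for __lse_atomic_fetch_andnot(): clears every bit
 * of *v that is set in i (what `ldclr` does) and returns the old value.
 */
static inline int sketch_fetch_andnot(int i, atomic_int *v)
{
	return atomic_fetch_and(v, ~i);
}

/*
 * AND op built on the ANDNOT op: v & i == v & ~(~i).
 * The bitwise NOT happens in C, so a constant i is inverted at compile
 * time instead of by an `mvn` hidden inside inline assembly.
 */
static inline int sketch_fetch_and(int i, atomic_int *v)
{
	return sketch_fetch_andnot(~i, v);
}
```

For a constant such as `0xf`, the inverted mask `0xfffffff0` can then be materialised directly, matching the single `mov` in the commit's improved listing.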
Mark Rutland	ef53245060	arm64: atomics lse: define SUBs in terms of ADDs The FEAT_LSE atomic instructions include atomic ADD instructions (`stadd` and `ldadd`), but do not include atomic SUB instructions, so we must build all of the SUB operations using the ADD instructions. We open-code these today, with each SUB op implemented as a copy of the corresponding ADD op with a leading `neg` instruction in the inline assembly to negate the `i` argument. As the compiler has no visibility of the `neg`, this leads to less than optimal code generation when generating `i` into a register. For example, __lse_atomic_fetch_sub(1, v) can be compiled to: mov w1, #0x1 neg w1, w1 ldaddal w1, w1, [x2] This patch improves this by replacing the `neg` with negation in C before the inline assembly block, e.g. i = -i; This allows the compiler to generate `i` into a register more optimally, e.g. mov w1, #0xffffffff ldaddal w1, w1, [x2] With this change the assembly for each SUB op is identical to the corresponding ADD op (including barriers and clobbers), so I've removed the inline assembly and rewritten each SUB op in terms of the corresponding ADD op, e.g. \| static inline void __lse_atomic_sub(int i, atomic_t *v) \| { \| __lse_atomic_add(-i, v); \| } For clarity I've moved the definition of each SUB op immediately after the corresponding ADD op, and used a single macro to create the RETURN forms of both ops. This is intended as an optimization and cleanup. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211210151410.2782645-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 13:00:23 +00:00
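The SUB-from-ADD rewrite in the commit above has the following shape in portable C11. This is a sketch only: `atomic_fetch_add()` from `<stdatomic.h>` replaces the `stadd`/`ldadd` inline assembly, and the `sketch_*` names are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for __lse_atomic_add(): add i, discard the result. */
static inline void sketch_add(int i, atomic_int *v)
{
	(void)atomic_fetch_add(v, i);
}

/*
 * SUB op built on the ADD op. The negation is ordinary C, so for a
 * constant i the compiler can load -i directly instead of emitting a
 * separate `neg`. Negating INT_MIN is not handled in this sketch.
 */
static inline void sketch_sub(int i, atomic_int *v)
{
	sketch_add(-i, v);
}

/* The FETCH form follows the same pattern and returns the old value. */
static inline int sketch_fetch_sub(int i, atomic_int *v)
{
	return atomic_fetch_add(v, -i);
}
```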
Mark Rutland	8e6082e94a	arm64: atomics: format whitespace consistently The code for the atomic ops is formatted inconsistently, and while this is not a functional problem it is rather distracting when working on them. Some ops have consistent indentation, e.g. \| #define ATOMIC_OP_ADD_RETURN(name, mb, cl...) \ \| static inline int __lse_atomic_add_return##name(int i, atomic_t *v) \ \| { \ \| u32 tmp; \ \| \ \| asm volatile( \ \| __LSE_PREAMBLE \ \| " ldadd" #mb " %w[i], %w[tmp], %[v]\n" \ \| " add %w[i], %w[i], %w[tmp]" \ \| : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \ \| : "r" (v) \ \| : cl); \ \| \ \| return i; \ \| } While others have negative indentation for some lines, and/or have misaligned trailing backslashes, e.g. \| static inline void __lse_atomic_##op(int i, atomic_t *v) \ \| { \ \| asm volatile( \ \| __LSE_PREAMBLE \ \| " " #asm_op " %w[i], %[v]\n" \ \| : [i] "+r" (i), [v] "+Q" (v->counter) \ \| : "r" (v)); \ \| } This patch makes the indentation consistent and also aligns the trailing backslashes. This makes the code easier to read for those (like myself) who are easily distracted by these inconsistencies. This is intended as a cleanup. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211210151410.2782645-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 13:00:23 +00:00
Qi Liu	8404b0fbc7	drivers/perf: hisi: Add driver for HiSilicon PCIe PMU PCIe PMU Root Complex Integrated End Point(RCiEP) device is supported to sample bandwidth, latency, buffer occupation etc. Each PMU RCiEP device monitors multiple Root Ports, and each RCiEP is registered as a PMU in /sys/bus/event_source/devices, so users can select the target PMU, and use filters for further settings. Filtering options contain: event - select the event. port - select target Root Ports. Information of Root Ports is shown under sysfs. bdf - select requester_id of target EP device. trig_len - set trigger condition for starting event statistics. trig_mode - set trigger mode. 0 means statistics start when bigger than the trigger condition, and 1 means smaller. thr_len - set threshold for statistics. thr_mode - set threshold mode. 0 means count when bigger than threshold, and 1 means smaller. Acked-by: Krzysztof Wilczyński <kw@linux.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/20211202080633.2919-3-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2021-12-14 12:30:26 +00:00
Qi Liu	c8602008e2	docs: perf: Add description for HiSilicon PCIe PMU driver PCIe PMU Root Complex Integrated End Point(RCiEP) device is supported on HiSilicon HIP09 platform. Document it to provide guidance on how to use it. Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: https://lore.kernel.org/r/20211202080633.2919-2-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2021-12-14 12:30:26 +00:00
Bhaskara Budiredla	4cbf47728f	dt-bindings: perf: Add YAML schemas for Marvell CN10K LLC-TAD pmu bindings Add device tree bindings for Last-level-cache Tag-and-data (LLC-TAD) unit PMU for Marvell CN10K SoCs. Signed-off-by: Bhaskara Budiredla <bbudiredla@marvell.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20211115043506.6679-3-bbudiredla@marvell.com Signed-off-by: Will Deacon <will@kernel.org>	2021-12-14 12:23:01 +00:00
Bhaskara Budiredla	036a7584be	drivers: perf: Add LLC-TAD perf counter support This driver adds support for Last-level cache tag-and-data unit (LLC-TAD) PMU that is featured in some of the Marvell's CN10K infrastructure silicons. The LLC is divided into 2N slices distributed across N Mesh tiles in a single-socket configuration. The driver always configures the same counter for all of the TADs. The user would end up effectively reserving one of eight counters in every TAD to look across all TADs. The occurrences of events are aggregated and presented to the user at the end of an application run. The driver does not provide a way for the user to partition TADs so that different TADs are used for different applications. The event counters are zeroed to start event counting to avoid any rollover issues. TAD perf counters are 64-bit, so it's not currently possible to overflow event counters at current mesh and core frequencies. To measure tad pmu events use perf tool stat command. For instance: perf stat -e tad_dat_msh_in_dss,tad_req_msh_out_any <application> perf stat -e tad_alloc_any,tad_hit_any,tad_tag_rd <application> Signed-off-by: Bhaskara Budiredla <bbudiredla@marvell.com> Link: https://lore.kernel.org/r/20211115043506.6679-2-bbudiredla@marvell.com Signed-off-by: Will Deacon <will@kernel.org>	2021-12-14 12:23:01 +00:00
Ard Biesheuvel	2c54b423cf	arm64/xor: use EOR3 instructions when available Use the EOR3 instruction to implement xor_blocks() if the instruction is available, which is the case if the CPU implements the SHA-3 extension. This is about 20% faster on Apple M1 when using the 5-way version. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20211213140252.2856053-1-ardb@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 12:14:26 +00:00
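As a rough model of why EOR3 helps here: the instruction XORs three operands at once, so a five-way block XOR needs two three-way operations per element instead of four two-way ones. The scalar C below is only a sketch of that arithmetic; EOR3 itself operates on SIMD vector registers, and `sketch_eor3()` and `sketch_xor_5()` are hypothetical names with a hypothetical signature, not the kernel's xor_blocks() interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of EOR3: a three-way XOR that the hardware does in one instruction. */
static inline uint64_t sketch_eor3(uint64_t a, uint64_t b, uint64_t c)
{
	return a ^ b ^ c;
}

/*
 * Hypothetical 5-way block XOR: p1[n] ^= p2[n] ^ p3[n] ^ p4[n] ^ p5[n].
 * Each word takes two three-way XORs rather than four two-way XORs.
 */
static void sketch_xor_5(size_t words, uint64_t *p1, const uint64_t *p2,
			 const uint64_t *p3, const uint64_t *p4,
			 const uint64_t *p5)
{
	for (size_t n = 0; n < words; n++) {
		uint64_t t = sketch_eor3(p1[n], p2[n], p3[n]);

		p1[n] = sketch_eor3(t, p4[n], p5[n]);
	}
}
```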
Robin Murphy	df457ca973	perf/smmuv3: Synthesize IIDR from CoreSight ID registers The SMMU_PMCG_IIDR register was not present in older revisions of the Arm SMMUv3 spec. On Arm Ltd. implementations, the IIDR value consists of fields from several PIDR registers, allowing us to present a standardized identifier to userspace. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Link: https://lore.kernel.org/r/20211117144844.241072-4-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>	2021-12-14 12:09:52 +00:00
Jean-Philippe Brucker	3f7be43561	perf/smmuv3: Add devicetree support Add device-tree support to the SMMUv3 PMCG driver. Signed-off-by: Jay Chen <jkchen@linux.alibaba.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20211117144844.241072-3-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>	2021-12-14 12:09:52 +00:00
Jean-Philippe Brucker	2704e75943	dt-bindings: Add Arm SMMUv3 PMCG binding Add binding for the Arm SMMUv3 PMU. Each node represents a PMCG, and is placed as a sibling node of the SMMU. Although the PMCGs registers may be within the SMMU MMIO region, they are separate devices, and there can be multiple PMCG devices for each SMMU (for example one for the TCU and one for each TBU). Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20211117144844.241072-2-jean-philippe@linaro.org Signed-off-by: Will Deacon <will@kernel.org>	2021-12-14 12:09:52 +00:00
Robin Murphy	a88fa6c28b	perf/arm-cmn: Add debugfs topology info In general, detailed performance analysis will require knowledge of the SoC beyond the CMN itself - e.g. which actual CPUs/peripherals/etc. are connected to each node. However for certain development and bringup tasks it can be useful to have a quick overview of the CMN internal topology to hand too. Add a debugfs file to map this out. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/159fd4d7e19fb3c8801a8cb64ee73ec50f55903c.1638530442.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2021-12-14 12:09:28 +00:00

1058963 Commits