linux

Author	SHA1	Message	Date
Chen Lu	64a19591a2	riscv: fix misalgned trap vector base address The trap vector marked by label .Lsecondary_park must align on a 4-byte boundary, as the {m,s}tvec is defined to require 4-byte alignment. Signed-off-by: Chen Lu <181250012@smail.nju.edu.cn> Reviewed-by: Anup Patel <anup.patel@wdc.com> Fixes: `e011995e82` ("RISC-V: Move relocate and few other functions out of __init") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>	2021-10-27 13:08:01 -07:00
Vincent Whitchurch	890d335613	virtio-ring: fix DMA metadata flags The flags are currently overwritten, leading to the wrong direction being passed to the DMA unmap functions. Fixes: `72b5e89587` ("virtio-ring: store DMA metadata in desc_extra for split virtqueue") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Link: https://lore.kernel.org/r/20211026133100.17541-1-vincent.whitchurch@axis.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-10-27 15:54:34 -04:00
Trond Myklebust	01d29f87fc	NFSv4: Fix a regression in nfs_set_open_stateid_locked() If we already hold open state on the client, yet the server gives us a completely different stateid to the one we already hold, then we currently treat it as if it were an out-of-sequence update, and wait for 5 seconds for other updates to come in. This commit fixes the behaviour so that we immediately start processing of the new stateid, and then leave it to the call to nfs4_test_and_free_stateid() to decide what to do with the old stateid. Fixes: `b4868b44c5` ("NFSv4: Wait for stateid updates after CLOSE/OPEN_DOWNGRADE") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2021-10-27 15:24:46 -04:00
Adrian Hunter	624ff63abf	perf intel-pt: Support itrace d+o option to direct debug log to stdout It can be useful to see debug output in between normal output. Add support for AUXTRACE_LOG_FLG_USE_STDOUT to Intel PT. Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20211027080334.365596-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-10-27 16:21:01 -03:00
Adrian Hunter	4b2b2c6a7d	perf auxtrace: Add itrace d+o option to direct debug log to stdout It can be useful to see debug output in between normal output. Add 'o' to the flags of debug option 'd', so that '--itrace=d+o' can specify output of the debug log to stdout. Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20211027080334.365596-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-10-27 16:20:45 -03:00
Adrian Hunter	c3afd6e50f	perf dlfilter: Add dlfilter-show-cycles Add a new dlfilter to show cycles. Cycle counts are accumulated per CPU (or per thread if CPU is not recorded) from IPC information, and printed together with the change since the last print, at the start of each line. Separate counts are kept for branches, instructions or other events. Note also, the itrace A option can be useful to provide higher granularity cycle information. Example: $ perf record -e intel_pt/cyc/u uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.044 MB perf.data ] $ perf script --itrace=A --call-trace --dlfilter dlfilter-show-cycles.so --deltatime \| head 0 perf-exec 8509 [001] 0.000000000: psb offs: 0 0 perf-exec 8509 [001] 0.000000000: cbr: 42 freq: 4219 MHz (156%) 833 833 uname 8509 [001] 0.000047689: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _start 833 uname 8509 [001] 0.000003261: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _dl_start 2015 1182 uname 8509 [001] 0.000000282: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _dl_start 2676 661 uname 8509 [001] 0.000002629: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _dl_start 3612 936 uname 8509 [001] 0.000001232: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _dl_start 4579 967 uname 8509 [001] 0.000002519: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _dl_start 6145 1566 uname 8509 [001] 0.000001050: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _dl_setup_hash 6239 94 uname 8509 [001] 0.000000023: (/usr/lib/x86_64-linux-gnu/ld-2.31.so ) _dl_sysdep_start Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20211027080334.365596-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-10-27 16:20:30 -03:00
Adrian Hunter	f2b91386ff	perf intel-pt: Support itrace A option to approximate IPC Normally, for cycle-acccurate mode, IPC values are an exact number of instructions and cycles. Due to the granularity of timestamps, that happens only when a CYC packet correlates to the event. Support the itrace 'A' option, to use instead, the number of cycles associated with the current timestamp. This provides IPC information for every change of timestamp, but at the expense of accuracy. Due to the granularity of timestamps, the actual number of cycles increases even though the cycles reported does not. The number of instructions is known, but if IPC is reported, cycles can be too low and so IPC is too high. Note that inaccuracy decreases as the period of sampling increases i.e. if the number of cycles is too low by a small amount, that becomes less significant if the number of cycles is large. Furthermore, it can be used in conjunction with dlfilter-show-cycles.so to provide higher granularity cycle information. Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20211027080334.365596-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-10-27 16:20:18 -03:00
Adrian Hunter	b6778fe1bb	perf auxtrace: Add itrace A option to approximate IPC Add an option to specify that synthesized IPC can be approximate, rather than completely accurate. Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20211027080334.365596-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-10-27 16:20:08 -03:00
Adrian Hunter	cf14013b6c	perf auxtrace: Add missing Z option to ITRACE_HELP ITRACE_HELP is used by perf commands to display help text for the --itrace option. Add missing Z option. Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20211027080334.365596-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2021-10-27 16:19:58 -03:00
Arnd Bergmann	f25c0515c5	net: sched: gred: dynamically allocate tc_gred_qopt_offload The tc_gred_qopt_offload structure has grown too big to be on the stack for 32-bit architectures after recent changes. net/sched/sch_gred.c:903:13: error: stack frame size (1180) exceeds limit (1024) in 'gred_destroy' [-Werror,-Wframe-larger-than] net/sched/sch_gred.c:310:13: error: stack frame size (1212) exceeds limit (1024) in 'gred_offload' [-Werror,-Wframe-larger-than] Use dynamic allocation per qdisc to avoid this. Fixes: `50dc9a8572` ("net: sched: Merge Qdisc::bstats and Qdisc::cpu_bstats data types") Fixes: `67c9e6270f` ("net: sched: Protect Qdisc::bstats with u64_stats") Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20211026100711.nalhttf6mbe6sudx@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-27 12:06:52 -07:00
Wang Hai	6f7c886911	usbnet: fix error return code in usbnet_probe() Return error code if usb_maxpacket() returns 0 in usbnet_probe() Fixes: `397430b50a` ("usbnet: sanity check for maxpacket") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Reviewed-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20211026124015.3025136-1-wanghai38@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-27 12:06:15 -07:00
Andrii Nakryiko	03e6a7a940	Merge branch 'selftests/bpf: parallel mode improvement' Yucong Sun says: ==================== Several patches to improve parallel execution mode, updating vmtest.sh and fixed two previously dropped patches according to feedback. ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-10-27 11:59:43 -07:00
Yucong Sun	e1ef62a4dd	selftests/bpf: Adding a namespace reset for tc_redirect This patch delete ns_src/ns_dst/ns_redir namespaces before recreating them, making the test more robust. Signed-off-by: Yucong Sun <sunyucong@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211025223345.2136168-5-fallentree@fb.com	2021-10-27 11:59:02 -07:00
Yucong Sun	9e7240fb2d	selftests/bpf: Fix attach_probe in parallel mode This patch makes attach_probe uses its own method as attach point, avoiding conflict with other tests like bpf_cookie. Signed-off-by: Yucong Sun <sunyucong@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211025223345.2136168-4-fallentree@fb.com	2021-10-27 11:59:02 -07:00
Yucong Sun	547208a386	selfetests/bpf: Update vmtest.sh defaults Increase memory to 4G, 8 SMP core with host cpu passthrough. This make it run faster in parallel mode and more likely to succeed. Signed-off-by: Yucong Sun <sunyucong@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211025223345.2136168-2-fallentree@fb.com	2021-10-27 11:58:44 -07:00
Jakub Kicinski	4796e2518a	Merge branch 'two-reverts-to-calm-down-devlink-discussion' Leon Romanovsky says: ==================== Two reverts to calm down devlink discussion Two reverts as was discussed in [1], fast, easy and wrong in long run solution to syzkaller bug [2]. [1] https://lore.kernel.org/all/20211026120234.3408fbcc@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com [2] https://lore.kernel.org/netdev/000000000000af277405cf0a7ef0@google.com/ ==================== Link: https://lore.kernel.org/r/cover.1635276828.git.leonro@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-27 11:58:11 -07:00
Leon Romanovsky	c5e0321e43	Revert "devlink: Remove not-executed trap policer notifications" This reverts commit `22849b5ea5` as it revealed that mlxsw and netdevsim (copy/paste from mlxsw) reregisters devlink objects during another devlink user triggered command. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-27 11:58:07 -07:00
Leon Romanovsky	fb9d19c2d8	Revert "devlink: Remove not-executed trap group notifications" This reverts commit `8bbeed4858` as it revealed that mlxsw and netdevsim (copy/paste from mlxsw) reregisters devlink objects during another devlink user triggered command. Fixes: `22849b5ea5` ("devlink: Remove not-executed trap policer notifications") Reported-by: syzbot+93d5accfaefceedf43c1@syzkaller.appspotmail.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-10-27 11:58:07 -07:00
Chunfeng Yun	7ddae8c779	usb: mtu3: enable wake-up interrupt after runtime_suspend called Use the new API dev_pm_set_dedicated_wake_irq_reverse() to request dedicated wake-up interrupt, due to we want to enable the wake IRQ after running ->runtime_suspend(). Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2021-10-27 20:49:32 +02:00
Chunfeng Yun	0537282d3b	usb: xhci-mtk: enable wake-up interrupt after runtime_suspend called Use new function dev_pm_set_dedicated_wake_irq_reverse() to request dedicated wake-up interrupt, due to we want to enable the wake IRQ after running ->runtime_suspend(). Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2021-10-27 20:49:32 +02:00
Chunfeng Yun	259714100d	PM / wakeirq: support enabling wake-up irq after runtime_suspend called When the dedicated wake IRQ is level trigger, and it uses the device's low-power status as the wakeup source, that means if the device is not in low-power state, the wake IRQ will be triggered if enabled; For this case, need enable the wake IRQ after running the device's ->runtime_suspend() which make it enter low-power state. e.g. Assume the wake IRQ is a low level trigger type, and the wakeup signal comes from the low-power status of the device. The wakeup signal is low level at running time (0), and becomes high level when the device enters low-power state (runtime_suspend (1) is called), a wakeup event at (2) make the device exit low-power state, then the wakeup signal also becomes low level. ------------------ \| ^ ^\| ---------------- \| \| -------------- \|<---(0)--->\|<--(1)--\| (3) (2) (4) if enable the wake IRQ before running runtime_suspend during (0), a wake IRQ will arise, it causes resume immediately; it works if enable wake IRQ ( e.g. at (3) or (4)) after running ->runtime_suspend(). This patch introduces a new status WAKE_IRQ_DEDICATED_REVERSE to optionally support enabling wake IRQ after running ->runtime_suspend(). Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2021-10-27 20:49:32 +02:00
Krzysztof Wilczyński	fd1ae23b49	PCI: Prefer 'unsigned int' over bare 'unsigned' The bare "unsigned" type implicitly means "unsigned int", but the preferred coding style is to use the complete type name. Update the bare use of "unsigned" to the preferred "unsigned int". No change to functionality intended. See `a1ce18e4f9` ("checkpatch: warn on bare unsigned or signed declarations without int"). Link: https://lore.kernel.org/r/20211013014136.1117543-1-kw@linux.com Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2021-10-27 13:41:22 -05:00
Rafael J. Wysocki	031eda1840	Update devfreq next for v5.16 Detailed description for this pull request 1. Add minor update for exynos-ppmu devfreq-event driver - Devicetree naming convention requires the device node manes to use hyphens instead of underscope. In order to support this requirement, changes the code with hyphens. - Simplify parsing event-type from devicetree without behavior changes 2. Strengthen check for freq_table in devfreq core - Check whether both freq_table is not NULL and size of freq_table is not zero in order to prevent the error by mistake of devfreq driver developer. -----BEGIN PGP SIGNATURE----- iQJKBAABCgA0FiEEsSpuqBtbWtRe4rLGnM3fLN7rz1MFAmF449YWHGN3MDAuY2hv aUBzYW1zdW5nLmNvbQAKCRCczd8s3uvPU9dlD/9hiVqplSwtW2BKWPkkedJLnWFp ji+OrCzuIk4HK8U5Yzejs6qoU4ZhQd7W6jtLvaN8ptJ1lgbe8x3IYC9WBqjBNHrC Vc0Rl+8mDB1AzAHk4veQN0JQlmcm+0CoAx00CdjrL8pFcxdr8AKIZ0fdK5LFo5Ks GOEa/iCRqpCBWXoA5PIVENbD8W57ZUOTaLoaHHJbBmX04E6lWJk/u+NKXWhJNdxn awKpA36tK3wYIbtJuAqv2fWyAIOaHRPUmP0KDJBAqWYMRtQ7LzTrPfDdzmZbI5fX wg0UFu2H9tmH7fuOK4ZjSLIqwyzljG3Vj3NxevPcnBF1IQ2rKLETtZ0zy6pONFSK 396ZV+LaRjln0dMwjWJrYLsICqiCxGIDhcjSfCA8+2fFAifkeE6hyGhmt7FAcKoc LqDeE3DAhdAT4hGEo35s8HislMC/svnA/WmrY2DzRGXC9O7qKqFrPsrGiGtjwrGo vEsyIqLvmsY33GIJx66B/kdkhqjkV76qWS+cW+QCBdvp9qTSCwCMI5IjZZpdSMf8 wZtWUe2neNCBDSY0B5q68+Zad/D40+cvckGtac13JGh3qDQpvZBTM6uGxfT09rxf fk7rlY3Hs9Lwq2W1uFEVTG9PROCNd0GBaQwr4ZvB/iaIBFR0woiqC098BZKC8Fmj TTaZP8Gf23b1bcJZ7A== =KYy7 -----END PGP SIGNATURE----- Merge tag 'devfreq-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux Pull devfreq updates for v5.16 from Chanwoo Choi: "1. Minor update for exynos-ppmu devfreq-event driver - Devicetree naming convention requires the device node names to use hyphens instead of underlines. In order to support this requirement, changes the code with hyphens. - Simplify parsing event-type from devicetree without behavior changes. 2. Strengthen check for freq_table in devfreq core - Check whether both freq_table is not NULL and size of freq_table is not zero in order to prevent the error by mistake of devfreq driver developer. * tag 'devfreq-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux: PM / devfreq: Strengthen check for freq_table devfreq: exynos-ppmu: simplify parsing event-type from DT devfreq: exynos-ppmu: use node names with hyphens	2021-10-27 20:40:17 +02:00
Rafael J. Wysocki	d5a8fb654c	perf: qcom_l2_pmu: ACPI: Use ACPI_COMPANION() directly The ACPI_HANDLE() macro is a wrapper arond the ACPI_COMPANION() macro and the ACPI handle produced by the former comes from the ACPI device object produced by the latter, so it is way more straightforward to evaluate the latter directly instead of passing the handle produced by the former to acpi_bus_get_device(). Modify l2_cache_pmu_probe_cluster() accordingly (no intentional functional impact). While at it, rename the ACPI device pointer to adev for more clarity. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2021-10-27 20:38:22 +02:00
Christoph Hellwig	06606646af	ACPI: APEI: mark apei_hest_parse() static apei_hest_parse() is only used in hest.c, so mark it static. Signed-off-by: Christoph Hellwig <hch@lst.de> [ rjw: Minor subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2021-10-27 20:34:47 +02:00
Shuai Xue	bf7fc0c369	ACPI: APEI: EINJ: Relax platform response timeout to 1 second When injecting an error into the platform, the OSPM executes an EXECUTE_OPERATION action to instruct the platform to begin the injection operation. And then, the OSPM busy waits for a while by continually executing CHECK_BUSY_STATUS action until the platform indicates that the operation is complete. More specifically, the platform is limited to respond within 1 millisecond right now. This is too strict for some platforms. For example, in Arm platform, when injecting a Processor Correctable error, the OSPM will warn: Firmware does not respond in time. And a message is printed on the console: echo: write error: Input/output error We observe that the waiting time for DDR error injection is about 10 ms and that for PCIe error injection is about 500 ms in Arm platform. In this patch, we relax the response timeout to 1 second. Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2021-10-27 20:33:26 +02:00
Alexei Starovoitov	f9d532fc5d	Merge branch 'bpf: use 32bit safe version of u64_stats' Eric Dumazet says: ==================== From: Eric Dumazet <edumazet@google.com> Two first patches fix bugs added in 5.1 and 5.5 Third patch replaces the u64 fields in struct bpf_prog_stats with u64_stats_t ones to avoid possible sampling errors, in case of load/store stearing. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2021-10-27 11:13:53 -07:00
Eric Dumazet	61a0abaee2	bpf: Use u64_stats_t in struct bpf_prog_stats Commit `316580b69d` ("u64_stats: provide u64_stats_t type") fixed possible load/store tearing on 64bit arches. For instance the following C code stats->nsecs += sched_clock() - start; Could be rightfully implemented like this by a compiler, confusing concurrent readers a lot: stats->nsecs += sched_clock(); // arbitrary delay stats->nsecs -= start; Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211026214133.3114279-4-eric.dumazet@gmail.com	2021-10-27 11:13:52 -07:00
Eric Dumazet	d979617aa8	bpf: Fixes possible race in update_prog_stats() for 32bit arches It seems update_prog_stats() suffers from same issue fixed in the prior patch: As it can run while interrupts are enabled, it could be re-entered and the u64_stats syncp could be mangled. Fixes: `fec56f5890` ("bpf: Introduce BPF trampoline") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211026214133.3114279-3-eric.dumazet@gmail.com	2021-10-27 11:13:52 -07:00
Eric Dumazet	f941eadd8d	bpf: Avoid races in __bpf_prog_run() for 32bit arches __bpf_prog_run() can run from non IRQ contexts, meaning it could be re entered if interrupted. This calls for the irq safe variant of u64_stats_update_{begin\|end}, or risk a deadlock. This patch is a nop on 64bit arches, fortunately. syzbot report: WARNING: inconsistent lock state 5.12.0-rc3-syzkaller #0 Not tainted -------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. udevd/4013 [HC0[0]:SC0[0]:HE1:SE1] takes: ff7c9dec (&(&pstats->syncp)->seq){+.?.}-{0:0}, at: sk_filter include/linux/filter.h:867 [inline] ff7c9dec (&(&pstats->syncp)->seq){+.?.}-{0:0}, at: do_one_broadcast net/netlink/af_netlink.c:1468 [inline] ff7c9dec (&(&pstats->syncp)->seq){+.?.}-{0:0}, at: netlink_broadcast_filtered+0x27c/0x4fc net/netlink/af_netlink.c:1520 {IN-SOFTIRQ-W} state was registered at: lock_acquire.part.0+0xf0/0x41c kernel/locking/lockdep.c:5510 lock_acquire+0x6c/0x74 kernel/locking/lockdep.c:5483 do_write_seqcount_begin_nested include/linux/seqlock.h:520 [inline] do_write_seqcount_begin include/linux/seqlock.h:545 [inline] u64_stats_update_begin include/linux/u64_stats_sync.h:129 [inline] bpf_prog_run_pin_on_cpu include/linux/filter.h:624 [inline] bpf_prog_run_clear_cb+0x1bc/0x270 include/linux/filter.h:755 run_filter+0xa0/0x17c net/packet/af_packet.c:2031 packet_rcv+0xc0/0x3e0 net/packet/af_packet.c:2104 dev_queue_xmit_nit+0x2bc/0x39c net/core/dev.c:2387 xmit_one net/core/dev.c:3588 [inline] dev_hard_start_xmit+0x94/0x518 net/core/dev.c:3609 sch_direct_xmit+0x11c/0x1f0 net/sched/sch_generic.c:313 qdisc_restart net/sched/sch_generic.c:376 [inline] __qdisc_run+0x194/0x7f8 net/sched/sch_generic.c:384 qdisc_run include/net/pkt_sched.h:136 [inline] qdisc_run include/net/pkt_sched.h:128 [inline] __dev_xmit_skb net/core/dev.c:3795 [inline] __dev_queue_xmit+0x65c/0xf84 net/core/dev.c:4150 dev_queue_xmit+0x14/0x18 net/core/dev.c:4215 neigh_resolve_output net/core/neighbour.c:1491 [inline] neigh_resolve_output+0x170/0x228 net/core/neighbour.c:1471 neigh_output include/net/neighbour.h:510 [inline] ip6_finish_output2+0x2e4/0x9fc net/ipv6/ip6_output.c:117 __ip6_finish_output net/ipv6/ip6_output.c:182 [inline] __ip6_finish_output+0x164/0x3f8 net/ipv6/ip6_output.c:161 ip6_finish_output+0x2c/0xb0 net/ipv6/ip6_output.c:192 NF_HOOK_COND include/linux/netfilter.h:290 [inline] ip6_output+0x74/0x294 net/ipv6/ip6_output.c:215 dst_output include/net/dst.h:448 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] NF_HOOK include/linux/netfilter.h:295 [inline] mld_sendpack+0x2a8/0x7e4 net/ipv6/mcast.c:1679 mld_send_cr net/ipv6/mcast.c:1975 [inline] mld_ifc_timer_expire+0x1e8/0x494 net/ipv6/mcast.c:2474 call_timer_fn+0xd0/0x570 kernel/time/timer.c:1431 expire_timers kernel/time/timer.c:1476 [inline] __run_timers kernel/time/timer.c:1745 [inline] run_timer_softirq+0x2e4/0x384 kernel/time/timer.c:1758 __do_softirq+0x204/0x7ac kernel/softirq.c:345 do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline] invoke_softirq kernel/softirq.c:228 [inline] __irq_exit_rcu+0x1d8/0x200 kernel/softirq.c:422 irq_exit+0x10/0x3c kernel/softirq.c:446 __handle_domain_irq+0xb4/0x120 kernel/irq/irqdesc.c:692 handle_domain_irq include/linux/irqdesc.h:176 [inline] gic_handle_irq+0x84/0xac drivers/irqchip/irq-gic.c:370 __irq_svc+0x5c/0x94 arch/arm/kernel/entry-armv.S:205 debug_smp_processor_id+0x0/0x24 lib/smp_processor_id.c:53 rcu_read_lock_held_common kernel/rcu/update.c:108 [inline] rcu_read_lock_sched_held+0x24/0x7c kernel/rcu/update.c:123 trace_lock_acquire+0x24c/0x278 include/trace/events/lock.h:13 lock_acquire+0x3c/0x74 kernel/locking/lockdep.c:5481 rcu_lock_acquire include/linux/rcupdate.h:267 [inline] rcu_read_lock include/linux/rcupdate.h:656 [inline] avc_has_perm_noaudit+0x6c/0x260 security/selinux/avc.c:1150 selinux_inode_permission+0x140/0x220 security/selinux/hooks.c:3141 security_inode_permission+0x44/0x60 security/security.c:1268 inode_permission.part.0+0x5c/0x13c fs/namei.c:521 inode_permission fs/namei.c:494 [inline] may_lookup fs/namei.c:1652 [inline] link_path_walk.part.0+0xd4/0x38c fs/namei.c:2208 link_path_walk fs/namei.c:2189 [inline] path_lookupat+0x3c/0x1b8 fs/namei.c:2419 filename_lookup+0xa8/0x1a4 fs/namei.c:2453 user_path_at_empty+0x74/0x90 fs/namei.c:2733 do_readlinkat+0x5c/0x12c fs/stat.c:417 __do_sys_readlink fs/stat.c:450 [inline] sys_readlink+0x24/0x28 fs/stat.c:447 ret_fast_syscall+0x0/0x2c arch/arm/mm/proc-v7.S:64 0x7eaa4974 irq event stamp: 298277 hardirqs last enabled at (298277): [<802000d0>] no_work_pending+0x4/0x34 hardirqs last disabled at (298276): [<8020c9b8>] do_work_pending+0x9c/0x648 arch/arm/kernel/signal.c:676 softirqs last enabled at (298216): [<8020167c>] __do_softirq+0x584/0x7ac kernel/softirq.c:372 softirqs last disabled at (298201): [<8024dff4>] do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline] softirqs last disabled at (298201): [<8024dff4>] invoke_softirq kernel/softirq.c:228 [inline] softirqs last disabled at (298201): [<8024dff4>] __irq_exit_rcu+0x1d8/0x200 kernel/softirq.c:422 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&pstats->syncp)->seq); <Interrupt> lock(&(&pstats->syncp)->seq); * DEADLOCK * 1 lock held by udevd/4013: #0: 82b09c5c (rcu_read_lock){....}-{1:2}, at: sk_filter_trim_cap+0x54/0x434 net/core/filter.c:139 stack backtrace: CPU: 1 PID: 4013 Comm: udevd Not tainted 5.12.0-rc3-syzkaller #0 Hardware name: ARM-Versatile Express Backtrace: [<81802550>] (dump_backtrace) from [<818027c4>] (show_stack+0x18/0x1c arch/arm/kernel/traps.c:252) r7:00000080 r6:600d0093 r5:00000000 r4:82b58344 [<818027ac>] (show_stack) from [<81809e98>] (__dump_stack lib/dump_stack.c:79 [inline]) [<818027ac>] (show_stack) from [<81809e98>] (dump_stack+0xb8/0xe8 lib/dump_stack.c:120) [<81809de0>] (dump_stack) from [<81804a00>] (print_usage_bug.part.0+0x228/0x230 kernel/locking/lockdep.c:3806) r7:86bcb768 r6:81a0326c r5:830f96a8 r4:86bcb0c0 [<818047d8>] (print_usage_bug.part.0) from [<802bb1b8>] (print_usage_bug kernel/locking/lockdep.c:3776 [inline]) [<818047d8>] (print_usage_bug.part.0) from [<802bb1b8>] (valid_state kernel/locking/lockdep.c:3818 [inline]) [<818047d8>] (print_usage_bug.part.0) from [<802bb1b8>] (mark_lock_irq kernel/locking/lockdep.c:4021 [inline]) [<818047d8>] (print_usage_bug.part.0) from [<802bb1b8>] (mark_lock.part.0+0xc34/0x136c kernel/locking/lockdep.c:4478) r10:83278fe8 r9:82c6d748 r8:00000000 r7:82c6d2d4 r6:00000004 r5:86bcb768 r4:00000006 [<802ba584>] (mark_lock.part.0) from [<802bc644>] (mark_lock kernel/locking/lockdep.c:4442 [inline]) [<802ba584>] (mark_lock.part.0) from [<802bc644>] (mark_usage kernel/locking/lockdep.c:4391 [inline]) [<802ba584>] (mark_lock.part.0) from [<802bc644>] (__lock_acquire+0x9bc/0x3318 kernel/locking/lockdep.c:4854) r10:86bcb768 r9:86bcb0c0 r8:00000001 r7:00040000 r6:0000075a r5:830f96a8 r4:00000000 [<802bbc88>] (__lock_acquire) from [<802bfb90>] (lock_acquire.part.0+0xf0/0x41c kernel/locking/lockdep.c:5510) r10:00000000 r9:600d0013 r8:00000000 r7:00000000 r6:828a2680 r5:828a2680 r4:861e5bc8 [<802bfaa0>] (lock_acquire.part.0) from [<802bff28>] (lock_acquire+0x6c/0x74 kernel/locking/lockdep.c:5483) r10:8146137c r9:00000000 r8:00000001 r7:00000000 r6:00000000 r5:00000000 r4:ff7c9dec [<802bfebc>] (lock_acquire) from [<81381eb4>] (do_write_seqcount_begin_nested include/linux/seqlock.h:520 [inline]) [<802bfebc>] (lock_acquire) from [<81381eb4>] (do_write_seqcount_begin include/linux/seqlock.h:545 [inline]) [<802bfebc>] (lock_acquire) from [<81381eb4>] (u64_stats_update_begin include/linux/u64_stats_sync.h:129 [inline]) [<802bfebc>] (lock_acquire) from [<81381eb4>] (__bpf_prog_run_save_cb include/linux/filter.h:727 [inline]) [<802bfebc>] (lock_acquire) from [<81381eb4>] (bpf_prog_run_save_cb include/linux/filter.h:741 [inline]) [<802bfebc>] (lock_acquire) from [<81381eb4>] (sk_filter_trim_cap+0x26c/0x434 net/core/filter.c:149) r10:a4095dd0 r9:ff7c9dd0 r8:e44be000 r7:8146137c r6:00000001 r5:8611ba80 r4:00000000 [<81381c48>] (sk_filter_trim_cap) from [<8146137c>] (sk_filter include/linux/filter.h:867 [inline]) [<81381c48>] (sk_filter_trim_cap) from [<8146137c>] (do_one_broadcast net/netlink/af_netlink.c:1468 [inline]) [<81381c48>] (sk_filter_trim_cap) from [<8146137c>] (netlink_broadcast_filtered+0x27c/0x4fc net/netlink/af_netlink.c:1520) r10:00000001 r9:833d6b1c r8:00000000 r7:8572f864 r6:8611ba80 r5:8698d800 r4:8572f800 [<81461100>] (netlink_broadcast_filtered) from [<81463e60>] (netlink_broadcast net/netlink/af_netlink.c:1544 [inline]) [<81461100>] (netlink_broadcast_filtered) from [<81463e60>] (netlink_sendmsg+0x3d0/0x478 net/netlink/af_netlink.c:1925) r10:00000000 r9:00000002 r8:8698d800 r7:000000b7 r6:8611b900 r5:861e5f50 r4:86aa3000 [<81463a90>] (netlink_sendmsg) from [<81321f54>] (sock_sendmsg_nosec net/socket.c:654 [inline]) [<81463a90>] (netlink_sendmsg) from [<81321f54>] (sock_sendmsg+0x3c/0x4c net/socket.c:674) r10:00000000 r9:861e5dd4 r8:00000000 r7:86570000 r6:00000000 r5:86570000 r4:861e5f50 [<81321f18>] (sock_sendmsg) from [<813234d0>] (____sys_sendmsg+0x230/0x29c net/socket.c:2350) r5:00000040 r4:861e5f50 [<813232a0>] (____sys_sendmsg) from [<8132549c>] (___sys_sendmsg+0xac/0xe4 net/socket.c:2404) r10:00000128 r9:861e4000 r8:00000000 r7:00000000 r6:86570000 r5:861e5f50 r4:00000000 [<813253f0>] (___sys_sendmsg) from [<81325684>] (__sys_sendmsg net/socket.c:2433 [inline]) [<813253f0>] (___sys_sendmsg) from [<81325684>] (__do_sys_sendmsg net/socket.c:2442 [inline]) [<813253f0>] (___sys_sendmsg) from [<81325684>] (sys_sendmsg+0x58/0xa0 net/socket.c:2440) r8:80200224 r7:00000128 r6:00000000 r5:7eaa541c r4:86570000 [<8132562c>] (sys_sendmsg) from [<80200060>] (ret_fast_syscall+0x0/0x2c arch/arm/mm/proc-v7.S:64) Exception stack(0x861e5fa8 to 0x861e5ff0) 5fa0: 00000000 00000000 0000000c 7eaa541c 00000000 00000000 5fc0: 00000000 00000000 76fbf840 00000128 00000000 `0000008f` 7eaa541c 000563f8 5fe0: 00056110 7eaa53e0 00036cec 76c9bf44 r6:76fbf840 r5:00000000 r4:00000000 Fixes: `492ecee892` ("bpf: enable program stats") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211026214133.3114279-2-eric.dumazet@gmail.com	2021-10-27 11:13:52 -07:00
Joe Burton	689624f037	libbpf: Deprecate bpf_objects_list Add a flag to `enum libbpf_strict_mode' to disable the global `bpf_objects_list', preventing race conditions when concurrent threads call bpf_object__open() or bpf_object__close(). bpf_object__next() will return NULL if this option is set. Callers may achieve the same workflow by tracking bpf_objects in application code. [0] Closes: https://github.com/libbpf/libbpf/issues/293 Signed-off-by: Joe Burton <jevburton@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211026223528.413950-1-jevburton.kernel@gmail.com	2021-10-27 11:00:12 -07:00
Suzuki K Poulose	561ced0bb9	arm64: errata: Enable TRBE workaround for write to out-of-range address With the TRBE driver workaround available, enable the config symbols to be built without COMPILE_TEST Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-16-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:46:06 -06:00
Suzuki K Poulose	74b2740f57	arm64: errata: Enable workaround for TRBE overwrite in FILL mode With the workaround enabled in TRBE, enable the config entries to be built without COMPILE_TEST Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-15-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:46:04 -06:00
Suzuki K Poulose	f9efc79d0a	coresight: trbe: Work around write to out of range TRBE implementations affected by Arm erratum (2253138 or 2224489), could write to the next address after the TRBLIMITR.LIMIT, instead of wrapping to the TRBBASER. This implies that the TRBE could potentially corrupt : - A page used by the rest of the kernel/user (if the LIMIT = end of perf ring buffer) - A page within the ring buffer, but outside the driver's range. [head, head + size]. This may contain some trace data, may be consumed by the userspace. We workaround this erratum by : - Making sure that there is at least an extra PAGE space left in the TRBE's range than we normally assign. This will be additional to other restrictions (e.g, the TRBE alignment for working around TRBE_WORKAROUND_OVERWRITE_IN_FILL_MODE, where there is a minimum of PAGE_SIZE. Thus we would have 2 * PAGE_SIZE) - Adjust the LIMIT to leave the last PAGE_SIZE out of the TRBE's allowed range (i.e, TRBEBASER...TRBLIMITR.LIMIT), by : TRBLIMITR.LIMIT -= PAGE_SIZE Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linaro.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-14-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:46:01 -06:00
Suzuki K Poulose	adf35d0586	coresight: trbe: Make sure we have enough space The TRBE driver makes sure that there is enough space for a meaningful run, otherwise pads the given space and restarts the offset calculation once. But there is no guarantee that we may find space or hit "no space". Make sure that we repeat the step until, either : - We have the minimum space OR - There is NO space at all. Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linaro.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-13-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:45:59 -06:00
Suzuki K Poulose	7c2cc5e26c	coresight: trbe: Add a helper to determine the minimum buffer size For the TRBE to operate, we need a minimum space available to collect meaningful trace session. This is currently a few bytes, but we may need to extend this for working around errata. So, abstract this into a helper function. Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Leo Yan <leo.yan@linaro.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-12-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:45:57 -06:00
Suzuki K Poulose	5cb75f1880	coresight: trbe: Workaround TRBE errata overwrite in FILL mode ARM Neoverse-N2 (#2139208) and Cortex-A710(##2119858) suffers from an erratum, which when triggered, might cause the TRBE to overwrite the trace data already collected in FILL mode, in the event of a WRAP. i.e, the TRBE doesn't stop writing the data, instead wraps to the base and could write upto 3 cache line size worth trace. Thus, this could corrupt the trace at the "BASE" pointer. The workaround is to program the write pointer 256bytes from the base, such that if the erratum is triggered, it doesn't overwrite the trace data that was captured. This skipped region could be padded with ignore packets at the end of the session, so that the decoder sees a continuous buffer with some padding at the beginning. The trace data written at the base is considered lost as the limit could have been in the middle of the perf ring buffer, and jumping to the "base" is not acceptable. We set the flags already to indicate that some amount of trace was lost during the FILL event IRQ. So this is fine. One important change with the work around is, we program the TRBBASER_EL1 to current page where we are allowed to write. Otherwise, it could overwrite a region that may be consumed by the perf. Towards this, we always make sure that the "handle->head" and thus the trbe_write is PAGE_SIZE aligned, so that we can set the BASE to the PAGE base and move the TRBPTR to the 256bytes offset. Cc: Mike Leach <mike.leach@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Leo Yan <leo.yan@linaro.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-11-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:45:56 -06:00
Suzuki K Poulose	8a1065127d	coresight: trbe: Add infrastructure for Errata handling Add a minimal infrastructure to keep track of the errata affecting the given TRBE instance. Given that we have heterogeneous CPUs, we have to manage the list per-TRBE instance to be able to apply the work around as needed. Thus we will need to check if individual CPUs are affected by the erratum. We rely on the arm64 errata framework for the actual description and the discovery of a given erratum, to keep the Erratum work around at a central place and benefit from the code and the advertisement from the kernel. Though we could reuse the "this_cpu_has_cap()" to apply an erratum work around, it is a bit of a heavy operation, as it must go through the "erratum" detection check on the CPU every time it is called (e.g, scanning through a table of affected MIDRs). Since we need to do this check for every session, may be multiple times (depending on the wrok around), we could save the cycles by caching the affected errata per-CPU instance in the per-CPU struct trbe_cpudata. Since we are only interested in the errata affecting the TRBE driver, we only need to track a very few of them per-CPU. Thus we use a local mapping of the CPUCAP for the erratum to avoid bloating up a bitmap for trbe_cpudata. i.e, each arm64 TRBE erratum bit is assigned a "index" within the driver to track. Each trbe instance updates the list of affected erratum at probe time on the CPU. This makes sure that we can easily access the list of errata on a given TRBE instance without much overhead. Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-10-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:45:54 -06:00
Suzuki K Poulose	e4bc8829a7	coresight: trbe: Allow driver to choose a different alignment The TRBE hardware mandates a minimum alignment for the TRBPTR_EL1, advertised via the TRBIDR_EL1. This is used by the driver to align the buffer write head. This patch allows the driver to choose a different alignment from that of the hardware, by decoupling the alignment tracking. This will be useful for working around errata. Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linaro.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-9-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:45:52 -06:00
Suzuki K Poulose	2336a7b29b	coresight: trbe: Decouple buffer base from the hardware base We always set the TRBBASER_EL1 to the base of the virtual ring buffer. We are about to change this for working around an erratum. So, in preparation to that, allow the driver to choose a different base for the TRBBASER_EL1 (which is within the buffer range). Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Leo Yan <leo.yan@linaro.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-8-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:45:51 -06:00
Suzuki K Poulose	4585481af3	coresight: trbe: Add a helper to pad a given buffer area Refactor the helper to pad a given AUX buffer area to allow "filling" ignore packets, without moving any handle pointers. This will be useful in working around errata, where we may have to fill the buffer after a session. Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linaro.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-7-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:45:49 -06:00
Suzuki K Poulose	41c0e5b7a3	coresight: trbe: Add a helper to calculate the trace generated We collect the trace from the TRBE on FILL event from IRQ context and via update_buffer(), when the event is stopped. Let us consolidate how we calculate the trace generated into a helper. Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linaro.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20211019163153.3692640-6-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:45:46 -06:00
Suzuki K Poulose	a08025b3fe	coresight: trbe: Defer the probe on offline CPUs If a CPU is offline during the driver init, we could end up causing a kernel crash trying to register the coresight device for the TRBE instance. The trbe_cpudata for the TRBE instance is initialized only when it is probed. Otherwise, we could end up dereferencing a NULL cpudata->drvdata. e.g: [ 0.149999] coresight ete0: CPU0: ete v1.1 initialized [ 0.149999] coresight-etm4x ete_1: ETM arch init failed [ 0.149999] coresight-etm4x: probe of ete_1 failed with error -22 [ 0.150085] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050 [ 0.150085] Mem abort info: [ 0.150085] ESR = 0x96000005 [ 0.150085] EC = 0x25: DABT (current EL), IL = 32 bits [ 0.150085] SET = 0, FnV = 0 [ 0.150085] EA = 0, S1PTW = 0 [ 0.150085] Data abort info: [ 0.150085] ISV = 0, ISS = 0x00000005 [ 0.150085] CM = 0, WnR = 0 [ 0.150085] [0000000000000050] user address but active_mm is swapper [ 0.150085] Internal error: Oops: 96000005 [#1] PREEMPT SMP [ 0.150085] Modules linked in: [ 0.150085] Hardware name: FVP Base RevC (DT) [ 0.150085] pstate: 00800009 (nzcv daif -PAN +UAO -TCO BTYPE=--) [ 0.150155] pc : arm_trbe_register_coresight_cpu+0x74/0x144 [ 0.150155] lr : arm_trbe_register_coresight_cpu+0x48/0x144 ... [ 0.150237] Call trace: [ 0.150237] arm_trbe_register_coresight_cpu+0x74/0x144 [ 0.150237] arm_trbe_device_probe+0x1c0/0x2d8 [ 0.150259] platform_drv_probe+0x94/0xbc [ 0.150259] really_probe+0x1bc/0x4a8 [ 0.150266] driver_probe_device+0x7c/0xb8 [ 0.150266] device_driver_attach+0x6c/0xac [ 0.150266] __driver_attach+0xc4/0x148 [ 0.150266] bus_for_each_dev+0x7c/0xc8 [ 0.150266] driver_attach+0x24/0x30 [ 0.150266] bus_add_driver+0x100/0x1e0 [ 0.150266] driver_register+0x78/0x110 [ 0.150266] __platform_driver_register+0x44/0x50 [ 0.150266] arm_trbe_init+0x28/0x84 [ 0.150266] do_one_initcall+0x94/0x2bc [ 0.150266] do_initcall_level+0xa4/0x158 [ 0.150266] do_initcalls+0x54/0x94 [ 0.150319] do_basic_setup+0x24/0x30 [ 0.150319] kernel_init_freeable+0xe8/0x14c [ 0.150319] kernel_init+0x14/0x18c [ 0.150319] ret_from_fork+0x10/0x30 [ 0.150319] Code: f94012c8 b0004ce2 9134a442 52819801 (f9402917) [ 0.150319] ---[ end trace d23e0cfe5098535e ]--- [ 0.150346] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b Fix this by skipping the step, if we are unable to probe the CPU. Fixes: `3fbf7f011f` ("coresight: sink: Add TRBE driver") Reported-by: Bransilav Rankov <branislav.rankov@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: stable <stable@vger.kernel.org> Tested-by: Branislav Rankov <branislav.rankov@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20211014142238.2221248-1-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:45:38 -06:00
Suzuki K Poulose	bb5293e334	coresight: trbe: Fix incorrect access of the sink specific data The TRBE driver wrongly treats the aux private data as the TRBE driver specific buffer for a given perf handle, while it is the ETM PMU's event specific data. Fix this by correcting the instance to use appropriate helper. Cc: stable <stable@vger.kernel.org> Fixes: `3fbf7f011f` ("coresight: sink: Add TRBE driver") Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20210921134121.2423546-2-suzuki.poulose@arm.com [Fixed 13 character SHA down to 12] Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:45:37 -06:00
Tao Zhang	0605b89d05	coresight: etm4x: Add ETM PID for Kryo-5XX Add ETM PID for Kryo-5XX to the list of supported ETMs. Otherwise, Kryo-5XX ETMs will not be initialized successfully. e.g. This change can be verified on qrb5165-rb5 board. ETM4-ETM7 nodes will not be visible without this change. Signed-off-by: Tao Zhang <quic_taozha@quicinc.com> Link: https://lore.kernel.org/r/1632477981-13632-2-git-send-email-quic_taozha@quicinc.com Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:45:35 -06:00
Suzuki K Poulose	dcfecfa444	coresight: trbe: Prohibit trace before disabling TRBE When the TRBE generates an IRQ, we stop the TRBE, collect the trace and then reprogram the TRBE with the updated buffer pointers, whenever possible. We might also leave the TRBE disabled, if there is not enough space left in the buffer. However, we do not touch the ETE at all during all of this. This means the ETE is only disabled when the event is disabled later (via irq_work). This is incorrect, as the ETE trace is still ON without actually being captured and may be routed to the ATB (even if it is for a short duration). So, we move the CPU into trace prohibited state always before disabling the TRBE, upon entering the IRQ handler. The state is restored if the TRBE is enabled back. Otherwise the trace remains prohibited. Since, the ETM/ETE driver now controls the TRFCR_EL1 per session, the tracing can be restored/enabled back when the event is rescheduled in. Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linaro.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20210923143919.2944311-6-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:45:33 -06:00
Suzuki K Poulose	9bef9d0850	coresight: trbe: End the AUX handle on truncation When we detect that there isn't enough space left to start a meaningful session, we disable the TRBE, marking the buffer as TRUNCATED. But we delay the notification to the perf layer by perf_aux_output_end() until the event is scheduled out, triggered from the kernel perf layer. This will cause significant black outs in the trace. Now that the CoreSight PMU layer can handle a closed "AUX" handle properly, we can close the handle as soon as we detect the case, allowing the userspace to collect and re-enable the event. Also, while in the IRQ handler, move the irq_work_run() after we have updated the handle, to make sure the "TRUNCATED" flag causes the event to be disabled as soon as possible. Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20210923143919.2944311-5-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:45:31 -06:00
Suzuki K Poulose	0a5f355633	coresight: trbe: Do not truncate buffer on IRQ The TRBE driver marks the AUX buffer as TRUNCATED when we get an IRQ on FILL event. This has rather unwanted side-effect of the event being disabled when there may be more space in the ring buffer. So, instead of TRUNCATE we need a different flag to indicate that the trace may have lost a few bytes (i.e from the point of generating the FILL event until the IRQ is consumed). Anyways, the userspace must use the size from RECORD_AUX headers to restrict the "trace" decoding. Using PARTIAL flag causes the perf tool to generate the following warning: Warning: AUX data had gaps in it XX times out of YY! Are you running a KVM guest in the background? which is pointlessly scary for a user. The other remaining options are : - COLLISION - Use by SPE to indicate samples collided - Add a new flag - Specifically for CoreSight, doesn't sound so good, if we can re-use something. Given that we don't already use the "COLLISION" flag, the above behavior can be notified using this flag for CoreSight. Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: James Clark <james.clark@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Leo Yan <leo.yan@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20210923143919.2944311-4-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:45:30 -06:00
Suzuki K Poulose	7037a39d37	coresight: trbe: Fix handling of spurious interrupts On a spurious IRQ, right now we disable the TRBE and then re-enable it back, resetting the "buffer" pointers(i.e BASE, LIMIT and more importantly WRITE) to the original pointers from the AUX handle. This implies that we overwrite any trace that was written so far, (by overwriting TRBPTR) while we should have ignored the IRQ. On detecting a spurious IRQ after examining the TRBSR we simply re-enable the TRBE without touching the other parameters. Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20210923143919.2944311-3-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:45:28 -06:00
Suzuki K Poulose	85fb92353e	coresight: trbe: irq handler: Do not disable TRBE if no action is needed The IRQ handler of the TRBE driver could race against the update_buffer() in consuming the IRQ. So, if the update_buffer() gets to processing the TRBE irq, the TRBSR will be cleared. Thus by the time IRQ handler is triggered, there is nothing to do there. Handle these cases and do not disable the TRBE unnecessarily. Since the TRBSR can be read without stopping the TRBE, we can check that before disabling the TRBE. Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20210923143919.2944311-2-suzuki.poulose@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2021-10-27 11:45:27 -06:00

... 84 85 86 87 88 ...

1059871 Commits