linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-23 12:42:02 +00:00

Author	SHA1	Message	Date
Marc Zyngier	367eb095b8	Merge branch kvm-arm64/selftest/misc-6.4 into kvmarm-master/next * kvm-arm64/selftest/misc-6.4: : . : Misc selftest updates for 6.4 : : - Add comments for recently added ID registers : . KVM: selftests: Comment newly defined aarch64 ID registers Signed-off-by: Marc Zyngier <maz@kernel.org>	2023-04-21 09:39:07 +01:00
Marc Zyngier	e2e321a7d6	Merge branch kvm-arm64/selftest/lpa into kvmarm-master/next * kvm-arm64/selftest/lpa: : . : Selftest fixes addressing PTE and TTBR0_EL1 encodings for : 52bit PAs : . KVM: selftests: arm64: Fix ttbr0_el1 encoding for PA bits > 48 KVM: selftests: arm64: Fix pte encode/decode for PA bits > 48 KVM: selftests: Fixup config fragment for access_tracking_perf_test Signed-off-by: Marc Zyngier <maz@kernel.org>	2023-04-21 09:37:36 +01:00
Marc Zyngier	b22498c484	Merge branch kvm-arm64/timer-vm-offsets into kvmarm-master/next * kvm-arm64/timer-vm-offsets: (21 commits) : . : This series aims at satisfying multiple goals: : : - allow a VMM to atomically restore a timer offset for a whole VM : instead of updating the offset each time a vcpu get its counter : written : : - allow a VMM to save/restore the physical timer context, something : that we cannot do at the moment due to the lack of offsetting : : - provide a framework that is suitable for NV support, where we get : both global and per timer, per vcpu offsetting, and manage : interrupts in a less braindead way. : : Conflict resolution involves using the new per-vcpu config lock instead : of the home-grown timer lock. : . KVM: arm64: Handle 32bit CNTPCTSS traps KVM: arm64: selftests: Augment existing timer test to handle variable offset KVM: arm64: selftests: Deal with spurious timer interrupts KVM: arm64: selftests: Add physical timer registers to the sysreg list KVM: arm64: nv: timers: Support hyp timer emulation KVM: arm64: nv: timers: Add a per-timer, per-vcpu offset KVM: arm64: Document KVM_ARM_SET_CNT_OFFSETS and co KVM: arm64: timers: Abstract the number of valid timers per vcpu KVM: arm64: timers: Fast-track CNTPCT_EL0 trap handling KVM: arm64: Elide kern_hyp_va() in VHE-specific parts of the hypervisor KVM: arm64: timers: Move the timer IRQs into arch_timer_vm_data KVM: arm64: timers: Abstract per-timer IRQ access KVM: arm64: timers: Rationalise per-vcpu timer init KVM: arm64: timers: Allow save/restoring of the physical timer KVM: arm64: timers: Allow userspace to set the global counter offset KVM: arm64: Expose {un,}lock_all_vcpus() to the rest of KVM KVM: arm64: timers: Allow physical offset without CNTPOFF_EL2 KVM: arm64: timers: Use CNTPOFF_EL2 to offset the physical timer arm64: Add HAS_ECV_CNTPOFF capability arm64: Add CNTPOFF_EL2 register definition ... Signed-off-by: Marc Zyngier <maz@kernel.org>	2023-04-21 09:36:40 +01:00
Ido Schimmel	7648ac72dc	selftests: net: Add bridge neighbor suppression test Add test cases for bridge neighbor suppression, testing both per-port and per-{Port, VLAN} neighbor suppression with both ARP and NS packets. Example truncated output: # ./test_bridge_neigh_suppress.sh [...] Tests passed: 148 Tests failed: 0 Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-21 08:25:50 +01:00
Shunsuke Mie	e9c4962c5d	tools/virtio: fix build caused by virtio_ring changes Fix the build dependency for virtio_test. The virtio_ring that is used from the test requires container_of_const(). Change to use container_of.h kernel header directly and adapt related codes. Signed-off-by: Shunsuke Mie <mie@igel.co.jp> Message-Id: <20230417022037.917668-2-mie@igel.co.jp> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-04-21 03:02:35 -04:00
Rong Tao	6b27cd84a7	tools/virtio: virtio_test -h,--help should return directly When we get help information, we should return directly, and we should not execute test cases. Move the exit() directly into the help() function and remove it from case '?'. Signed-off-by: Rong Tao <rongtao@cestc.cn> Message-Id: <tencent_822CEBEB925205EA1573541CD1C2604F4805@qq.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-04-21 03:02:30 -04:00
Rong Tao	9b2b3de63c	tools/virtio: virtio_test: Fix indentation Replace eight spaces with Tab. Signed-off-by: Rong Tao <rtoax@foxmail.com> Message-Id: <tencent_89579C514BC4020324A1A4ACA44B5B95BB07@qq.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-04-21 03:02:30 -04:00
Vladimir Oltean	e6991384ac	selftests: forwarding: add a test for MAC Merge layer The MAC Merge layer (IEEE 802.3-2018 clause 99) does all the heavy lifting for Frame Preemption (IEEE 802.1Q-2018 clause 6.7.2), a TSN feature for minimizing latency. Preemptible traffic is different on the wire from normal traffic in incompatible ways. If we send a preemptible packet and the link partner doesn't support preemption, it will drop it as an error frame and we will never know. The MAC Merge layer has a control plane of its own, which can be manipulated (using ethtool) in order to negotiate this capability with the link partner (through LLDP). Actually the TLV format for LLDP solves this problem only partly, because both partners only advertise: - if they support preemption (RX and TX) - if they have enabled preemption (TX) so we cannot tell the link partner what to do - we cannot force it to enable reception of our preemptible packets. That is fully solved by the verification feature, where the local device generates some small probe frames which look like preemptible frames with no useful content, and the link partner is obliged to respond to them if it supports the standard. If the verification times out, we know that preemption isn't active in our TX direction on the link. Having clarified the definition, this selftest exercises the manual (ethtool) configuration path of 2 link partners (with and without verification), and the LLDP code path, using the openlldp project. The test also verifies the TX activity of the MAC Merge layer by sending traffic through a traffic class configured as preemptible (using mqprio). There isn't a good way to make this really portable (user space cannot find out how many traffic classes there are for a device), but I chose num_tc 4 here, that should work reasonably well. I also know that some devices (stmmac) only permit TXQ0 to be preemptible, so this is why PREEMPTIBLE_PRIO was strategically chosen as 0. Even if other hardware is more configurable, this test should cover the baseline. This is not really a "forwarding" selftest, but I put it near the other "ethtool" selftests. $ ./ethtool_mm.sh eno0 swp0 TEST: Manual configuration with verification: eno0 to swp0 [ OK ] TEST: Manual configuration with verification: swp0 to eno0 [ OK ] TEST: Manual configuration without verification: eno0 to swp0 [ OK ] TEST: Manual configuration without verification: swp0 to eno0 [ OK ] TEST: Manual configuration with failed verification: eno0 to swp0 [ OK ] TEST: Manual configuration with failed verification: swp0 to eno0 [ OK ] TEST: LLDP [ OK ] Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-20 20:03:21 -07:00
Vladimir Oltean	b5bf7126a6	selftests: forwarding: introduce helper for standard ethtool counters Counters for the MAC Merge layer and preemptible MAC have standardized so far on using structured ethtool stats as opposed to the driver specific names and meanings. Benefit from that rare opportunity and introduce a helper to lib.sh for querying standardized counters, in the hope that these will take off for other uses as well. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-20 20:03:21 -07:00
Petr Machata	8fcac79270	selftests: forwarding: generalize bail_on_lldpad from mlxsw mlxsw selftests often invoke a bail_on_lldpad() helper to make sure LLDPAD is not running, to prevent conflicts between the QoS configuration applied through TC or DCB command line tool, and the DCB configuration that LLDPAD might apply. This helper might be useful to others. Move the function to lib.sh, and parameterize to make reusable in other contexts. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-20 20:03:21 -07:00
Petr Machata	54e906f163	selftests: forwarding: sch_tbf_*: Add a pre-run hook The driver-specific wrappers of these selftests invoke bail_on_lldpad to make sure that LLDPAD doesn't trample the configuration. The function bail_on_lldpad is going to move to lib.sh in the next patch. With that, it won't be visible for the wrappers before sourcing the framework script. And after sourcing it, it is too late: the selftest will have run by then. One option might be to source NUM_NETIFS=0 lib.sh from the wrapper, but even if that worked (it might, it might not), that seems cumbersome. lib.sh is doing fair amount of stuff, and even if it works today, it does not look particularly solid as a solution. Instead, introduce a hook, sch_tbf_pre_hook(), that when available, gets invoked. Move the bail to the hook. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-20 20:03:21 -07:00
Eduard Zingerman	cbb110bc66	selftests/bpf: populate map_array_ro map for verifier_array_access test Two test cases: - "valid read map access into a read-only array 1" and - "valid read map access into a read-only array 2" Expect that map_array_ro map is filled with mock data. This logic was not taken into acount during initial test conversion. This commit modifies prog_tests/verifier.c entry point for this test to fill the map. Fixes: `a3c830ae02` ("selftests/bpf: verifier/array_access.c converted to inline assembly") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230420232317.2181776-5-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-20 16:49:16 -07:00
Eduard Zingerman	5b22f4d143	selftests/bpf: add pre bpf_prog_test_run_opts() callback for test_loader When a test case is annotated with __retval tag the test_loader engine would use libbpf's bpf_prog_test_run_opts() to do a test run of the program and compare retvals. This commit allows to perform arbitrary actions on bpf object right before test loader invokes bpf_prog_test_run_opts(). This could be used to setup some state for program execution, e.g. fill some maps. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230420232317.2181776-4-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-20 16:49:16 -07:00
Eduard Zingerman	7cdddb99e4	selftests/bpf: fix __retval() being always ignored Florian Westphal found a bug in and suggested a fix for test_loader.c processing of __retval tag. Because of this bug the function test_loader.c:do_prog_test_run() never executed and all __retval test tags were ignored. If this bug is fixed a number of test cases from progs/verifier_array_access.c fail with retval not matching the expected value. This test was recently converted to use test_loader.c and inline assembly in [1]. When doing the conversion I missed the important detail of test_verifier.c operation: when it creates fixup_map_array_ro, fixup_map_array_wo and fixup_map_array_small it populates these maps with a dummy record. Disabling the __retval checks for the affected verifier_array_access in this commit to avoid false-postivies in any potential bisects. The issue is addressed in the next patch. I verified that the __retval tags are now respected by changing expected return values for all tests annotated with __retval, and checking that these tests started to fail. [1] https://lore.kernel.org/bpf/20230325025524.144043-1-eddyz87@gmail.com/ Fixes: `19a8e06f5f` ("selftests/bpf: Tests execution support for test_loader.c") Reported-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/bpf/f4c4aee644425842ee6aa8edf1da68f0a8260e7c.camel@gmail.com/T/ Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230420232317.2181776-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-20 16:49:16 -07:00
Eduard Zingerman	7c4b96c000	selftests/bpf: disable program test run for progs/refcounted_kptr.c Florian Westphal found a bug in test_loader.c processing of __retval tag. Because of this bug the function test_loader.c:do_prog_test_run() never executed and all __retval test tags were ignored. This hid an issue with progs/refcounted_kptr.c tests. When __retval tag bug is fixed and refcounted_kptr.c tests are run kernel reports various issues and eventually hangs. Shortest reproducer is the following command run a few times: $ for i in $(seq 1 4); do (./test_progs --allow=refcounted_kptr &); done Commenting out __retval tags for these tests until this issue is resolved. Reported-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/bpf/f4c4aee644425842ee6aa8edf1da68f0a8260e7c.camel@gmail.com/T/ Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230420232317.2181776-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-20 16:49:16 -07:00
Quentin Monnet	4b7ef71ac9	bpftool: Replace "__fallthrough" by a comment to address merge conflict The recent support for inline annotations in control flow graphs generated by bpftool introduced the usage of the "__fallthrough" macro in a switch/case block in btf_dumper.c. This change went through the bpf-next tree, but resulted in a merge conflict in linux-next, because this macro has been renamed "fallthrough" (no underscores) in the meantime. To address the conflict, we temporarily switch to a simple comment instead of a macro. Related: commit `f7a858bffc` ("tools: Rename __fallthrough to fallthrough") Fixes: `9fd496848b` ("bpftool: Support inline annotations when dumping the CFG of a program") Reported-by: Sven Schnelle <svens@linux.ibm.com> Reported-by: Thomas Richter <tmricht@linux.ibm.com> Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/all/yt9dttxlwal7.fsf@linux.ibm.com/ Link: https://lore.kernel.org/bpf/20230412123636.2358949-1-tmricht@linux.ibm.com/ Link: https://lore.kernel.org/bpf/20230420003333.90901-1-quentin@isovalent.com	2023-04-20 16:38:10 -07:00
Jakub Kicinski	681c5b51dc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Adjacent changes: net/mptcp/protocol.h `63740448a3` ("mptcp: fix accept vs worker race") `2a6a870e44` ("mptcp: stops worker on unaccepted sockets at listener close") `ddb1a072f8` ("mptcp: move first subflow allocation at mpc access time") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-20 16:29:51 -07:00
Ian Rogers	9be6ab181b	libperf rc_check: Enable implicitly with sanitizers If using leak sanitizer then implicitly enable reference count checking. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230420171812.561603-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-20 15:12:02 -03:00
Ian Rogers	edd4cab2d4	perf test: Fix maps use after put Fix a use after put reference count issue. maps is copied from leader, but the leader is put on line 79 and then maps is used to read the reference count below - so a use after put, with the put of maps happening within thread__put. Fix by reversing the order of puts so that the leader is put last. To explain the reference count checker, I wrote this up as a little example here: https://perf.wiki.kernel.org/index.php/Reference_Count_Checking Note, the bug was introduced by the committer and wasn't present in the original reference count patch set. Committer notes: Yes, the bug predated your patch and is detected by the reference count checking you contributed. This was just part of splitting up your series into smaller chunks, in this case either we fix the problem detected while developing this reference counting infrastructure before the patch introducing REFCNT_CHECKING or fix it later after the merged infrastructure, when built with EXTRA_CFLAGS="-DREFCNT_CHECKING=1" detects it when running 'perf test', which is what this patch does. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20230420030430.489243-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-20 08:23:52 -03:00
Feng Zhou	5ff54dedf3	selftests/bpf: Add test to access integer type of variable array Add prog test for accessing integer type of variable array in tracing program. In addition, hook load_balance function to access sd->span[0], only to confirm whether the load is successful. Because there is no direct way to trigger load_balance call. Co-developed-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com> Link: https://lore.kernel.org/r/20230420032735.27760-3-zhoufeng.zf@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-19 21:29:39 -07:00
Benjamin Gray	ae7312c090	selftests/powerpc/dscr: Restore timeout to DSCR selftests Reducing the time taken by dscr_sysfs_test.c allows restoring the default timeout, which was removed in commit `850507f30c` ("selftests/powerpc: Turn off timeout setting for benchmarks, dscr, signal, tm") because that test took too long. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230406043320.125138-8-bgray@linux.ibm.com	2023-04-20 13:21:46 +10:00
Benjamin Gray	c14a9d0a79	selftests/powerpc/dscr: Speed up DSCR sysfs tests This test case is extremely slow, taking around a minute compared to most of the other DSCR tests taking a second at most. Perf shows most time is spent by the kernel switching to each CPU it reads in /sys/devices/system/cpu. This switching is an unavoidable consequnce of reading all the .../cpuN/dscr values. Remove the outer iteration loop from this test case, reducing the reads from 1600 to 16. This still updates the DSCR 16 times and verifies on every CPU each time, so I do not expect the lower coverage to be meaningful. The speedup is significant: back down to ~1 second like the other tests. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230406043320.125138-7-bgray@linux.ibm.com	2023-04-20 13:21:46 +10:00
Benjamin Gray	3067b89ab6	selftests/powerpc/dscr: Improve DSCR explicit random test case The tests currently have a single writer thread updating the system DSCR with a 1/1000 chance looped only 100 times. So only around one in 10 runs actually do anything. * Add multiple threads to the dscr_explicit_random_test case. * Use a barrier to make all the threads start work as simultaneously as possible. * Use a rwlock and make all threads have a reasonable chance to write to the DSCR on each iteration. PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP is used to prevent writers from starving while all the other threads keep reading. Logging the reads/writes shows a decent mix across the whole test. * Allow all threads a chance to write. * Make the chance of writing more likely. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230406043320.125138-6-bgray@linux.ibm.com	2023-04-20 13:21:45 +10:00
Benjamin Gray	fda8158870	selftests/powerpc/dscr: Add lockstep test cases to DSCR explicit tests Add new cases to the relevant tests that use explicitly synchronized threads to test the behaviour across context switches with less randomness. By locking the participants to the same CPU we guarantee a context switch occurs each time they make progress, which is a likely failure point if the kernel is not tracking the thread local DSCR correctly. The random case is left in to keep exercising potential edge cases. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230406043320.125138-5-bgray@linux.ibm.com	2023-04-20 13:21:45 +10:00
Benjamin Gray	6ff4dc2548	selftests/powerpc: Allow bind_to_cpu() to automatically pick CPU All current users of bind_to_cpu() don't care _which_ CPU they get, just that they are bound to a single free one. So alter the interface to 1. Accept a BIND_CPU_ANY value that tells it to automatically pick a CPU 2. Return the picked CPU And convert all these users to bind_to_cpu(BIND_CPU_ANY). Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230406043320.125138-4-bgray@linux.ibm.com	2023-04-20 13:21:45 +10:00
Benjamin Gray	c97b2fc662	selftests/powerpc: Move bind_to_cpu() to utils.h This function will be useful in the DSCR test patches later in this series, so promote it to be shared by all powerpc selftests. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230406043320.125138-3-bgray@linux.ibm.com	2023-04-20 13:21:45 +10:00
Benjamin Gray	15f0c2601e	selftests/powerpc/dscr: Correct typos Correct a couple of typos while working on other improvements to the DSCR tests. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230406043320.125138-2-bgray@linux.ibm.com	2023-04-20 13:21:45 +10:00
Nicholas Piggin	4e991e3c16	powerpc: add CFUNC assembly label annotation This macro is to be used in assembly where C functions are called. pcrel addressing mode requires branches to functions with a localentry value of 1 to have either a trailing nop or @notoc. This macro permits the latter without changing callers. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Add dummy definitions to fix selftests build] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230408021752.862660-5-npiggin@gmail.com	2023-04-20 12:54:24 +10:00
Linus Torvalds	cb0856346a	22 hotfixes. 19 are cc:stable and the remainder address issues which were introduced during this merge cycle, or aren't considered suitable for -stable backporting. 19 are for MM and the remainder are for other subsystems. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZEB7GgAKCRDdBJ7gKXxA jl4zAP9LxKisY8L29qrZG/SKoYbMMSM33ASOGZJRAuRRaOYL6QEAvS14pg/c22rL 4GCZbzvENY4xPRbz/6kc/s2Jnuww4wA= =Kh/V -----END PGP SIGNATURE----- Merge tag 'mm-hotfixes-stable-2023-04-19-16-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "22 hotfixes. 19 are cc:stable and the remainder address issues which were introduced during this merge cycle, or aren't considered suitable for -stable backporting. 19 are for MM and the remainder are for other subsystems" * tag 'mm-hotfixes-stable-2023-04-19-16-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits) nilfs2: initialize unused bytes in segment summary blocks mm: page_alloc: skip regions with hugetlbfs pages when allocating 1G pages mm/mmap: regression fix for unmapped_area{_topdown} maple_tree: fix mas_empty_area() search maple_tree: make maple state reusable after mas_empty_area_rev() mm: kmsan: handle alloc failures in kmsan_ioremap_page_range() mm: kmsan: handle alloc failures in kmsan_vmap_pages_range_noflush() tools/Makefile: do missed s/vm/mm/ mm: fix memory leak on mm_init error handling mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock kernel/sys.c: fix and improve control flow in __sys_setres[ug]id() Revert "userfaultfd: don't fail on unrecognized features" writeback, cgroup: fix null-ptr-deref write in bdi_split_work_to_wbs maple_tree: fix a potential memory leak, OOB access, or other unpredictable bug tools/mm/page_owner_sort.c: fix TGID output when cull=tg is used mailmap: update jtoppins' entry to reference correct email mm/mempolicy: fix use-after-free of VMA iterator mm/huge_memory.c: warn with pr_warn_ratelimited instead of VM_WARN_ON_ONCE_FOLIO mm/mprotect: fix do_mprotect_pkey() return on error mm/khugepaged: check again on anon uffd-wp during isolation ...	2023-04-19 17:55:45 -07:00
Arnaldo Carvalho de Melo	265b0de2f0	perf probe: Add missing 0x prefix for addresses printed in hexadecimal To fix this confusing warning: # perf probe -l Failed to find debug information for address 798240 probe_main:prometheus_new_counter__return (on github.com/prometheus/client_golang/prometheus.NewCounter%return in /home/acme/git/prometheus-uprobes/main with counter) # As that 798240 is printed with PRIx64 but has no letters, better print the 0x prefix to disambiguate. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZEBCyFu2GjTw6qOi@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-19 16:38:43 -03:00
Arnaldo Carvalho de Melo	686c511866	perf build: Test the refcnt check build Make sure we test build the currently added REFCNT_CHECKING infrastructure. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-19 13:47:03 -03:00
Ian Rogers	2832ef81d4	perf map: Add reference count checking There's no strict get/put policy with map that leads to leaks or use after free. Reference count checking identifies correct pairing of gets and puts. Committer notes: Extracted from a larger patch removing bits that were covered by the use of pre-existing map__ accessors (e.g. maps__nr_maps()) and new ones added (map__refcnt() and the maps__set_ ones) to reduce RC_CHK_ACCESS(maps)-> source code pollution. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Link: https://lore.kernel.org/lkml/20230407230405.2931830-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-19 12:57:53 -03:00
Arnaldo Carvalho de Melo	e6a9efcee5	perf map: Add set_ methods for map->{start,end,pgoff,pgoff,reloc,erange_warned,dso,map_ip,unmap_ip,priv} To have a way to intercept usage of the reference counted struct map. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-19 12:54:41 -03:00
Arnaldo Carvalho de Melo	e1805aae1e	perf map: Add missing conversions to map__refcnt() Some conversions weren't performed in `4e8db2d752` ("perf map: Add map__refcnt() accessor to use in the maps test"), fix it. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-19 12:33:53 -03:00
Magnus Karlsson	2ddade3229	selftests/xsk: Fix munmap for hugepage allocated umem Fix the unmapping of hugepage allocated umems so that they are properly unmapped. The new test referred to in the fixes label, introduced a test that allocated a umem that is not a multiple of a 2M hugepage size. This is fine for mmap() that rounds the size up the nearest multiple of 2M. But munmap() requires the size to be a multiple of the hugepage size in order for it to unmap the region. The current behaviour of not properly unmapping the umem, was discovered when further additions of tests that require hugepages (unaligned mode tests only) started failing as the system was running out of hugepages. Fixes: `c0801598e5` ("selftests: xsk: Add test UNALIGNED_INV_DESC_4K1_FRAME_SIZE") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230418143617.27762-1-magnus.karlsson@gmail.com	2023-04-19 16:33:53 +02:00
Ian Rogers	8f12692b7e	perf maps: Add reference count checking Add reference count checking to make sure of good use of get and put. Add and use accessors to reduce RC_CHK clutter. The only significant issue was in tests/thread-maps-share.c where reference counts were released in the reverse order to acquisition, leading to a use after put. This was fixed by reversing the put order. Committer notes: Extracted from a larger patch removing bits that were covered by the use of pre-existing maps__ accessors (e.g. maps__nr_maps()) and new ones added (maps__refcnt()) to reduce RC_CHK_ACCESS(maps)-> source code pollution. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Link: https://lore.kernel.org/lkml/20230407230405.2931830-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-19 10:53:01 -03:00
Arnaldo Carvalho de Melo	a07dacad8a	perf maps: Use maps__nr_maps() instead of open coded maps->nr_maps To use the existing accessor and be consistent. Signef-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-19 10:52:54 -03:00
Arnaldo Carvalho de Melo	fe693d951e	perf maps: Add maps__refcnt() accessor to allow checking maps pointer To remove one more direct access to 'struct maps' so that we can intercept accesses to its instantiations and refcount check it to catch use after free, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-19 10:52:54 -03:00
Arnaldo Carvalho de Melo	3ad1be6fae	perf dso: Fix use before NULL check introduced by map__dso() introduction James Clark noticed that the recent `63df0e4bc3` ("perf map: Add accessor for dso") patch accessed map->dso before the 'map' variable was NULL checked, which is a change in logic that leads to segmentation faults, so comb thru that patch to fix similar cases. Fixes: `63df0e4bc3` ("perf map: Add accessor for dso") Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/lkml/ZD68RYCVT8hqPuxr@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-19 10:51:48 -03:00
Tiezhu Yang	b5533e990d	tools/loongarch: Use __SIZEOF_LONG__ to define __BITS_PER_LONG Although __SIZEOF_POINTER__ is equal to _SIZEOF_LONG__ on LoongArch, it is better to use __SIZEOF_LONG__ to define __BITS_PER_LONG to keep consistent between arch/loongarch/include/uapi/asm/bitsperlong.h and tools/arch/loongarch/include/uapi/asm/bitsperlong.h. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2023-04-19 12:07:34 +08:00
Linus Torvalds	427fda2c8a	x86: improve on the non-rep 'copy_user' function The old 'copy_user_generic_unrolled' function was oddly implemented for largely historical reasons: it had been largely based on the uncached copy case, which has some other concerns. For example, the __copy_user_nocache() function uses 'movnti' for the destination stores, and those want the destination to be aligned. In contrast, the regular copy function doesn't really care, and trying to align things only complicates matters. Also, like the clear_user function, the copy function had some odd handling of the repeat counts, complicating the exception handling for no really good reason. So as with clear_user, just write it to keep all the byte counts in the %rcx register, exactly like the 'rep movs' functionality that this replaces. Unlike a real 'rep movs', we do allow for this to trash a few temporary registers to not have to unnecessarily save/restore registers on the stack. And like the clearing case, rename this to what it now clearly is: 'rep_movs_alternative', and make it one coherent function, so that it shows up as such in profiles (instead of the odd split between "copy_user_generic_unrolled" and "copy_user_short_string", the latter of which was not about strings at all, and which was shared with the uncached case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2023-04-18 17:05:28 -07:00
Linus Torvalds	8c9b6a88b7	x86: improve on the non-rep 'clear_user' function The old version was oddly written to have the repeat count in multiple registers. So instead of taking advantage of %rax being zero, it had some sub-counts in it. All just for a "single word clearing" loop, which isn't even efficient to begin with. So get rid of those games, and just keep all the state in the same registers we got it in (and that we should return things in). That not only makes this act much more like 'rep stos' (which this function is replacing), but makes it much easier to actually do the obvious loop unrolling. Also rename the function from the now nonsensical 'clear_user_original' to what it now clearly is: 'rep_stos_alternative'. End result: if we don't have a fast 'rep stosb', at least we can have a fast fallback for it. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2023-04-18 17:05:28 -07:00
Linus Torvalds	577e6a7fd5	x86: inline the 'rep movs' in user copies for the FSRM case This does the same thing for the user copies as commit `0db7058e8e` ("x86/clear_user: Make it faster") did for clear_user(). In other words, it inlines the "rep movs" case when X86_FEATURE_FSRM is set, avoiding the function call entirely. In order to do that, it makes the calling convention for the out-of-line case ("copy_user_generic_unrolled") match the 'rep movs' calling convention, although it does also end up clobbering a number of additional registers. Also, to simplify code sharing in the low-level assembly with the __copy_user_nocache() function (that uses the normal C calling convention), we end up with a kind of mixed return value for the low-level asm code: it will return the result in both %rcx (to work as an alternative for the 'rep movs' case), _and_ in %rax (for the nocache case). We could avoid this by wrapping __copy_user_nocache() callers in an inline asm, but since the cost is just an extra register copy, it's probably not worth it. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2023-04-18 17:05:28 -07:00
Linus Torvalds	3639a53558	x86: move stac/clac from user copy routines into callers This is preparatory work for inlining the 'rep movs' case, but also a cleanup. The __copy_user_nocache() function was mis-used by the rdma code to do uncached kernel copies that don't actually want user copies at all, and as a result doesn't want the stac/clac either. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2023-04-18 17:05:28 -07:00
Linus Torvalds	d2c95f9d68	x86: don't use REP_GOOD or ERMS for user memory clearing The modern target to use is FSRS (Fast Short REP STOS), and the other cases should only be used for bigger areas (ie mainly things like page clearing). Note! This changes the conditional for the inlining from FSRM ("fast short rep movs") to FSRS ("fast short rep stos"). We'll have a separate fixup for AMD microarchitectures that have a good 'rep stosb' yet do not set the new Intel-specific FSRS bit (because FSRM was there first). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2023-04-18 17:05:28 -07:00
Jeff Xu	3cc0c3738c	selftests/memfd: fix test_sysctl sysctl memfd_noexec is pid-namespaced, non-reservable, and inherent to the child process. Move the inherence test from init ns to child ns, so init ns can keep the default value. Link: https://lkml.kernel.org/r/20230414022801.2545257-1-jeffxu@google.com Signed-off-by: Jeff Xu <jeffxu@google.com> Reported-by: kernel test robot <yujie.liu@intel.com> Link: https://lore.kernel.org/oe-lkp/202303312259.441e35db-yujie.liu@intel.com Tested-by: Yujie Liu <yujie.liu@intel.com> Cc: Daniel Verkamp <dverkamp@chromium.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Jorge Lucangeli Obes <jorgelo@chromium.org> Cc: Kees Cook <keescook@chromium.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:53:52 -07:00
Chaitanya S Prakash	c025da0f14	selftests/mm: run hugetlb testcases of va switch The va_high_addr_switch selftest is used to test mmap across 128TB boundary. It divides the selftest cases into two main categories on the basis of size. One set is used to create mappings that are multiples of PAGE_SIZE while the other creates mappings that are multiples of HUGETLB_SIZE. In order to run the hugetlb testcases the binary must be appended with "--run-hugetlb" but the file that used to run the test only invokes the binary, thereby completely skipping the hugetlb testcases. Hence, the required statement has been added. Link: https://lkml.kernel.org/r/20230323105243.2807166-6-chaitanyas.prakash@arm.com Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:53:52 -07:00
Chaitanya S Prakash	2f489e2e69	selftests/mm: configure nr_hugepages for arm64 Arm64 has a default hugepage size of 512MB when CONFIG_ARM64_64K_PAGES=y is enabled. While testing on arm64 platforms having up to 4PB of virtual address space, a minimum of 6 hugepages were required for all test cases to pass. Support for this requirement has been added. Link: https://lkml.kernel.org/r/20230323105243.2807166-5-chaitanyas.prakash@arm.com Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:53:51 -07:00
Chaitanya S Prakash	c2af2a4190	selftests/mm: add platform independent in code comments The in code comments for the selftest were made on the basis of 128TB switch, an architecture feature specific to PowerPc and x86 platforms. Keeping in mind the support added for arm64 platforms which implements a 256TB switch, a more generic explanation has been provided. Link: https://lkml.kernel.org/r/20230323105243.2807166-4-chaitanyas.prakash@arm.com Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:53:51 -07:00
Chaitanya S Prakash	bbe168729d	selftests/mm: rename va_128TBswitch to va_high_addr_switch As the initial selftest only took into consideration PowperPC and x86 architectures, on adding support for arm64, a platform independent naming convention is chosen. Link: https://lkml.kernel.org/r/20230323105243.2807166-3-chaitanyas.prakash@arm.com Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:53:51 -07:00
Chaitanya S Prakash	cd834afa8e	selftests/mm: add support for arm64 platform on va switch Patch series "selftests/mm: Implement support for arm64 on va". The va_128TBswitch selftest is designed and implemented for PowerPC and x86 architectures which support a 128TB switch, up to 256TB of virtual address space and hugepage sizes of 16MB and 2MB respectively. Arm64 platforms on the other hand support a 256Tb switch, up to 4PB of virtual address space and a default hugepage size of 512MB when 64k pagesize is enabled. These architectural differences require introducing support for arm64 platforms, after which a more generic naming convention is suggested. The in code comments are amended to provide a more platform independent explanation of the working of the code and nr_hugepages are configured as required. Finally, the file running the testcase is modified in order to prevent skipping of hugetlb testcases of va_high_addr_switch. This patch (of 5): Arm64 platforms have the ability to support 64kb pagesize, 512MB default hugepage size and up to 4PB of virtual address space. The address switch occurs at 256TB as opposed to 128TB. Hence, the necessary support has been added. Link: https://lkml.kernel.org/r/20230323105243.2807166-1-chaitanyas.prakash@arm.com Link: https://lkml.kernel.org/r/20230323105243.2807166-2-chaitanyas.prakash@arm.com Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:53:51 -07:00
Yang Yang	a3b2aeac9d	delayacct: track delays from IRQ/SOFTIRQ Delay accounting does not track the delay of IRQ/SOFTIRQ. While IRQ/SOFTIRQ could have obvious impact on some workloads productivity, such as when workloads are running on system which is busy handling network IRQ/SOFTIRQ. Get the delay of IRQ/SOFTIRQ could help users to reduce such delay. Such as setting interrupt affinity or task affinity, using kernel thread for NAPI etc. This is inspired by "sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ pressure"[1]. Also fix some code indent problems of older code. And update tools/accounting/getdelays.c: / # ./getdelays -p 156 -di print delayacct stats ON printing IO accounting PID 156 CPU count real total virtual total delay total delay average 15 15836008 16218149 275700790 18.380ms IO count delay total delay average 0 0 0.000ms SWAP count delay total delay average 0 0 0.000ms RECLAIM count delay total delay average 0 0 0.000ms THRASHING count delay total delay average 0 0 0.000ms COMPACT count delay total delay average 0 0 0.000ms WPCOPY count delay total delay average 36 7586118 0.211ms IRQ count delay total delay average 42 929161 0.022ms [1] commit 52b1364ba0b1("sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ pressure") Link: https://lkml.kernel.org/r/202304081728353557233@zte.com.cn Signed-off-by: Yang Yang <yang.yang29@zte.com.cn> Cc: Jiang Xuexin <jiang.xuexin@zte.com.cn> Cc: wangyong <wang.yong12@zte.com.cn> Cc: junhua huang <huang.junhua@zte.com.cn> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:39:34 -07:00
Peter Xu	43759d44dc	selftests/mm: add uffdio register ioctls test This new test tests against the returned ioctls from UFFDIO_REGISTER, where put into uffdio_register.ioctls. This also tests the expected failure cases of UFFDIO_REGISTER, aka: - Register with empty mode should fail with -EINVAL - Register minor without page cache (anon) should fail with -EINVAL Link: https://lkml.kernel.org/r/20230412164548.329376-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:08 -07:00
Peter Xu	5aec236fdd	selftests/mm: add shmem-private test to uffd-stress The userfaultfd stress test never tested private shmem, which I think was overlooked long due. Add it so it matches with uffd unit test and it'll cover all memory supported with the three memory types. Meanwhile, rename the memory types a bit. Considering shared mem is the major use case for both shmem / hugetlbfs, changing from: anon, hugetlb, hugetlb_shared, shmem To (with shmem-private added): anon, hugetlb, hugetlb-private, shmem, shmem-private Add the shmem-private to run_vmtests.sh too. Link: https://lkml.kernel.org/r/20230412164546.329355-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:08 -07:00
Peter Xu	111fd29b2a	selftests/mm: drop sys/dev test in uffd-stress test With the new uffd unit test covering the /dev/userfaultfd path and syscall path of uffd initializations, we can safely drop the devnode test in the old stress test. One thing is to avoid duplication of running the stress test twice which is an overkill to only test the /dev/ interface in run_vmtests.sh. The other benefit is now all uffd tests (that uses userfaultfd_open) can run automatically as long as any type of interface is enabled (either syscall or dev), so it's more likely to succeed rather than fail due to unprivilege. With this patch lands, we can drop all the "mem_type:XXX" handlings too. Link: https://lkml.kernel.org/r/20230412164525.329176-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:08 -07:00
Peter Xu	f9da24263d	selftests/mm: allow uffd test to skip properly with no privilege Allow skip a unit test properly due to no privilege (e.g. sigbus and events tests). [colin.i.king@gmail.com: fix spelling mistake "priviledge" -> "privilege"] Link: https://lkml.kernel.org/r/20230414081506.1678998-1-colin.i.king@gmail.com Link: https://lkml.kernel.org/r/20230412164520.329163-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:08 -07:00
Peter Xu	4df9cefa94	selftests/mm: workaround no way to detect uffd-minor + wp Userfaultfd minor+wp mode was very recently added. The test will fail on the old kernels at ioctl(UFFDIO_CONTINUE) which is misterious. Unfortunately there's no feature bit to detect for this support. Add a hack to leverage WP_UNPOPULATED to detect whether that feature existed, since WP_UNPOPULATED was merged right after minor+wp. Link: https://lkml.kernel.org/r/20230412164517.329152-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:07 -07:00
Peter Xu	c3315502c9	selftests/mm: move zeropage test into uffd unit tests Simplifies it a bit along the way, e.g., drop the never used offset field (which was always the 1st page so offset=0). Introduce uffd_register_with_ioctls() out of uffd_register() to detect uffdio_register.ioctls got returned. Check that automatically when testing UFFDIO_ZEROPAGE on different types of memory (and kernel). Link: https://lkml.kernel.org/r/20230412164404.328815-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:07 -07:00
Peter Xu	73c1ea939b	selftests/mm: move uffd sig/events tests into uffd unit tests Move the two tests into the unit test, and convert it into 20 standalone tests: - events test on all 5 mem types, with wp on/off - signal test on all 5 mem types, with wp on/off Testing sigbus on anon... done Testing sigbus on shmem... done Testing sigbus on shmem-private... done Testing sigbus on hugetlb... done Testing sigbus on hugetlb-private... done Testing sigbus-wp on anon... done Testing sigbus-wp on shmem... done Testing sigbus-wp on shmem-private... done Testing sigbus-wp on hugetlb... done Testing sigbus-wp on hugetlb-private... done Testing events on anon... done Testing events on shmem... done Testing events on shmem-private... done Testing events on hugetlb... done Testing events on hugetlb-private... done Testing events-wp on anon... done Testing events-wp on shmem... done Testing events-wp on shmem-private... done Testing events-wp on hugetlb... done Testing events-wp on hugetlb-private... done It'll also remove a lot of global references along the way, e.g. test_uffdio_wp will be replaced with the wp value passed over. Link: https://lkml.kernel.org/r/20230412164400.328798-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:07 -07:00
Peter Xu	62515b5f9f	selftests/mm: move uffd minor test to unit test This moves the minor test to the new unit test. Rewrite the content check with char* opeartions to avoid fiddling with my_bcmp(). Drop global vars test_uffdio_minor and test_collapse, just assume test them always in common code for now. OTOH make this single test into five tests: - minor test on [shmem, hugetlb] with wp=false - minor test on [shmem, hugetlb] with wp=true - minor test + collapse on shmem only One thing to mention that we used to test COLLAPSE+WP but that doesn't sound right at all. It's possible it's silently broken but unnoticed because COLLAPSE is not part of the default test suite. Make the MADV_COLLAPSE test fail-able (by skip it when failing), because it's not guaranteed to success anyway. Drop a bunch of useless code after the move, because the unit test always use aligned num of pages and has nothing to do with n_cpus. Link: https://lkml.kernel.org/r/20230412164357.328779-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Zach O'Keefe <zokeefe@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:07 -07:00
Peter Xu	8bda424fca	selftests/mm: move uffd pagemap test to unit test Move it over and make it split into two tests, one for pagemap and one for the new WP_UNPOPULATED (to be a separate one). The thp pagemap test wasn't really working (with MADV_HUGEPAGE). Let's just drop it (since it never really worked anyway..) and leave that for later. Link: https://lkml.kernel.org/r/20230412164352.328733-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:07 -07:00
Peter Xu	16a45b57cb	selftests/mm: add framework for uffd-unit-test Add a framework to be prepared to move unit tests from uffd-stress.c into uffd-unit-tests.c. The goal is to allow detection of uffd features for each test, and also loop over specified types of memory that a test support. Link: https://lkml.kernel.org/r/20230412164348.328710-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:06 -07:00
Peter Xu	be39fec4f9	selftests/mm: allow allocate_area() to fail properly Mostly to detect hugetlb allocation errors and skip hugetlb tests when pages are not allocated. Link: https://lkml.kernel.org/r/20230412164345.328659-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:06 -07:00
Peter Xu	0210c43ef6	selftests/mm: let uffd_handle_page_fault() take wp parameter Make the handler optionally apply WP bit when resolving page faults for either missing or minor page faults. This moves towards removing global test_uffdio_wp outside of the common code. Link: https://lkml.kernel.org/r/20230412164341.328618-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:06 -07:00
Peter Xu	508340845d	selftests/mm: rename uffd_stats to uffd_args Prepare for adding more fields into the struct. Link: https://lkml.kernel.org/r/20230412164337.328607-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Suggested-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:06 -07:00
Peter Xu	265818ef98	selftests/mm: drop global hpage_size in uffd tests hpage_size was wrongly used. Sometimes it means hugetlb default size, sometimes it was used as thp size. Remove the global variable and use the right one at each place. Link: https://lkml.kernel.org/r/20230412164333.328596-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:06 -07:00
Peter Xu	c5cb903646	selftests/mm: drop global mem_fd in uffd tests Drop it by creating the memfd dynamically in the tests. Link: https://lkml.kernel.org/r/20230412164331.328584-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:05 -07:00
Peter Xu	d5433ce84d	selftests/mm: UFFDIO_API test Add one simple test for UFFDIO_API. With that, I also added a bunch of small but handy helpers along the way. Link: https://lkml.kernel.org/r/20230412164257.328375-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:05 -07:00
Peter Xu	78391f6460	selftests/mm: uffd_open_{dev\|sys}() Provide two helpers to open an uffd handle. Drop the error checks around SKIPs because it's inside an errexit() anyway, which IMHO doesn't really help much if the test will not continue. Link: https://lkml.kernel.org/r/20230412164254.328335-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Reviewed-by: Axel Rasmussen <axelrasmussen@google.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:05 -07:00
Peter Xu	c4277cb6c8	selftests/mm: uffd_[un]register() Add two helpers to register/unregister to an uffd. Use them to drop duplicate codes. This patch also drops assert_expected_ioctls_present() and get_expected_ioctls(). Reasons: - It'll need a lot of effort to pass test_type==HUGETLB into it from the upper, so it's the simplest way to get rid of another global var - The ioctls returned in UFFDIO_REGISTER is hardly useful at all, because any app can already detect kernel support on any ioctl via its corresponding UFFD_FEATURE_*. The check here is for sanity mostly but it's probably destined no user app will even use it. - It's not friendly to one future goal of uffd to run on old kernels, the problem is get_expected_ioctls() compiles against UFFD_API_RANGE_IOCTLS, which is a value that can change depending on where the test is compiled, rather than reflecting what the kernel underneath has. It means it'll report false negatives on old kernels so it's against our will. So let's make our lives easier. [peterx@redhat.com; tools/testing/selftests/mm/hugepage-mremap.c: add headers] Link: https://lkml.kernel.org/r/ZDxrvZh/cw357D8P@x1n Link: https://lkml.kernel.org/r/20230412164247.328293-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:05 -07:00
Peter Xu	686a8bb723	selftests/mm: split uffd tests into uffd-stress and uffd-unit-tests In many ways it's weird and unwanted to keep all the tests in the same userfaultfd.c at least when still in the current way. For example, it doesn't make much sense to run the stress test for each method we can create an userfaultfd handle (either via syscall or /dev/ node). It's a waste of time running this twice for the whole stress as the stress paths are the same, only the open path is different. It's also just weird to need to manually specify different types of memory to run all unit tests for the userfaultfd interface. We should be able to just run a single program and that should go through all functional uffd tests without running the stress test at all. The stress test was more for torturing and finding race conditions. We don't want to wait for stress to finish just to regress test a functional test. When we start to pile up more things on top of the same file and same functions, things start to go a bit chaos and the code is just harder to maintain too with tons of global variables. This patch creates a new test uffd-unit-tests to keep userfaultfd unit tests in the future, currently empty. Meanwhile rename the old userfaultfd.c test to uffd-stress.c. Link: https://lkml.kernel.org/r/20230412164244.328270-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Reviewed-by: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:04 -07:00
Peter Xu	33be4e8928	selftests/mm: create uffd-common.[ch] Move common utility functions into uffd-common.[ch] files from the original userfaultfd.c. This prepares for a split of userfaultfd.c into two tests: one to only cover the old but powerful stress test, the other one covers all the functional tests. This movement is kind of a brute-force effort for now, with light touch-ups but nothing should really change. There's chances to optimize more, but let's leave that for later. Link: https://lkml.kernel.org/r/20230412164241.328259-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Reviewed-by: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:04 -07:00
Peter Xu	618aeb5d62	selftests/mm: drop test_uffdio_zeropage_eexist The idea was trying to flip this var in the alarm handler from time to time to test -EEXIST of UFFDIO_ZEROPAGE, but firstly it's only used in the zeropage test so probably only used once, meanwhile we passed "retry==false" so it'll never got tested anyway. Drop both sides so we always test UFFDIO_ZEROPAGE retries if has_zeropage is set (!hugetlb). One more thing to do is doing UFFDIO_REGISTER for the alias buffer too, because otherwise the test won't even pass! We were just lucky that this test never really got ran at all. Link: https://lkml.kernel.org/r/20230412164238.328238-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:04 -07:00
Peter Xu	4af9ff2981	selftests/mm: test UFFDIO_ZEROPAGE only when !hugetlb Make the check as simple as "test_type == TEST_HUGETLB" because that's the only mem that doesn't support ZEROPAGE. Link: https://lkml.kernel.org/r/20230412164234.328168-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:04 -07:00
Peter Xu	366e93c465	selftests/mm: reuse pagemap_get_entry() in vm_util.h Meanwhile drop pagemap_read_vaddr(). Link: https://lkml.kernel.org/r/20230412164231.328157-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Axel Rasmussen <axelrasmussen@google.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:04 -07:00
Peter Xu	9f74696bd2	selftests/mm: use PM_* macros in vm_utils.h We've got the macros in uffd-stress.c, move it over and use it in vm_util.h. Link: https://lkml.kernel.org/r/20230412164227.328145-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Axel Rasmussen <axelrasmussen@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:03 -07:00
Peter Xu	bd4d67e76f	selftests/mm: merge default_huge_page_size() into one There're already 3 same definitions of the three functions. Move it into vm_util.[ch]. Link: https://lkml.kernel.org/r/20230412164223.328134-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Axel Rasmussen <axelrasmussen@google.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:03 -07:00
Peter Xu	4b54f5a758	selftests/mm: link vm_util.c always We do have plenty of files that want to link against vm_util.c. Just make it simple by linking it always. Link: https://lkml.kernel.org/r/20230412164220.328123-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:03 -07:00
Peter Xu	aef6fde75d	selftests/mm: use TEST_GEN_PROGS where proper TEST_GEN_PROGS and TEST_GEN_FILES are used randomly in the mm/Makefile to specify programs that need to build. Logically all these binaries should all fall into TEST_GEN_PROGS. Replace those TEST_GEN_FILES with TEST_GEN_PROGS, so that we can reference all the tests easily later. [peterx@redhat.com: tools/testing/selftests/mm/Makefile: don't wipe out TEST_GEN_PROGS] Link: https://lkml.kernel.org/r/ZDxrvZh/cw357D8P@x1n Link: https://lkml.kernel.org/r/20230412164218.328104-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:03 -07:00
Peter Xu	af605d26a8	selftests/mm: merge util.h into vm_util.h There're two util headers under mm/ kselftest. Merge one with another. It turns out util.h is the easy one to move. When merging, drop PAGE_SIZE / PAGE_SHIFT because they're unnecessary wrappers to page_size() / page_shift(), meanwhile rename them to psize() and pshift() so as to not conflict with some existing definitions in some test files that includes vm_util.h. Link: https://lkml.kernel.org/r/20230412164120.327731-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Axel Rasmussen <axelrasmussen@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:02 -07:00
Peter Xu	c7c55fc4e3	selftests/mm: dump a summary in run_vmtests.sh Dump a summary after running whatever test specified. Useful for human runners to identify any kind of failures (besides exit code). Link: https://lkml.kernel.org/r/20230412164117.327720-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Axel Rasmussen <axelrasmussen@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:02 -07:00
Peter Xu	c14ef37871	selftests/mm: update .gitignore with two missing tests Patch series "selftests/mm: Split / Refactor userfault test", v2. This patchset splits userfaultfd.c into two tests: - uffd-stress: the "vanilla", old and powerful stress test - uffd-unit-tests: all the unit tests will be moved here This is on my todo list for a long time but I never did it for real. The uffd test is growing into a small and cute monster. I start to notice it's going harder to maintain such a test and make it useful. A few issues I found when looking at userfaultfd test: - We have a bunch of unit tests in userfaultfd.c, but they always need to be run only after a stress type. No way to not do it. - We can only run an unit test for one memory type only, if we want to do a quick smoke test to check regressions, there's no good way. The best to come currently is "bash ./run_vmtests.sh -t userfaultfd" thanks to the most recent changes to run_vmtests.sh on tagging. Still, that needs to run the stress tests always and hard to see what's wrong. - It's hard to add a new unit test to userfaultfd.c, we don't really know what's happening, not until we mostly read the whole file. - We did a bunch of useless tests, e.g. we run twice the whole suite of stress test just to verify both syscall and /dev/userfaultfd. They're all using userfaultfd_new() to create the handle, everything should really be the same underneath. One simple unit test should cover that! - We have tens of global variables in one file but shared with all the tests. Some of them are not suitable to be a global var from maintainance pov. It enforces every unit test to consider how these vars affects the stress test and vice versa, but that's logically not necessary. - Userfaultfd test is not friendly to old kernels. Mostly it only works on the latest kernel tree. It's preferrable to be run on all kernels and properly report what's missing. I'll stop here, I feel like I can still list some.. This patchset should resolve all issues above, and actually we can do even more on top. I stopped doing that until I found I already got 29 patches and 2000+ LOC changes. That's already a patchset terrible enough so we should move in small steps. After the whole set applied, "./run_vmtests.sh -t userfaultfd" looks like this: ===8<=== vm.nr_hugepages = 1024 ------------------------- running ./uffd-unit-tests ------------------------- Testing UFFDIO_API (with syscall)... done Testing UFFDIO_API (with /dev/userfaultfd)... done Testing register-ioctls on anon... done Testing register-ioctls on shmem... done Testing register-ioctls on shmem-private... done Testing register-ioctls on hugetlb... done Testing register-ioctls on hugetlb-private... done Testing zeropage on anon... done Testing zeropage on shmem... done Testing zeropage on shmem-private... done Testing zeropage on hugetlb... done Testing zeropage on hugetlb-private... done Testing pagemap on anon... done Testing wp-unpopulated on anon... done Testing minor on shmem... done Testing minor on hugetlb... done Testing minor-wp on shmem... done Testing minor-wp on hugetlb... done Testing minor-collapse on shmem... done Testing sigbus on anon... done Testing sigbus on shmem... done Testing sigbus on shmem-private... done Testing sigbus on hugetlb... done Testing sigbus on hugetlb-private... done Testing sigbus-wp on anon... done Testing sigbus-wp on shmem... done Testing sigbus-wp on shmem-private... done Testing sigbus-wp on hugetlb... done Testing sigbus-wp on hugetlb-private... done Testing events on anon... done Testing events on shmem... done Testing events on shmem-private... done Testing events on hugetlb... done Testing events on hugetlb-private... done Testing events-wp on anon... done Testing events-wp on shmem... done Testing events-wp on shmem-private... done Testing events-wp on hugetlb... done Testing events-wp on hugetlb-private... done Userfaults unit tests: pass=39, skip=0, fail=0 (total=39) [PASS] -------------------------------- running ./uffd-stress anon 20 16 -------------------------------- nr_pages: 5120, nr_pages_per_cpu: 640 bounces: 15, mode: rnd racing ver poll, userfaults: 345 missing (26+48+61+102+30+12+59+7) 1596 wp (120+139+317+346+215+67+306+86) [...] [PASS] ------------------------------------ running ./uffd-stress hugetlb 128 32 ------------------------------------ nr_pages: 64, nr_pages_per_cpu: 8 bounces: 31, mode: rnd racing ver poll, userfaults: 29 missing (6+6+6+5+4+2+0+0) 104 wp (20+19+22+18+7+12+5+1) [...] [PASS] -------------------------------------------- running ./uffd-stress hugetlb-private 128 32 -------------------------------------------- nr_pages: 64, nr_pages_per_cpu: 8 bounces: 31, mode: rnd racing ver poll, userfaults: 33 missing (12+9+7+0+5+0+0+0) 111 wp (24+25+14+14+11+17+5+1) [...] [PASS] --------------------------------- running ./uffd-stress shmem 20 16 --------------------------------- nr_pages: 5120, nr_pages_per_cpu: 640 bounces: 15, mode: rnd racing ver poll, userfaults: 247 missing (15+17+34+60+81+37+3+0) 2038 wp (180+114+276+400+381+318+165+204) [...] [PASS] ----------------------------------------- running ./uffd-stress shmem-private 20 16 ----------------------------------------- nr_pages: 5120, nr_pages_per_cpu: 640 bounces: 15, mode: rnd racing ver poll, userfaults: 235 missing (52+29+55+56+13+9+16+5) 2849 wp (218+406+461+531+328+284+430+191) [...] [PASS] SUMMARY: PASS=6 SKIP=0 FAIL=0 ===8<=== The output may be different if we miss some features (e.g., hugetlb not allocated, old kernel, less privilege of uffd handle), but they should show up with good reasons. E.g., I tried to run the unit test on my Fedora kernel and it gives me: ===8<=== UFFDIO_API (with syscall)... failed [reason: UFFDIO_API should fail with wrong api but didn't] UFFDIO_API (with /dev/userfaultfd)... skipped [reason: cannot open userfaultfd handle] zeropage on anon... done zeropage on shmem... done zeropage on shmem-private... done zeropage-hugetlb on hugetlb... done zeropage-hugetlb on hugetlb-private... done pagemap on anon... pagemap on anon... pagemap on anon... done wp-unpopulated on anon... skipped [reason: feature missing] minor on shmem... done minor on hugetlb... done minor-wp on shmem... skipped [reason: feature missing] minor-wp on hugetlb... skipped [reason: feature missing] minor-collapse on shmem... done sigbus on anon... skipped [reason: possible lack of priviledge] sigbus on shmem... skipped [reason: possible lack of priviledge] sigbus on shmem-private... skipped [reason: possible lack of priviledge] sigbus on hugetlb... skipped [reason: possible lack of priviledge] sigbus on hugetlb-private... skipped [reason: possible lack of priviledge] sigbus-wp on anon... skipped [reason: possible lack of priviledge] sigbus-wp on shmem... skipped [reason: possible lack of priviledge] sigbus-wp on shmem-private... skipped [reason: possible lack of priviledge] sigbus-wp on hugetlb... skipped [reason: possible lack of priviledge] sigbus-wp on hugetlb-private... skipped [reason: possible lack of priviledge] events on anon... skipped [reason: possible lack of priviledge] events on shmem... skipped [reason: possible lack of priviledge] events on shmem-private... skipped [reason: possible lack of priviledge] events on hugetlb... skipped [reason: possible lack of priviledge] events on hugetlb-private... skipped [reason: possible lack of priviledge] events-wp on anon... skipped [reason: possible lack of priviledge] events-wp on shmem... skipped [reason: possible lack of priviledge] events-wp on shmem-private... skipped [reason: possible lack of priviledge] events-wp on hugetlb... skipped [reason: possible lack of priviledge] events-wp on hugetlb-private... skipped [reason: possible lack of priviledge] Userfaults unit tests: pass=9, skip=24, fail=1 (total=34) ===8<=== Patch layout: - Revert "userfaultfd: don't fail on unrecognized features" Something I found when I got the UFFDIO_API test below. Axel, I still propose to revert it as a whole, but feel free to continue the discussion from the original patch thread. - selftests/mm: Update .gitignore with two missing tests - selftests/mm: Dump a summary in run_vmtests.sh - selftests/mm: Merge util.h into vm_util.h - selftests/mm: Use TEST_GEN_PROGS where proper - selftests/mm: Link vm_util.c always - selftests/mm: Merge default_huge_page_size() into one - selftests/mm: Use PM_* macros in vm_utils.h - selftests/mm: Reuse pagemap_get_entry() in vm_util.h - selftests/mm: Test UFFDIO_ZEROPAGE only when !hugetlb - selftests/mm: Drop test_uffdio_zeropage_eexist Until here, all cleanups here and there. I wanted to keep going, but I found that maybe it'll take a few more days to split the test. Hence I did a split starting from the next one, so we have a working thing first. - selftests/mm: Create uffd-common.[ch] - selftests/mm: Split uffd tests into uffd-stress and uffd-unit-tests This did the major brute force split of common codes into uffd-common.[ch]. That'll be the so far common base for stress and unit tests. Then a new unit test is created. - selftests/mm: uffd_[un]register() - selftests/mm: uffd_open_{dev\|sys}() - selftests/mm: UFFDIO_API test This patch hides here to start writting the 1st unit test with UFFDIO_API, also detection of userfaultfd privileges. - selftests/mm: Drop global mem_fd in uffd tests - selftests/mm: Drop global hpage_size in uffd tests - selftests/mm: Rename uffd_stats to uffd_args - selftests/mm: Let uffd_handle_page_fault() takes wp parameter - selftests/mm: Allow allocate_area() to fail properly Some further cleanup that I noticed otherwise hard to move the tests. - selftests/mm: Add framework for uffd-unit-test The major patch provides the framework for most of the rest unit tests. - selftests/mm: Move uffd pagemap test to unit test - selftests/mm: Move uffd minor test to unit test - selftests/mm: Move uffd sig/events tests into uffd unit tests - selftests/mm: Move zeropage test into uffd unit tests Move unit tests and suite them into the new file. - selftests/mm: Workaround no way to detect uffd-minor + wp - selftests/mm: Allow uffd test to skip properly with no privilege - selftests/mm: Drop sys/dev test in uffd-stress test - selftests/mm: Add shmem-private test to uffd-stress A bunch of changes to do better on error reportings, and add shmem-private to the stress test which was long missing. - selftests/mm: Add uffdio register ioctls test One more patch to test uffdio_register.ioctls. This patch (of 30): Update .gitignore with two missing tests. Link: https://lkml.kernel.org/r/20230412163922.327282-1-peterx@redhat.com Link: https://lkml.kernel.org/r/20230412164114.327709-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Zach O'Keefe <zokeefe@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:02 -07:00
David Hildenbrand	9eac40fc0c	selftests/mm: mkdirty: test behavior of (pte\|pmd)_mkdirty on VMAs without write permissions Let's add some tests that trigger (pte\|pmd)_mkdirty on VMAs without write permissions. If an architecture implementation is wrong, we might accidentally set the PTE/PMD writable and allow for write access in a VMA without write permissions. The tests include reproducers for the two issues recently discovered and worked-around in core-MM for now: (1) commit `624a2c94f5` ("Partly revert "mm/thp: carry over dirty bit when thp splits on pmd"") (2) commit `96a9c287e2` ("mm/migrate: fix wrongly apply write bit after mkdirty on sparc64") In addition, some other tests that reveal further issues. All tests pass under x86_64: ./mkdirty # [INFO] detected THP size: 2048 KiB TAP version 13 1..6 # [INFO] PTRACE write access ok 1 SIGSEGV generated, page not modified # [INFO] PTRACE write access to THP ok 2 SIGSEGV generated, page not modified # [INFO] Page migration ok 3 SIGSEGV generated, page not modified # [INFO] Page migration of THP ok 4 SIGSEGV generated, page not modified # [INFO] PTE-mapping a THP ok 5 SIGSEGV generated, page not modified # [INFO] UFFDIO_COPY ok 6 SIGSEGV generated, page not modified # Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0 But some fail on sparc64: ./mkdirty # [INFO] detected THP size: 8192 KiB TAP version 13 1..6 # [INFO] PTRACE write access not ok 1 SIGSEGV generated, page not modified # [INFO] PTRACE write access to THP not ok 2 SIGSEGV generated, page not modified # [INFO] Page migration ok 3 SIGSEGV generated, page not modified # [INFO] Page migration of THP ok 4 SIGSEGV generated, page not modified # [INFO] PTE-mapping a THP ok 5 SIGSEGV generated, page not modified # [INFO] UFFDIO_COPY not ok 6 SIGSEGV generated, page not modified Bail out! 3 out of 6 tests failed # Totals: pass:3 fail:3 xfail:0 xpass:0 skip:0 error:0 Reverting both above commits makes all tests fail on sparc64: ./mkdirty # [INFO] detected THP size: 8192 KiB TAP version 13 1..6 # [INFO] PTRACE write access not ok 1 SIGSEGV generated, page not modified # [INFO] PTRACE write access to THP not ok 2 SIGSEGV generated, page not modified # [INFO] Page migration not ok 3 SIGSEGV generated, page not modified # [INFO] Page migration of THP not ok 4 SIGSEGV generated, page not modified # [INFO] PTE-mapping a THP not ok 5 SIGSEGV generated, page not modified # [INFO] UFFDIO_COPY not ok 6 SIGSEGV generated, page not modified Bail out! 6 out of 6 tests failed # Totals: pass:0 fail:6 xfail:0 xpass:0 skip:0 error:0 The tests are useful to detect other problematic archs, to verify new arch fixes, and to stop such issues from reappearing in the future. For now, we don't add any hugetlb tests. Link: https://lkml.kernel.org/r/20230411142512.438404-3-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Hugh Dickins <hughd@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:00 -07:00
David Hildenbrand	d6e61afb40	selftests/mm: reuse read_pmd_pagesize() in COW selftest Patch series "mm: (pte\|pmd)_mkdirty() should not unconditionally allow for write access". This is the follow-up on [1], adding selftests (testing for known issues we added workarounds for and other issues that haven't been fixed yet), fixing sparc64, reverting the workarounds, and perform one cleanup. The patch from [1] was modified slightly (updated/extended patch description, dropped one unnecessary NOP instruction from the ASM in __pte_mkhwwrite()). Retested on x86_64 and sparc64 (sun4u in QEMU). I scanned most architectures to make sure their (pte\|pmd)_mkdirty() handling is correct. To be sure, we can run the selftests and find out if other architectures are still affectes (loongarch was fixed recently as well). Based on master for now. I don't expect surprises regarding mm-tress, but I can rebase if there are any problems. This patch (of 6): The COW selftest can deal with THP not being configured. So move error handling of read_pmd_pagesize() into the callers such that we can reuse it in the COW selftest. Link: https://lkml.kernel.org/r/20230411142512.438404-1-david@redhat.com Link: https://lkml.kernel.org/r/20221212130213.136267-1-david@redhat.com [1] Link: https://lkml.kernel.org/r/20230411142512.438404-2-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Hugh Dickins <hughd@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:30:00 -07:00
Peng Zhang	3b7939c8e5	maple_tree: add a test case to check maple_alloc Add a test case to check whether the number of maple_alloc structures is actually equal to mas->alloc->total. Link: https://lkml.kernel.org/r/20230411041005.26205-2-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 16:29:56 -07:00
Evan Green	287dcc2b0c	selftests: Test the new RISC-V hwprobe interface This adds a test for the recently added RISC-V interface for probing hardware capabilities. It happens to be the first selftest we have for RISC-V, so I've added some infrastructure for those as well. Co-developed-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Evan Green <evan@rivosinc.com> Link: https://lore.kernel.org/r/20230407231103.2622178-6-evan@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2023-04-18 15:48:17 -07:00
Andrew Morton	f8f238ffe5	sync mm-stable with mm-hotfixes-stable to pick up depended-upon upstream changes	2023-04-18 14:53:49 -07:00
SeongJae Park	a101482421	tools/Makefile: do missed s/vm/mm/ Commit `799fb82aa1` ("tools/vm: rename tools/vm to tools/mm") missed renaming 'vm' in 'tools/Makefile' to 'mm'. As a result, 'make clean' under 'tools/' directory fails as below: $ make -C tools clean DESCEND vm make[1]: Entering directory '/linux/tools/vm' make[1]: * No rule to make target 'clean'. Stop. make[1]: Leaving directory '/linux/tools/vm' make: * [Makefile:173: vm_clean] Error 2 make: Leaving directory '/linux/tools' Do the missed rename. Link: https://lkml.kernel.org/r/20230415203110.13858-1-sj@kernel.org Fixes: `799fb82aa1` ("tools/vm: rename tools/vm to tools/mm") Signed-off-by: SeongJae Park <sj@kernel.org> Reported-by: Ricardo Pardini <ricardo@pardini.net> Link: https://lore.kernel.org/linux-mm/20230415202454.13558-1-sj@kernel.org/ Tested-by: Ricardo Pardini <ricardo@pardini.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-18 14:22:12 -07:00
Andrii Nakryiko	94dccba795	libbpf: mark bpf_iter_num_{new,next,destroy} as __weak Mark bpf_iter_num_{new,next,destroy}() kfuncs declared for bpf_for()/bpf_repeat() macros as __weak to allow users to feature-detect their presence and guard bpf_for()/bpf_repeat() loops accordingly for backwards compatibility with old kernels. Now that libbpf supports kfunc calls poisoning and better reporting of unresolved (but called) kfuncs, declaring number iterator kfuncs in bpf_helpers.h won't degrade user experience and won't cause unnecessary kernel feature dependencies. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230418002148.3255690-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-18 12:45:11 -07:00
Andrii Nakryiko	c5e6474167	libbpf: move bpf_for(), bpf_for_each(), and bpf_repeat() into bpf_helpers.h To make it easier for bleeding-edge BPF applications, such as sched_ext, to utilize open-coded iterators, move bpf_for(), bpf_for_each(), and bpf_repeat() macros from selftests/bpf-internal bpf_misc.h helper, to libbpf-provided bpf_helpers.h header. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230418002148.3255690-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-18 12:45:11 -07:00
Andrii Nakryiko	30bbfe3236	selftests/bpf: add missing __weak kfunc log fixup test Add test validating that libbpf correctly poisons and reports __weak unresolved kfuncs in post-processed verifier log. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230418002148.3255690-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-18 12:45:10 -07:00
Andrii Nakryiko	05b6f766b2	libbpf: improve handling of unresolved kfuncs Currently, libbpf leaves `call #0` instruction for __weak unresolved kfuncs, which might lead to a confusing verifier log situations, where invalid `call #0` will be treated as successfully validated. We can do better. Libbpf already has an established mechanism of poisoning instructions that failed some form of resolution (e.g., CO-RE relocation and BPF map set to not be auto-created). Libbpf doesn't fail them outright to allow users to guard them through other means, and as long as BPF verifier can prove that such poisoned instructions cannot be ever reached, this doesn't consistute an invalid BPF program. If user didn't guard such code, libbpf will extract few pieces of information to tie such poisoned instructions back to additional information about what entitity wasn't resolved (e.g., BPF map name, or CO-RE relocation information). __weak unresolved kfuncs fit this model well, so this patch extends libbpf with poisioning and log fixup logic for kfunc calls. Note, this poisoning is done only for kfunc calls, not kfunc address resolution (ldimm64 instructions). The former cannot be ever valid, if reached, so it's safe to poison them. The latter is a valid mechanism to check if __weak kfunc ksym was resolved, and do necessary guarding and work arounds based on this result, supported in most recent kernels. As such, libbpf keeps such ldimm64 instructions as loading zero, never poisoning them. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230418002148.3255690-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-18 12:45:10 -07:00
Andrii Nakryiko	f709160d17	libbpf: report vmlinux vs module name when dealing with ksyms Currently libbpf always reports "kernel" as a source of ksym BTF type, which is ambiguous given ksym's BTF can come from either vmlinux or kernel module BTFs. Make this explicit and log module name, if used BTF is from kernel module. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230418002148.3255690-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-18 12:45:10 -07:00
Andrii Nakryiko	3055ddd654	libbpf: misc internal libbpf clean ups around log fixup Normalize internal constants, field names, and comments related to log fixup. Also add explicit `ext_idx` alias for relocation where relocation is pointing to extern description for additional information. No functional changes, just a clean up before subsequent additions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230418002148.3255690-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-18 12:45:10 -07:00
James Clark	b550bc90bb	perf cs-etm: Fix segfault in dso lookup map__dso() is called before thread__find_map() which always results in a null pointer dereference. Fix it by finding first, then checking if it exists. Fixes: `63df0e4bc3` ("perf map: Add accessor for dso") Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230418141203.673465-1-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-18 12:25:10 -03:00
Frederic Weisbecker	263dda24ff	selftests/proc: Assert clock_gettime(CLOCK_BOOTTIME) VS /proc/uptime monotonicity The first field of /proc/uptime relies on the CLOCK_BOOTTIME clock which can also be fetched from clock_gettime() API. Improve the test coverage while verifying the monotonicity of CLOCK_BOOTTIME accross both interfaces. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230222144649.624380-9-frederic@kernel.org	2023-04-18 16:35:13 +02:00
Frederic Weisbecker	270b2a679e	selftests/proc: Remove idle time monotonicity assertions Due to broken iowait task counting design (cf: comments above get_cpu_idle_time_us() and nr_iowait()), it is not possible to provide the guarantee that /proc/stat or /proc/uptime display monotonic idle time values. Remove the assertions that verify the related wrong assumption so that testers and maintainers don't spend more time on that. Reported-by: Yu Liao <liaoyu15@huawei.com> Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230222144649.624380-8-frederic@kernel.org	2023-04-18 16:35:13 +02:00
Colin Ian King	de047c1091	perf script task-analyzer: Fix spelling mistake "miliseconds" -> "milliseconds" There is a spelling mistake in the help for the --ms option. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Hagen Paul Pfeifer <hagen@jauu.net> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Petar Gligoric <petar.gligoric@rohde-schwarz.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-janitors@vger.kernel.org Link: https://lore.kernel.org/r/20230417174826.52963-1-colin.i.king@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-17 22:24:14 -03:00
Arnaldo Carvalho de Melo	2d1acd3f10	perf namespaces: Introduce nsinfo__mntns_path() accessor to avoid accessing ->mntns_path directly To reduce the use of RC_CHK_ACCESS(nsi). Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-17 22:22:24 -03:00
Arnaldo Carvalho de Melo	f94c21dfd0	perf namespaces: Introduce nsinfo__refcnt() accessor to avoid accessing ->refcnt directly To reduces the use of RC_CHK_ACCESS(nsi). Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-17 22:22:12 -03:00
Arnaldo Carvalho de Melo	4d623903f1	perf namespaces: Use the need_setns() accessors instead of accessing ->need_setns directly This uses pre-existing accessors and reduces the use of RC_CHK_ACCESS(nsi). Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-17 22:21:54 -03:00
Yonghong Song	49859de997	selftests/bpf: Add a selftest for checking subreg equality Add a selftest to ensure subreg equality if source register upper 32bit is 0. Without previous patch, the test will fail verification. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20230417222139.360607-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-17 15:50:02 -07:00
Ian Rogers	c35ce1d918	perf namespaces: Add reference count checking Add reference count checking controlled by REFCNT_CHECKING ifdef. The reference count checking interposes an allocated pointer between the reference counted struct on a get and frees the pointer on a put. Accesses after a put cause faults and use after free, missed puts are caughts as leaks and double puts are double frees. This checking helped resolve a memory leak and use after free: https://lore.kernel.org/linux-perf-users/CAP-5=fWZH20L4kv-BwVtGLwR=Em3AOOT+Q4QGivvQuYn5AsPRg@mail.gmail.com/ Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Link: https://lore.kernel.org/lkml/20230407230405.2931830-4-irogers@google.com [ Extracted from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-17 18:51:57 -03:00
Arnaldo Carvalho de Melo	7031edac9d	perf dso: Add dso__filename_with_chroot() to reduce number of accesses to dso->nsinfo members We'll need to reference count dso->nsinfo, so reduce the number of direct accesses by having a shorter form of obtaining a filename with a chroot (namespace one). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Link: https://lore.kernel.org/lkml/ZD26ZlqSbgSyH5lX@kernel.org [ Used nsinfo__pid(dso->nsinfo), as it was already present ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-17 18:47:55 -03:00
Ian Rogers	da885a0e5e	perf cpumap: Add reference count checking Enabled when REFCNT_CHECKING is defined. The change adds a memory allocated pointer that is interposed between the reference counted cpu map at a get and freed by a put. The pointer replaces the original perf_cpu_map struct, so use of the perf_cpu_map via APIs remains unchanged. Any use of the cpu map without the API requires two versions, handled via the RC_CHK_ACCESS macro. This change is intended to catch: - use after put: using a cpumap after you have put it will cause a segv. - unbalanced puts: two puts for a get will result in a double free that can be captured and reported by tools like address sanitizer, including with the associated stack traces of allocation and frees. - missing puts: if a put is missing then the get turns into a memory leak that can be reported by leak sanitizer, including the stack trace at the point the get occurs. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com>, Cc: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/lkml/20230407230405.2931830-3-irogers@google.com [ Extracted from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-17 16:50:02 -03:00
Ian Rogers	491b13c46d	perf cpumap: Use perf_cpu_map__cpu(map, cpu) instead of accessing map->map[cpu] directly So that we can validate the 'map' instance wrt refcount checking. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Link: https://lore.kernel.org/lkml/20230407230405.2931830-3-irogers@google.com [ Extracted from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-17 16:49:08 -03:00
Arnaldo Carvalho de Melo	d57fd4926a	perf cpumap: Remove initializations done in perf_cpu_map__alloc() When extracting this patch from Ian's original patch I forgot to remove the setting of ->nr and ->refcnt, no need to do those initializations again as those are done in perf_cpu_map__alloc() already, duh. Cc: Ian Rogers <irogers@google.com> Fixes: `1f94479edb` ("libperf: Make perf_cpu_map__alloc() available as an internal function for tools/perf to use") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-17 16:19:38 -03:00
Ian Rogers	a9b867f68e	libperf: Add reference count checking macros The macros serve as a way to debug use of a reference counted struct. The macros add a memory allocated pointer that is interposed between the reference counted original struct at a get and freed by a put. The pointer replaces the original struct, so use of the struct name via APIs remains unchanged. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Hao Luo <haoluo@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yury Norov <yury.norov@gmail.com> Link: http://lore.kernel.org/lkml/20230407230405.2931830-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-17 15:53:01 -03:00
Arnaldo Carvalho de Melo	4121234a32	libperf: Add perf_cpu_map__refcnt() interanl accessor to use in the maps test To remove one more direct access to 'struct perf_cpu_map' so that we can intercept accesses to its instantiations and refcount check it to catch use after free, etc. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Link: https://lore.kernel.org/lkml/ZD1qdYjG+DL6KOfP@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-17 15:52:36 -03:00
Josh Poimboeuf	f372463124	btrfs: mark btrfs_assertfail() __noreturn Fixes a bunch of warnings including: vmlinux.o: warning: objtool: select_reloc_root+0x314: unreachable instruction vmlinux.o: warning: objtool: finish_inode_if_needed+0x15b1: unreachable instruction vmlinux.o: warning: objtool: get_bio_sector_nr+0x259: unreachable instruction vmlinux.o: warning: objtool: raid_wait_read_end_io+0xc26: unreachable instruction vmlinux.o: warning: objtool: raid56_parity_alloc_scrub_rbio+0x37b: unreachable instruction ... Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202302210709.IlXfgMpX-lkp@intel.com/ Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2023-04-17 19:52:19 +02:00
Matthieu Baerts	0fcd72df88	selftests: mptcp: join: fix ShellCheck warnings Most of the code had an issue according to ShellCheck. That's mainly due to the fact it incorrectly believes most of the code was unreachable because it's invoked by variable name, see how the "tests" array is used. Once SC2317 has been ignored, three small warnings were still visible: - SC2155: Declare and assign separately to avoid masking return values. - SC2046: Quote this to prevent word splitting: can be ignored because "ip netns pids" can display more than one pid. - SC2166: Prefer [ p ] \|\| [ q ] as [ p -o q ] is not well defined. This probably didn't fix any actual issues but it might help spotting new interesting warnings reported by ShellCheck as just before, ShellCheck was reporting issues for most lines making it a bit useless. Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-17 08:25:33 +01:00
Matthieu Baerts	0a85264e48	selftests: mptcp: remove duplicated entries in usage mptcp_connect tool was printing some duplicated entries when showing how to use it: -j -l -r While at it, I also: - moved the very few entries that were not sorted, - added -R that was missing since commit `8a4b910d00` ("mptcp: selftests: add rcvbuf set option"), - removed the -u parameter that has been removed in commit `f730b65c9d` ("selftests: mptcp: try to set mptcp ulp mode in different sk states"). No need to backport this, it is just an internal tool used by our selftests. The help menu is mainly useful for MPTCP kernel devs. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-17 08:25:33 +01:00
Aaron Conole	9feac87b67	selftests: openvswitch: add support for upcall testing The upcall socket interface can be exercised now to make sure that future feature adjustments to the field can maintain backwards compatibility. Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-17 08:12:33 +01:00
Aaron Conole	e52b07aa1a	selftests: openvswitch: add flow dump support Add a basic set of fields to print in a 'dpflow' format. This will be used by future commits to check for flow fields after parsing, as well as verifying the flow fields pushed into the kernel from userspace. Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-17 08:12:33 +01:00
Aaron Conole	74cc26f416	selftests: openvswitch: add interface support Includes an associated test to generate netns and connect interfaces, with the option to include packet tracing. This will be used in the future when flow support is added for additional test cases. Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-17 08:12:33 +01:00
Andrew Morton	e492cd61b9	sync mm-stable with mm-hotfixes-stable to pick up depended-upon upstream changes	2023-04-16 12:31:58 -07:00
Steve Chou	9235756885	tools/mm/page_owner_sort.c: fix TGID output when cull=tg is used When using cull option with 'tg' flag, the fprintf is using pid instead of tgid. It should use tgid instead. Link: https://lkml.kernel.org/r/20230411034929.2071501-1-steve_chou@pesi.com.tw Fixes: `9c8a0a8e59` ("tools/vm/page_owner_sort.c: support for user-defined culling rules") Signed-off-by: Steve Chou <steve_chou@pesi.com.tw> Cc: Jiajian Ye <yejiajian2018@email.szu.edu.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-04-16 10:41:25 -07:00
David Vernet	09b501d905	bpf: Remove bpf_kfunc_call_test_kptr_get() test kfunc We've managed to improve the UX for kptrs significantly over the last 9 months. All of the prior main use cases, struct bpf_cpumask , struct task_struct , and struct cgroup *, have all been updated to be synchronized mainly using RCU. In other words, their KF_ACQUIRE kfunc calls are all KF_RCU, and the pointers themselves are MEM_RCU and can be accessed in an RCU read region in BPF. In a follow-on change, we'll be removing the KF_KPTR_GET kfunc flag. This patch prepares for that by removing the bpf_kfunc_call_test_kptr_get() kfunc, and all associated selftests. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230416084928.326135-2-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-16 08:51:24 -07:00
Gregory Price	8c8fa605f7	selftest, ptrace: Add selftest for syscall user dispatch config api Validate that the following new ptrace requests work as expected * PTRACE_GET_SYSCALL_USER_DISPATCH_CONFIG returns the contents of task->syscall_dispatch * PTRACE_SET_SYSCALL_USER_DISPATCH_CONFIG sets the contents of task->syscall_dispatch Signed-off-by: Gregory Price <gregory.price@memverge.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230407171834.3558-5-gregory.price@memverge.com	2023-04-16 14:23:08 +02:00
Dmitry Vyukov	e797203fb3	selftests/timers/posix_timers: Test delivery of signals across threads Test that POSIX timers using CLOCK_PROCESS_CPUTIME_ID eventually deliver a signal to all running threads. This effectively tests that the kernel doesn't prefer any one thread (or subset of threads) for signal delivery. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230316123028.2890338-2-elver@google.com	2023-04-16 09:00:18 +02:00
Dave Marchevsky	6147f15131	selftests/bpf: Add refcounted_kptr tests Test refcounted local kptr functionality added in previous patches in the series. Usecases which pass verification: * Add refcounted local kptr to both tree and list. Then, read and - possibly, depending on test variant - delete from tree, then list. * Also test doing read-and-maybe-delete in opposite order * Stash a refcounted local kptr in a map_value, then add it to a rbtree. Read from both, possibly deleting after tree read. * Add refcounted local kptr to both tree and list. Then, try reading and deleting twice from one of the collections. * bpf_refcount_acquire of just-added non-owning ref should work, as should bpf_refcount_acquire of owning ref just out of bpf_obj_new Usecases which fail verification: * The simple successful bpf_refcount_acquire cases from above should both fail to verify if the newly-acquired owning ref is not dropped Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230415201811.343116-10-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-15 17:36:50 -07:00
Dave Marchevsky	404ad75a36	bpf: Migrate bpf_rbtree_remove to possibly fail This patch modifies bpf_rbtree_remove to account for possible failure due to the input rb_node already not being in any collection. The function can now return NULL, and does when the aforementioned scenario occurs. As before, on successful removal an owning reference to the removed node is returned. Adding KF_RET_NULL to bpf_rbtree_remove's kfunc flags - now KF_RET_NULL \| KF_ACQUIRE - provides the desired verifier semantics: * retval must be checked for NULL before use * if NULL, retval's ref_obj_id is released * retval is a "maybe acquired" owning ref, not a non-owning ref, so it will live past end of critical section (bpf_spin_unlock), and thus can be checked for NULL after the end of the CS BPF programs must add checks ============================ This does change bpf_rbtree_remove's verifier behavior. BPF program writers will need to add NULL checks to their programs, but the resulting UX looks natural: bpf_spin_lock(&glock); n = bpf_rbtree_first(&ghead); if (!n) { /* ... /} res = bpf_rbtree_remove(&ghead, &n->node); bpf_spin_unlock(&glock); if (!res) / Newly-added check after this patch / return 1; n = container_of(res, / ... /); / Do something else with n / bpf_obj_drop(n); return 0; The "if (!res)" check above is the only addition necessary for the above program to pass verification after this patch. bpf_rbtree_remove no longer clobbers non-owning refs ==================================================== An issue arises when bpf_rbtree_remove fails, though. Consider this example: struct node_data { long key; struct bpf_list_node l; struct bpf_rb_node r; struct bpf_refcount ref; }; long failed_sum; void bpf_prog() { struct node_data n = bpf_obj_new(/* ... /); struct bpf_rb_node res; n->key = 10; bpf_spin_lock(&glock); bpf_list_push_back(&some_list, &n->l); /* n is now a non-owning ref / res = bpf_rbtree_remove(&some_tree, &n->r, / ... /); if (!res) failed_sum += n->key; / not possible / bpf_spin_unlock(&glock); / if (res) { do something useful and drop } ... */ } The bpf_rbtree_remove in this example will always fail. Similarly to bpf_spin_unlock, bpf_rbtree_remove is a non-owning reference invalidation point. The verifier clobbers all non-owning refs after a bpf_rbtree_remove call, so the "failed_sum += n->key" line will fail verification, and in fact there's no good way to get information about the node which failed to add after the invalidation. This patch removes non-owning reference invalidation from bpf_rbtree_remove to allow the above usecase to pass verification. The logic for why this is now possible is as follows: Before this series, bpf_rbtree_add couldn't fail and thus assumed that its input, a non-owning reference, was in the tree. But it's easy to construct an example where two non-owning references pointing to the same underlying memory are acquired and passed to rbtree_remove one after another (see rbtree_api_release_aliasing in selftests/bpf/progs/rbtree_fail.c). So it was necessary to clobber non-owning refs to prevent this case and, more generally, to enforce "non-owning ref is definitely in some collection" invariant. This series removes that invariant and the failure / runtime checking added in this patch provide a clean way to deal with the aliasing issue - just fail to remove. Because the aliasing issue prevented by clobbering non-owning refs is no longer an issue, this patch removes the invalidate_non_owning_refs call from verifier handling of bpf_rbtree_remove. Note that bpf_spin_unlock - the other caller of invalidate_non_owning_refs - clobbers non-owning refs for a different reason, so its clobbering behavior remains unchanged. No BPF program changes are necessary for programs to remain valid as a result of this clobbering change. A valid program before this patch passed verification with its non-owning refs having shorter (or equal) lifetimes due to more aggressive clobbering. Also, update existing tests to check bpf_rbtree_remove retval for NULL where necessary, and move rbtree_api_release_aliasing from progs/rbtree_fail.c to progs/rbtree.c since it's now expected to pass verification. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230415201811.343116-8-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-15 17:36:50 -07:00
Dave Marchevsky	de67ba3968	selftests/bpf: Modify linked_list tests to work with macro-ified inserts The linked_list tests use macros and function pointers to reduce code duplication. Earlier in the series, bpf_list_push_{front,back} were modified to be macros, expanding to invoke actual kfuncs bpf_list_push_{front,back}_impl. Due to this change, a code snippet like: void (p)(void , void ) = (void )&bpf_list_##op; p(hexpr, nexpr); meant to do bpf_list_push_{front,back}(hexpr, nexpr), will no longer work as it's no longer valid to do &bpf_list_push_{front,back} since they're no longer functions. This patch fixes issues of this type, along with two other minor changes - one improvement and one fix - both related to the node argument to list_push_{front,back}. * The fix: migration of list_push tests away from (void , void ) func ptr uncovered that some tests were incorrectly passing pointer to node, not pointer to struct bpf_list_node within the node. This patch fixes such issues (CHECK(..., f) -> CHECK(..., &f->node)) * The improvement: In linked_list tests, the struct foo type has two list_node fields: node and node2, at byte offsets 0 and 40 within the struct, respectively. Currently node is used in ~all tests involving struct foo and lists. The verifier needs to do some work to account for the offset of bpf_list_node within the node type, so using node2 instead of node exercises that logic more in the tests. This patch migrates linked_list tests to use node2 instead of node. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230415201811.343116-7-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-15 17:36:50 -07:00
Dave Marchevsky	d2dcc67df9	bpf: Migrate bpf_rbtree_add and bpf_list_push_{front,back} to possibly fail Consider this code snippet: struct node { long key; bpf_list_node l; bpf_rb_node r; bpf_refcount ref; } int some_bpf_prog(void ctx) { struct node n = bpf_obj_new(/.../), m; bpf_spin_lock(&glock); bpf_rbtree_add(&some_tree, &n->r, / ... /); m = bpf_refcount_acquire(n); bpf_rbtree_add(&other_tree, &m->r, / ... /); bpf_spin_unlock(&glock); / ... / } After bpf_refcount_acquire, n and m point to the same underlying memory, and that node's bpf_rb_node field is being used by the some_tree insert, so overwriting it as a result of the second insert is an error. In order to properly support refcounted nodes, the rbtree and list insert functions must be allowed to fail. This patch adds such support. The kfuncs bpf_rbtree_add, bpf_list_push_{front,back} are modified to return an int indicating success/failure, with 0 -> success, nonzero -> failure. bpf_obj_drop on failure ======================= Currently the only reason an insert can fail is the example above: the bpf_{list,rb}_node is already in use. When such a failure occurs, the insert kfuncs will bpf_obj_drop the input node. This allows the insert operations to logically fail without changing their verifier owning ref behavior, namely the unconditional release_reference of the input owning ref. With insert that always succeeds, ownership of the node is always passed to the collection, since the node always ends up in the collection. With a possibly-failed insert w/ bpf_obj_drop, ownership of the node is always passed either to the collection (success), or to bpf_obj_drop (failure). Regardless, it's correct to continue unconditionally releasing the input owning ref, as something is always taking ownership from the calling program on insert. Keeping owning ref behavior unchanged results in a nice default UX for insert functions that can fail. If the program's reaction to a failed insert is "fine, just get rid of this owning ref for me and let me go on with my business", then there's no reason to check for failure since that's default behavior. e.g.: long important_failures = 0; int some_bpf_prog(void ctx) { struct node n, m, o; / all bpf_obj_new'd / bpf_spin_lock(&glock); bpf_rbtree_add(&some_tree, &n->node, / ... /); bpf_rbtree_add(&some_tree, &m->node, / ... /); if (bpf_rbtree_add(&some_tree, &o->node, / ... /)) { important_failures++; } bpf_spin_unlock(&glock); } If we instead chose to pass ownership back to the program on failed insert - by returning NULL on success or an owning ref on failure - programs would always have to do something with the returned ref on failure. The most likely action is probably "I'll just get rid of this owning ref and go about my business", which ideally would look like: if (n = bpf_rbtree_add(&some_tree, &n->node, / ... /)) bpf_obj_drop(n); But bpf_obj_drop isn't allowed in a critical section and inserts must occur within one, so in reality error handling would become a hard-to-parse mess. For refcounted nodes, we can replicate the "pass ownership back to program on failure" logic with this patch's semantics, albeit in an ugly way: struct node n = bpf_obj_new(/* ... /), m; bpf_spin_lock(&glock); m = bpf_refcount_acquire(n); if (bpf_rbtree_add(&some_tree, &n->node, /* ... /)) { / Do something with m / } bpf_spin_unlock(&glock); bpf_obj_drop(m); bpf_refcount_acquire is used to simulate "return owning ref on failure". This should be an uncommon occurrence, though. Addition of two verifier-fixup'd args to collection inserts =========================================================== The actual bpf_obj_drop kfunc is bpf_obj_drop_impl(void , struct btf_struct_meta ), with bpf_obj_drop macro populating the second arg with 0 and the verifier later filling in the arg during insn fixup. Because bpf_rbtree_add and bpf_list_push_{front,back} now might do bpf_obj_drop, these kfuncs need a btf_struct_meta parameter that can be passed to bpf_obj_drop_impl. Similarly, because the 'node' param to those insert functions is the bpf_{list,rb}_node within the node type, and bpf_obj_drop expects a pointer to the beginning of the node, the insert functions need to be able to find the beginning of the node struct. A second verifier-populated param is necessary: the offset of {list,rb}_node within the node type. These two new params allow the insert kfuncs to correctly call __bpf_obj_drop_impl: beginning_of_node = bpf_rb_node_ptr - offset if (already_inserted) __bpf_obj_drop_impl(beginning_of_node, btf_struct_meta->record); Similarly to other kfuncs with "hidden" verifier-populated params, the insert functions are renamed with _impl prefix and a macro is provided for common usage. For example, bpf_rbtree_add kfunc is now bpf_rbtree_add_impl and bpf_rbtree_add is now a macro which sets "hidden" args to 0. Due to the two new args BPF progs will need to be recompiled to work with the new _impl kfuncs. This patch also rewrites the "hidden argument" explanation to more directly say why the BPF program writer doesn't need to populate the arguments with anything meaningful. How does this new logic affect non-owning references? ===================================================== Currently, non-owning refs are valid until the end of the critical section in which they're created. We can make this guarantee because, if a non-owning ref exists, the referent was added to some collection. The collection will drop() its nodes when it goes away, but it can't go away while our program is accessing it, so that's not a problem. If the referent is removed from the collection in the same CS that it was added in, it can't be bpf_obj_drop'd until after CS end. Those are the only two ways to free the referent's memory and neither can happen until after the non-owning ref's lifetime ends. On first glance, having these collection insert functions potentially bpf_obj_drop their input seems like it breaks the "can't be bpf_obj_drop'd until after CS end" line of reasoning. But we care about the memory not being _freed_ until end of CS end, and a previous patch in the series modified bpf_obj_drop such that it doesn't free refcounted nodes until refcount == 0. So the statement can be more accurately rewritten as "can't be free'd until after CS end". We can prove that this rewritten statement holds for any non-owning reference produced by collection insert functions: If the input to the insert function is _not_ refcounted * We have an owning reference to the input, and can conclude it isn't in any collection * Inserting a node in a collection turns owning refs into non-owning, and since our input type isn't refcounted, there's no way to obtain additional owning refs to the same underlying memory * Because our node isn't in any collection, the insert operation cannot fail, so bpf_obj_drop will not execute * If bpf_obj_drop is guaranteed not to execute, there's no risk of memory being free'd * Otherwise, the input to the insert function is refcounted * If the insert operation fails due to the node's list_head or rb_root already being in some collection, there was some previous successful insert which passed refcount to the collection * We have an owning reference to the input, it must have been acquired via bpf_refcount_acquire, which bumped the refcount * refcount must be >= 2 since there's a valid owning reference and the node is already in a collection * Insert triggering bpf_obj_drop will decr refcount to >= 1, never resulting in a free So although we may do bpf_obj_drop during the critical section, this will never result in memory being free'd, and no changes to non-owning ref logic are needed in this patch. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230415201811.343116-6-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-15 17:36:50 -07:00
Dave Marchevsky	7c50b1cb76	bpf: Add bpf_refcount_acquire kfunc Currently, BPF programs can interact with the lifetime of refcounted local kptrs in the following ways: bpf_obj_new - Initialize refcount to 1 as part of new object creation bpf_obj_drop - Decrement refcount and free object if it's 0 collection add - Pass ownership to the collection. No change to refcount but collection is responsible for bpf_obj_dropping it In order to be able to add a refcounted local kptr to multiple collections we need to be able to increment the refcount and acquire a new owning reference. This patch adds a kfunc, bpf_refcount_acquire, implementing such an operation. bpf_refcount_acquire takes a refcounted local kptr and returns a new owning reference to the same underlying memory as the input. The input can be either owning or non-owning. To reinforce why this is safe, consider the following code snippets: struct node n = bpf_obj_new(typeof(n)); // A struct node m = bpf_refcount_acquire(n); // B In the above snippet, n will be alive with refcount=1 after (A), and since nothing changes that state before (B), it's obviously safe. If n is instead added to some rbtree, we can still safely refcount_acquire it: struct node n = bpf_obj_new(typeof(n)); struct node m; bpf_spin_lock(&glock); bpf_rbtree_add(&groot, &n->node, less); // A m = bpf_refcount_acquire(n); // B bpf_spin_unlock(&glock); In the above snippet, after (A) n is a non-owning reference, and after (B) m is an owning reference pointing to the same memory as n. Although n has no ownership of that memory's lifetime, it's guaranteed to be alive until the end of the critical section, and n would be clobbered if we were past the end of the critical section, so it's safe to bump refcount. Implementation details: * From verifier's perspective, bpf_refcount_acquire handling is similar to bpf_obj_new and bpf_obj_drop. Like the former, it returns a new owning reference matching input type, although like the latter, type can be inferred from concrete kptr input. Verifier changes in {check,fixup}_kfunc_call and check_kfunc_args are largely copied from aforementioned functions' verifier changes. * An exception to the above is the new KF_ARG_PTR_TO_REFCOUNTED_KPTR arg, indicated by new "__refcounted_kptr" kfunc arg suffix. This is necessary in order to handle both owning and non-owning input without adding special-casing to "__alloc" arg handling. Also a convenient place to confirm that input type has bpf_refcount field. * The implemented kfunc is actually bpf_refcount_acquire_impl, with 'hidden' second arg that the verifier sets to the type's struct_meta in fixup_kfunc_call. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230415201811.343116-5-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-15 17:36:50 -07:00
Dave Marchevsky	d54730b50b	bpf: Introduce opaque bpf_refcount struct and add btf_record plumbing A 'struct bpf_refcount' is added to the set of opaque uapi/bpf.h types meant for use in BPF programs. Similarly to other opaque types like bpf_spin_lock and bpf_rbtree_node, the verifier needs to know where in user-defined struct types a bpf_refcount can be located, so necessary btf_record plumbing is added to enable this. bpf_refcount is sized to hold a refcount_t. Similarly to bpf_spin_lock, the offset of a bpf_refcount is cached in btf_record as refcount_off in addition to being in the field array. Caching refcount_off makes sense for this field because further patches in the series will modify functions that take local kptrs (e.g. bpf_obj_drop) to change their behavior if the type they're operating on is refcounted. So enabling fast "is this type refcounted?" checks is desirable. No such verifier behavior changes are introduced in this patch, just logic to recognize 'struct bpf_refcount' in btf_record. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230415201811.343116-3-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-15 17:36:49 -07:00
Arnaldo Carvalho de Melo	17354d1528	perf test: Simplify for_each_test() to avoid tripping on -Werror=array-bounds When cross building on debian to the mips 32-bit arch we get these warnings: In function '__cmd_test', inlined from 'cmd_test' at tests/builtin-test.c:561:9: tests/builtin-test.c:260:66: error: array subscript 1 is outside array bounds of 'struct test_suite [1]' [-Werror=array-bounds] 260 \| for (k = 0, t = tests[j][k]; tests[j][k]; k++, t = tests[j][k]) \| ^ tests/builtin-test.c:369:9: note: in expansion of macro 'for_each_test' 369 \| for_each_test(j, k, t) { \| ^~~~~~~~~~~~~ tests/builtin-test.c: In function 'cmd_test': tests/builtin-test.c:36:27: note: at offset 4 into object 'arch_tests' of size 4 36 \| struct test_suite __weak arch_tests[] = { \| ^~~~~~~~~~ cc1: all warnings being treated as errors Switch to using a while(!sentinel) for the second level of the 'tests' array to avoid that compiler complaint. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-14 21:43:39 -03:00
Aaron Lewis	457bd7af1a	KVM: selftests: Test the PMU event "Instructions retired" Add testing for the event "Instructions retired" (0xc0) in the PMU event filter on both Intel and AMD to ensure that the event doesn't count when it is disallowed. Unlike most of the other events, the event "Instructions retired" will be incremented by KVM when an instruction is emulated. Test that this case is being properly handled and that KVM doesn't increment the counter when that event is disallowed. Signed-off-by: Aaron Lewis <aaronlewis@google.com> Link: https://lore.kernel.org/r/20230307141400.1486314-6-aaronlewis@google.com Link: https://lore.kernel.org/r/20230407233254.957013-7-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2023-04-14 13:21:38 -07:00
Sean Christopherson	e9f322bd23	KVM: selftests: Copy full counter values from guest in PMU event filter test Use a single struct to track all PMC event counts in the PMU filter test, and copy the full struct to/from the guest when running and measuring each guest workload. Using a common struct avoids naming conflicts, e.g. the loads/stores testcase has claimed "perf_counter", and eliminates the unnecessary truncation of the counter values when they are propagated from the guest MSRs to the host structs. Zero the struct before running the guest workload to ensure that the test doesn't get a false pass due to consuming data from a previous run. Link: https://lore.kernel.org/r/20230407233254.957013-6-seanjc@google.com Reviewed by: Aaron Lewis <aaronlewis@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com>	2023-04-14 13:21:32 -07:00
Sean Christopherson	c02c744282	KVM: selftests: Use error codes to signal errors in PMU event filter test Use '0' to signal success and '-errno' to signal failure in the PMU event filter test so that the values are slightly less magical/arbitrary. Using '0' in the error paths is especially confusing as understanding it's an error value requires following the breadcrumbs to the host code that ultimately consumes the value. Arguably there should also be a #define for "success", but 0/-errno is a common enough pattern that defining another macro on top would likely do more harm than good. Link: https://lore.kernel.org/r/20230407233254.957013-5-seanjc@google.com Reviewed by: Aaron Lewis <aaronlewis@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com>	2023-04-14 13:21:25 -07:00
Aaron Lewis	c140e93a0c	KVM: selftests: Print detailed info in PMU event filter asserts Provide the actual vs. expected count in the PMU event filter test's asserts instead of relying on pr_info() to provide the context, e.g. so that all information needed to triage a failure is readily available even if the environment in which the test is run captures only the assert itself. Signed-off-by: Aaron Lewis <aaronlewis@google.com> [sean: rewrite changelog] Link: https://lore.kernel.org/r/20230407233254.957013-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2023-04-14 13:20:53 -07:00
Aaron Lewis	fa32233d51	KVM: selftests: Add helpers for PMC asserts in PMU event filter test Add helper macros to consolidate the asserts that a PMC is/isn't counting (branch) instructions retired. This will make it easier to add additional asserts related to counting instructions later on. No functional changes intended. Signed-off-by: Aaron Lewis <aaronlewis@google.com> [sean: add "INSTRUCTIONS", massage changelog] Link: https://lore.kernel.org/r/20230407233254.957013-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2023-04-14 13:20:53 -07:00
Aaron Lewis	33ef1411a3	KVM: selftests: Add a common helper for the PMU event filter guest code Split out the common parts of the Intel and AMD guest code in the PMU event filter test into a helper function. This is in preparation for adding additional counters to the test. No functional changes intended. Signed-off-by: Aaron Lewis <aaronlewis@google.com> Link: https://lore.kernel.org/r/20230407233254.957013-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2023-04-14 13:20:53 -07:00
Reinette Chatre	50ad2fb7ec	selftests/resctrl: Fix incorrect error return on test complete An error snuck in between two recent conflicting changes: Until recently ->setup() used negative values to indicate normal test termination. This was changed in commit `fa10366cc6` ("selftests/resctrl: Allow ->setup() to return errors") that transitioned ->setup() to use negative values to indicate errors and a new END_OF_TESTS to indicate normal termination. commit `42e3b093eb` ("selftests/resctrl: Fix set up schemata with 100% allocation on first run in MBM test") continued to use negative return to indicate normal test termination. Fix mbm_setup() to use the new END_OF_TESTS to indicate error-free test termination. Fixes: `42e3b093eb` ("selftests/resctrl: Fix set up schemata with 100% allocation on first run in MBM test") Reported-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/lkml/bb65cce8-54d7-68c5-ef19-3364ec95392a@linux.intel.com/ Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2023-04-14 11:13:18 -06:00
Colin Ian King	20aef201da	KVM: selftests: Fix spelling mistake "perrmited" -> "permitted" There is a spelling mistake in a test report message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20230414080809.1678603-1-colin.i.king@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2023-04-14 10:04:51 -07:00
Guilherme G. Piccoli	611d4c716d	x86/hyperv: Mark hv_ghcb_terminate() as noreturn Annotate the function prototype and definition as noreturn to prevent objtool warnings like: vmlinux.o: warning: objtool: hyperv_init+0x55c: unreachable instruction Also, as per Josh's suggestion, add it to the global_noreturns list. As a comparison, an objdump output without the annotation: [...] 1b63: mov $0x1,%esi 1b68: xor %edi,%edi 1b6a: callq ffffffff8102f680 <hv_ghcb_terminate> 1b6f: jmpq ffffffff82f217ec <hyperv_init+0x9c> # unreachable 1b74: cmpq $0xffffffffffffffff,-0x702a24(%rip) [...] Now, after adding the __noreturn to the function prototype: [...] 17df: callq ffffffff8102f6d0 <hv_ghcb_negotiate_protocol> 17e4: test %al,%al 17e6: je ffffffff82f21bb9 <hyperv_init+0x469> [...] <many insns> 1bb9: mov $0x1,%esi 1bbe: xor %edi,%edi 1bc0: callq ffffffff8102f680 <hv_ghcb_terminate> 1bc5: nopw %cs:0x0(%rax,%rax,1) # end of function Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/32453a703dfcf0d007b473c9acbf70718222b74b.1681342859.git.jpoimboe@kernel.org	2023-04-14 17:31:28 +02:00
Josh Poimboeuf	6e36a56a5f	scsi: message: fusion: Mark mpt_halt_firmware() __noreturn mpt_halt_firmware() doesn't return. Mark it as such. Fixes the following warnings: vmlinux.o: warning: objtool: mptscsih_abort+0x7f4: unreachable instruction vmlinux.o: warning: objtool: mptctl_timeout_expired+0x310: unreachable instruction Reported-by: kernel test robot <lkp@intel.com> Reported-by: Mark Rutland <mark.rutland@arm.com> Debugged-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/d8129817423422355bf30e90dadc6764261b53e0.1681342859.git.jpoimboe@kernel.org	2023-04-14 17:31:27 +02:00
Josh Poimboeuf	52668badd3	x86/cpu: Mark {hlt,resume}_play_dead() __noreturn Fixes the following warning: vmlinux.o: warning: objtool: resume_play_dead+0x21: unreachable instruction Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/ce1407c4bf88b1334fe40413126343792a77ca50.1681342859.git.jpoimboe@kernel.org	2023-04-14 17:31:27 +02:00
Josh Poimboeuf	09c5ae30d0	btrfs: Mark btrfs_assertfail() __noreturn Fixes a bunch of warnings including: vmlinux.o: warning: objtool: select_reloc_root+0x314: unreachable instruction vmlinux.o: warning: objtool: finish_inode_if_needed+0x15b1: unreachable instruction vmlinux.o: warning: objtool: get_bio_sector_nr+0x259: unreachable instruction vmlinux.o: warning: objtool: raid_wait_read_end_io+0xc26: unreachable instruction vmlinux.o: warning: objtool: raid56_parity_alloc_scrub_rbio+0x37b: unreachable instruction ... Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/960bd9c0c9e3cfc409ba9c35a17644b11b832956.1681342859.git.jpoimboe@kernel.org	2023-04-14 17:31:26 +02:00
Josh Poimboeuf	1c47c8758a	objtool: Include weak functions in global_noreturns check If a global function doesn't return, and its prototype has the __noreturn attribute, its weak counterpart must also not return so that it matches the prototype and meets call site expectations. To properly follow the compiled control flow at the call sites, change the global_noreturns check to include both global and weak functions. On the other hand, if a weak function isn't in global_noreturns, assume the prototype doesn't have __noreturn. Even if the weak function doesn't return, call sites treat it like a returnable function. Fixes the following warning: kernel/sched/build_policy.o: warning: objtool: do_idle() falls through to next function play_idle_precise() Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Link: https://lore.kernel.org/r/ede3460d63f4a65d282c86f1175bd2662c2286ba.1681342859.git.jpoimboe@kernel.org	2023-04-14 17:31:26 +02:00
Josh Poimboeuf	27dea14c7f	cpu: Mark nmi_panic_self_stop() __noreturn In preparation for improving objtool's handling of weak noreturn functions, mark nmi_panic_self_stop() __noreturn. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/316fc6dfab5a8c4e024c7185484a1ee5fb0afb79.1681342859.git.jpoimboe@kernel.org	2023-04-14 17:31:26 +02:00
Josh Poimboeuf	7412a60dec	cpu: Mark panic_smp_self_stop() __noreturn In preparation for improving objtool's handling of weak noreturn functions, mark panic_smp_self_stop() __noreturn. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/92d76ab5c8bf660f04fdcd3da1084519212de248.1681342859.git.jpoimboe@kernel.org	2023-04-14 17:31:25 +02:00
Josh Poimboeuf	4208d2d798	x86/head: Mark *_start_kernel() __noreturn Now that start_kernel() is __noreturn, mark its chain of callers __noreturn. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/c2525f96b88be98ee027ee0291d58003036d4120.1681342859.git.jpoimboe@kernel.org	2023-04-14 17:31:24 +02:00
Josh Poimboeuf	25a6917ca6	init: Mark start_kernel() __noreturn Now that arch_call_rest_init() is __noreturn, mark its caller start_kernel() __noreturn. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/7069acf026a195f26a88061227fba5a3b0337b9a.1681342859.git.jpoimboe@kernel.org	2023-04-14 17:31:23 +02:00
Josh Poimboeuf	9ea7e6b62c	init: Mark [arch_call_]rest_init() __noreturn In preparation for improving objtool's handling of weak noreturn functions, mark start_kernel(), arch_call_rest_init(), and rest_init() __noreturn. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/7194ed8a989a85b98d92e62df660f4a90435a723.1681342859.git.jpoimboe@kernel.org	2023-04-14 17:31:23 +02:00
Josh Poimboeuf	5743654f5e	objtool: Generate ORC data for __pfx code Allow unwinding from prefix code by copying the CFI from the starting instruction of the corresponding function. Even when the NOPs are replaced, they're still stack-invariant instructions so the same ORC entry can be reused everywhere. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/bc3344e51f3e87102f1301a0be0f72a7689ea4a4.1681331135.git.jpoimboe@kernel.org	2023-04-14 16:08:30 +02:00
Josh Poimboeuf	bd456a1bed	objtool: Separate prefix code from stack validation code Simplify the prefix code by moving it after validate_reachable_instructions(). Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/d7f31ac2de462d0cd7b1db01b7ecb525c057c8f6.1681331135.git.jpoimboe@kernel.org	2023-04-14 16:08:29 +02:00
Josh Poimboeuf	6126ed5dfb	objtool: Remove superfluous dead_end_function() check annotate_call_site() already sets 'insn->dead_end' for calls to dead end functions. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/5d603a301e9a8b1036b61503385907e154867ace.1681325924.git.jpoimboe@kernel.org	2023-04-14 16:08:29 +02:00
Josh Poimboeuf	9290e772ba	objtool: Add symbol iteration helpers Add [sec_]for_each_sym() and use them. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/59023e5886ab125aa30702e633be7732b1acaa7e.1681325924.git.jpoimboe@kernel.org	2023-04-14 16:08:29 +02:00
Josh Poimboeuf	246b2c8548	objtool: Add WARN_INSN() It's easier to use and also gives easy access to the instruction's containing function, which is useful for printing that function's symbol. It will also be useful in the future for rate-limiting and disassembly of warned functions. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/2eaa3155c90fba683d8723599f279c46025b75f3.1681325924.git.jpoimboe@kernel.org	2023-04-14 16:08:28 +02:00
Josh Poimboeuf	7f530fba11	objtool: Add stackleak instrumentation to uaccess safe list If a function has a large stack frame, the stackleak plugin adds a call to stackleak_track_stack() after the prologue. This function may be called in uaccess-enabled code. Add it to the uaccess safe list. Fixes the following warning: vmlinux.o: warning: objtool: kasan_report+0x12: call to stackleak_track_stack() with UACCESS enabled Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/42e9b487ef89e9b237fd5220ad1c7cf1a2ad7eb8.1681320562.git.jpoimboe@kernel.org	2023-04-14 16:08:27 +02:00
Josh Poimboeuf	e18398e80c	Revert "objtool: Support addition to set CFA base" Commit `468af56a7b` ("objtool: Support addition to set CFA base") was added as a preparatory patch for arm64 support, but that support never came. It triggers a false positive warning on x86, so just revert it for now. Fixes the following warning: vmlinux.o: warning: objtool: cdce925_regmap_i2c_write+0xdb: stack state mismatch: cfa1=4+120 cfa2=5+40 Fixes: `468af56a7b` ("objtool: Support addition to set CFA base") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/oe-kbuild-all/202304080538.j5G6h1AB-lkp@intel.com/	2023-04-14 16:08:27 +02:00
Rahul Rameshbabu	85a4abed15	tools: ynl: Rename ethtool to ethtool.py Make it explicit that this tool is not a drop-in replacement for ethtool. This tool is intended for testing ethtool functionality implemented in the kernel and should use a name that differentiates it from the ethtool utility. Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Link: https://lore.kernel.org/r/20230413012252.184434-2-rrameshbabu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-13 22:18:29 -07:00
Rahul Rameshbabu	3ea31e6664	tools: ynl: Remove absolute paths to yaml files from ethtool testing tool Absolute paths for the spec and schema files make the ethtool testing tool unusable with freshly checked-out source trees. Replace absolute paths with relative paths for both files in the Documentation/ directory. Issue seen before the change Traceback (most recent call last): File "/home/binary-eater/Documents/mlx/linux/tools/net/ynl/./ethtool", line 424, in <module> main() File "/home/binary-eater/Documents/mlx/linux/tools/net/ynl/./ethtool", line 158, in main ynl = YnlFamily(spec, schema) File "/home/binary-eater/Documents/mlx/linux/tools/net/ynl/lib/ynl.py", line 342, in __init__ super().__init__(def_path, schema) File "/home/binary-eater/Documents/mlx/linux/tools/net/ynl/lib/nlspec.py", line 333, in __init__ with open(spec_path, "r") as stream: FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/google/home/sdf/src/linux/Documentation/netlink/specs/ethtool.yaml' Fixes: `f3d07b02b2` ("tools: ynl: ethtool testing tool") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Link: https://lore.kernel.org/r/20230413012252.184434-1-rrameshbabu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-13 22:18:29 -07:00
Alexei Starovoitov	75860b5201	selftests/bpf: Workaround for older vm_sockets.h. Some distros ship with older vm_sockets.h that doesn't have VMADDR_CID_LOCAL which causes selftests build to fail: /tmp/work/bpf/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c:261:18: error: ‘VMADDR_CID_LOCAL’ undeclared (first use in this function); did you mean ‘VMADDR_CID_HOST’? 261 \| addr->svm_cid = VMADDR_CID_LOCAL; \| ^~~~~~~~~~~~~~~~ \| VMADDR_CID_HOST Workaround this issue by defining it on demand. Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-13 19:54:17 -07:00
Alexei Starovoitov	c04135ab35	selftests/bpf: Fix merge conflict due to SYS() macro change. Fix merge conflict between bpf/bpf-next trees due to change of arguments in SYS() macro. Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-13 17:22:48 -07:00
Jakub Kicinski	c2865b1122	bpf-next-for-netdev -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZDhSiwAKCRDbK58LschI g8cbAQCH4xrquOeDmYyGXFQGchHZAIj++tKg8ABU4+hYeJtrlwEA6D4W6wjoSZRk mLSptZ9qro8yZA86BvyPvlBT1h9ELQA= =StAc -----END PGP SIGNATURE----- Daniel Borkmann says: ==================== pull-request: bpf-next 2023-04-13 We've added 260 non-merge commits during the last 36 day(s) which contain a total of 356 files changed, 21786 insertions(+), 11275 deletions(-). The main changes are: 1) Rework BPF verifier log behavior and implement it as a rotating log by default with the option to retain old-style fixed log behavior, from Andrii Nakryiko. 2) Adds support for using {FOU,GUE} encap with an ipip device operating in collect_md mode and add a set of BPF kfuncs for controlling encap params, from Christian Ehrig. 3) Allow BPF programs to detect at load time whether a particular kfunc exists or not, and also add support for this in light skeleton, from Alexei Starovoitov. 4) Optimize hashmap lookups when key size is multiple of 4, from Anton Protopopov. 5) Enable RCU semantics for task BPF kptrs and allow referenced kptr tasks to be stored in BPF maps, from David Vernet. 6) Add support for stashing local BPF kptr into a map value via bpf_kptr_xchg(). This is useful e.g. for rbtree node creation for new cgroups, from Dave Marchevsky. 7) Fix BTF handling of is_int_ptr to skip modifiers to work around tracing issues where a program cannot be attached, from Feng Zhou. 8) Migrate a big portion of test_verifier unit tests over to test_progs -a verifier_* via inline asm to ease {read,debug}ability, from Eduard Zingerman. 9) Several updates to the instruction-set.rst documentation which is subject to future IETF standardization (https://lwn.net/Articles/926882/), from Dave Thaler. 10) Fix BPF verifier in the __reg_bound_offset's 64->32 tnum sub-register known bits information propagation, from Daniel Borkmann. 11) Add skb bitfield compaction work related to BPF with the overall goal to make more of the sk_buff bits optional, from Jakub Kicinski. 12) BPF selftest cleanups for build id extraction which stand on its own from the upcoming integration work of build id into struct file object, from Jiri Olsa. 13) Add fixes and optimizations for xsk descriptor validation and several selftest improvements for xsk sockets, from Kal Conley. 14) Add BPF links for struct_ops and enable switching implementations of BPF TCP cong-ctls under a given name by replacing backing struct_ops map, from Kui-Feng Lee. 15) Remove a misleading BPF verifier env->bypass_spec_v1 check on variable offset stack read as earlier Spectre checks cover this, from Luis Gerhorst. 16) Fix issues in copy_from_user_nofault() for BPF and other tracers to resemble copy_from_user_nmi() from safety PoV, from Florian Lehner and Alexei Starovoitov. 17) Add --json-summary option to test_progs in order for CI tooling to ease parsing of test results, from Manu Bretelle. 18) Batch of improvements and refactoring to prep for upcoming bpf_local_storage conversion to bpf_mem_cache_{alloc,free} allocator, from Martin KaFai Lau. 19) Improve bpftool's visual program dump which produces the control flow graph in a DOT format by adding C source inline annotations, from Quentin Monnet. 20) Fix attaching fentry/fexit/fmod_ret/lsm to modules by extracting the module name from BTF of the target and searching kallsyms of the correct module, from Viktor Malik. 21) Improve BPF verifier handling of '<const> <cond> <non_const>' to better detect whether in particular jmp32 branches are taken, from Yonghong Song. 22) Allow BPF TCP cong-ctls to write app_limited of struct tcp_sock. A built-in cc or one from a kernel module is already able to write to app_limited, from Yixin Shen. Conflicts: Documentation/bpf/bpf_devel_QA.rst `b7abcd9c65` ("bpf, doc: Link to submitting-patches.rst for general patch submission info") `0f10f647f4` ("bpf, docs: Use internal linking for link to netdev subsystem doc") https://lore.kernel.org/all/20230307095812.236eb1be@canb.auug.org.au/ include/net/ip_tunnels.h `bc9d003dc4` ("ip_tunnel: Preserve pointer const in ip_tunnel_info_opts") `ac931d4cde` ("ipip,ip_tunnel,sit: Add FOU support for externally controlled ipip devices") https://lore.kernel.org/all/20230413161235.4093777-1-broonie@kernel.org/ net/bpf/test_run.c `e5995bc7e2` ("bpf, test_run: fix crashes due to XDP frame overwriting/corruption") `294635a816` ("bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES") https://lore.kernel.org/all/20230320102619.05b80a98@canb.auug.org.au/ ==================== Link: https://lore.kernel.org/r/20230413191525.7295-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-13 16:43:38 -07:00
Jakub Kicinski	800e68c44f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Conflicts: tools/testing/selftests/net/config `62199e3f16` ("selftests: net: Add VXLAN MDB test") `3a0385be13` ("selftests: add the missing CONFIG_IP_SCTP in net config") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-13 16:04:28 -07:00
Linus Torvalds	829cca4d17	Including fixes from bpf, and bluetooth. Not all that quiet given spring celebrations, but "current" fixes are thinning out, which is encouraging. One outstanding regression in the mlx5 driver when using old FW, not blocking but we're pushing for a fix. Current release - new code bugs: - eth: enetc: workaround for unresponsive pMAC after receiving express traffic Previous releases - regressions: - rtnetlink: restore RTM_NEW/DELLINK notification behavior, keep the pid/seq fields 0 for backward compatibility Previous releases - always broken: - sctp: fix a potential overflow in sctp_ifwdtsn_skip - mptcp: - use mptcp_schedule_work instead of open-coding it and make the worker check stricter, to avoid scheduling work on closed sockets - fix NULL pointer dereference on fastopen early fallback - skbuff: fix memory corruption due to a race between skb coalescing and releasing clones confusing page_pool reference counting - bonding: fix neighbor solicitation validation on backup slaves - bpf: tcp: use sock_gen_put instead of sock_put in bpf_iter_tcp - bpf: arm64: fixed a BTI error on returning to patched function - openvswitch: fix race on port output leading to inf loop - sfp: initialize sfp->i2c_block_size at sfp allocation to avoid returning a different errno than expected - phy: nxp-c45-tja11xx: unregister PTP, purge queues on remove - Bluetooth: fix printing errors if LE Connection times out - Bluetooth: assorted UaF, deadlock and data race fixes - eth: macb: fix memory corruption in extended buffer descriptor mode Misc: - adjust the XDP Rx flow hash API to also include the protocol layers over which the hash was computed Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmQ4ZrsACgkQMUZtbf5S IruWUQ/9F+HlnHf3Sv08zGlnV++vLaJ/Ld8C2YNYNxRwuoJvcCyikQ/ZfUKdKAoS kCf0XqFD2SMl8wHpCQBhK4uXvKBdBMx6L6wEp7dbDciGl/+5yihe9opBBXKekWbB ULRZcZE7RACri/QsXQhD7Y8p530xnYWQXO8ZMjR3vOAWxplJtBDNDnXi7hqtxQpW Vzwl1ntvD1msmxhvy0UZrgeesL8R3UckFvqYEqnINeyd8E8HB1dAOg899/ZPUbjA UoEw5VsSBSr9DE7+Fs6uD8trBxQ1CrdRVJjhRhwivHk8/Ro7dIzjcVV30ei3wucz 0RiNvCqypkeLeRrcVlSk8lR5r9FBGvhDMFbcGM8lHnxSc0WB+Sj+2iup4fpTE8/p VUIvhhzuBuXU4Sc022pm6BL5DBSKdnJRquFq6XCTwnQM6v7fvzu1yWNXsQom8Nbi 1/ZcFdj27FHwMpU0JPZ4PFxT7Ta830UyulVZuyWA+zEzlN7DvW3O7bGQC72GEuID Xc58D4kVtywzbntFmUjuhXCD/6vvD5WW5orLpMWg5SH9F14prt3C9OFSpTwTTq+W uPBEslhnhhCPecTNo2iFPLX3bN67n8KDVUWua1AHaqmcK7QFGs0njJGGLpFdHMll SuNgrNrtQE9EHH8XL6VbSD2zf35ZfoKVg6qvL3oeLzXkGkZrnls= =W+J2 -----END PGP SIGNATURE----- Merge tag 'net-6.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, and bluetooth. Not all that quiet given spring celebrations, but "current" fixes are thinning out, which is encouraging. One outstanding regression in the mlx5 driver when using old FW, not blocking but we're pushing for a fix. Current release - new code bugs: - eth: enetc: workaround for unresponsive pMAC after receiving express traffic Previous releases - regressions: - rtnetlink: restore RTM_NEW/DELLINK notification behavior, keep the pid/seq fields 0 for backward compatibility Previous releases - always broken: - sctp: fix a potential overflow in sctp_ifwdtsn_skip - mptcp: - use mptcp_schedule_work instead of open-coding it and make the worker check stricter, to avoid scheduling work on closed sockets - fix NULL pointer dereference on fastopen early fallback - skbuff: fix memory corruption due to a race between skb coalescing and releasing clones confusing page_pool reference counting - bonding: fix neighbor solicitation validation on backup slaves - bpf: tcp: use sock_gen_put instead of sock_put in bpf_iter_tcp - bpf: arm64: fixed a BTI error on returning to patched function - openvswitch: fix race on port output leading to inf loop - sfp: initialize sfp->i2c_block_size at sfp allocation to avoid returning a different errno than expected - phy: nxp-c45-tja11xx: unregister PTP, purge queues on remove - Bluetooth: fix printing errors if LE Connection times out - Bluetooth: assorted UaF, deadlock and data race fixes - eth: macb: fix memory corruption in extended buffer descriptor mode Misc: - adjust the XDP Rx flow hash API to also include the protocol layers over which the hash was computed" * tag 'net-6.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits) selftests/bpf: Adjust bpf_xdp_metadata_rx_hash for new arg mlx4: bpf_xdp_metadata_rx_hash add xdp rss hash type veth: bpf_xdp_metadata_rx_hash add xdp rss hash type mlx5: bpf_xdp_metadata_rx_hash add xdp rss hash type xdp: rss hash types representation selftests/bpf: xdp_hw_metadata remove bpf_printk and add counters skbuff: Fix a race between coalescing and releasing SKBs net: macb: fix a memory corruption in extended buffer descriptor mode selftests: add the missing CONFIG_IP_SCTP in net config udp6: fix potential access to stale information selftests: openvswitch: adjust datapath NL message declaration selftests: mptcp: userspace pm: uniform verify events mptcp: fix NULL pointer dereference on fastopen early fallback mptcp: stricter state check in mptcp_worker mptcp: use mptcp_schedule_work instead of open-coding it net: enetc: workaround for unresponsive pMAC after receiving express traffic sctp: fix a potential overflow in sctp_ifwdtsn_skip net: qrtr: Fix an uninit variable access bug in qrtr_tx_resume() rtnetlink: Restore RTM_NEW/DELLINK notification behavior net: ti/cpsw: Add explicit platform_device.h and of_platform.h includes ...	2023-04-13 15:33:04 -07:00
Markus Elfring	c160118a90	perf map: Delete two variable initialisations before null pointer checks in sort__sym_from_cmp() Addresses of two data structure members were determined before corresponding null pointer checks in the implementation of the function “sort__sym_from_cmp”. Thus avoid the risk for undefined behaviour by removing extra initialisations for the local variables “from_l” and “from_r” (also because they were already reassigned with the same value behind this pointer check). This issue was detected by using the Coccinelle software. Fixes: `1b9e97a2a9` ("perf tools: Fix report -F symbol_from for data without branch info") Signed-off-by: <elfring@users.sourceforge.net> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/cocci/54a21fea-64e3-de67-82ef-d61b90ffad05@web.de/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 19:01:51 -03:00
Ian Rogers	ee31f6fea6	perf vendor events intel: Fix uncore topics for tigerlake Move events from 'uncore-other' topic classification to interconnect. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230413132949.3487664-22-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:56:18 -03:00
Ian Rogers	2bb848f820	perf vendor events intel: Fix uncore topics for snowridgex Remove 'uncore-other' topic classification, move to cache, interconnect and io. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20230413132949.3487664-21-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:56:03 -03:00
Ian Rogers	748d5cf719	perf vendor events intel: Fix uncore topics for skylakex Remove 'uncore-other' topic classification, move to cache, interconnect, io and memory. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20230413132949.3487664-20-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:55:23 -03:00
Ian Rogers	9a8b303688	perf vendor events intel: Fix uncore topics for skylake Move events from 'uncore-other' topic classification to cache and interconnect. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230413132949.3487664-19-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:55:19 -03:00
Ian Rogers	f58468a815	perf vendor events intel: Fix uncore topics for sandybridge Remove 'uncore-other' topic classification, move to cache and interconnect. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230413132949.3487664-18-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:55:17 -03:00
Ian Rogers	6c3566c594	perf vendor events intel: Fix uncore topics for knightslanding Remove 'uncore-other' topic classification, move to cache, io and memory. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230413132949.3487664-17-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:52:34 -03:00
Ian Rogers	05c74de4ec	perf vendor events intel: Fix uncore topics for jaketown Remove 'uncore-other' topic classification, move to cache, interconnect and io. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230413132949.3487664-16-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:52:10 -03:00
Ian Rogers	14b4c54485	perf vendor events intel: Fix uncore topics for ivytown Remove 'uncore-other' topic classification, move to cache, interconnect and io. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20230413132949.3487664-15-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:51:46 -03:00
Ian Rogers	c2f38d3b95	perf vendor events intel: Fix uncore topics for ivybridge Remove 'uncore-other' topic classification, move to cache and interconnect. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230413132949.3487664-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:51:27 -03:00
Ian Rogers	f42a7d02b7	perf vendor events intel: Fix uncore topics for icelakex Remove 'uncore-other' topic classification, move to cache, interconnect and io. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20230413132949.3487664-13-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:51:01 -03:00
Ian Rogers	bc4a245a80	perf vendor events intel: Fix uncore topics for icelake Move events from 'uncore-other' topic classification to interconnect. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230413132949.3487664-12-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:50:37 -03:00
Ian Rogers	579c047215	perf vendor events intel: Fix uncore topics for haswellx Remove 'uncore-other' topic classification, move to cache, interconnect and io. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20230413132949.3487664-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:49:56 -03:00
Ian Rogers	6910f7bac2	perf vendor events intel: Fix uncore topics for haswell Move events from 'uncore-other' topic classification to cache and interconnect. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230413132949.3487664-10-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:49:27 -03:00
Ian Rogers	b3eb533ca5	perf vendor events intel: Fix uncore topics for cascadelakex Remove 'uncore-other' topic classification, move to cache, interconnect, io and memory. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20230413132949.3487664-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:49:06 -03:00
Ian Rogers	c9f485c63d	perf vendor events intel: Fix uncore topics for broadwellx Remove 'uncore-other' topic classification, move to cache, interconnect and io. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20230413132949.3487664-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:48:45 -03:00
Ian Rogers	55b7bcef86	perf vendor events intel: Fix uncore topics for broadwellde Remove 'uncore-other' topic classification, move to cache, interconnect and io. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230413132949.3487664-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:47:58 -03:00
Ian Rogers	141825578a	perf vendor events intel: Fix uncore topics for broadwell Reduce the number of 'uncore-other' topic classifications, move to cache and interconnect. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230413132949.3487664-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:47:45 -03:00
Ian Rogers	759e81507e	perf vendor events intel: Fix uncore topics for alderlake Move events from 'uncore-other' topic classification to interconnect. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230413132949.3487664-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:47:33 -03:00
Ian Rogers	98806c08f9	perf vendor events intel: Add sierraforest Add v1.00 from: https://github.com/intel/perfmon/pull/69 Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230413132949.3487664-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:47:20 -03:00
Ian Rogers	dbe9d887d3	perf vendor events intel: Add grandridge Add v1.00 from: https://github.com/intel/perfmon/pull/69 Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230413132949.3487664-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:47:07 -03:00
Ian Rogers	54f5de6f29	perf vendor events intel: Update sapphirerapids to v1.12 Summary from https://github.com/intel/perfmon/pull/68 - Numerous uncore event additions and changes. - Description updates for core events XQ.FULL_CYCLES and MISC2_RETIRED.LFENCE. - Update ARITH.IDIV_ACTIVE counter mask. This change also gets rid of uncore-other as a topic, derived from the file name, breaking it apart in to more specific topics. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20230413132949.3487664-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-04-13 18:45:26 -03:00
Jesper Dangaard Brouer	0f26b74e7d	selftests/bpf: Adjust bpf_xdp_metadata_rx_hash for new arg Update BPF selftests to use the new RSS type argument for kfunc bpf_xdp_metadata_rx_hash. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/168132894068.340624.8914711185697163690.stgit@firesoul Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-13 11:15:11 -07:00
Jesper Dangaard Brouer	e8163b98d9	selftests/bpf: xdp_hw_metadata remove bpf_printk and add counters The tool xdp_hw_metadata can be used by driver developers implementing XDP-hints metadata kfuncs. Remove all bpf_printk calls, as the tool already transfers all the XDP-hints related information via metadata area to AF_XDP userspace process. Add counters for providing remaining information about failure and skipped packet events. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/168132891533.340624.7313781245316405141.stgit@firesoul Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-13 11:15:10 -07:00
Shaopeng Tan	91db4fd901	selftests/resctrl: Remove duplicate codes that clear each test result file Before exiting each test function(run_cmt/cat/mbm/mba_test()), test results("ok","not ok") are printed by ksft_test_result() and then temporary result files are cleaned by function cmt/cat/mbm/mba_test_cleanup(). However, before running ksft_test_result(), function cmt/cat/mbm/mba_test_cleanup() has been run in each test function as follows: cmt_resctrl_val() cat_perf_miss_val() mba_schemata_change() mbm_bw_change() Remove duplicate codes that clear each test result file, while ensuring cleanup properly even when errors occur in each test. Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2023-04-13 11:34:29 -06:00
Shaopeng Tan	73c55fa5ab	selftests/resctrl: Commonize the signal handler register/unregister for all tests After creating a child process with fork() in CAT test, if a signal such as SIGINT is received, the parent process will be terminated immediately, and therefore the child process will not be killed and also resctrlfs is not unmounted. There is a signal handler registered in CMT/MBM/MBA tests, which kills child process, unmount resctrlfs, cleanups result files, etc., if a signal such as SIGINT is received. Commonize the signal handler registered for CMT/MBM/MBA tests and reuse it in CAT. To reuse the signal handler to kill child process use global bm_pid instead of local bm_pid. Also, since the MBA/MBA/CMT/CAT are run in order, unregister the signal handler at the end of each test so that the signal handler cannot be inherited by other tests. Reviewed-by: Ilpo Jarvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2023-04-13 11:34:23 -06:00
Shaopeng Tan	39e34ddc38	selftests/resctrl: Cleanup properly when an error occurs in CAT test After creating a child process with fork() in CAT test, if an error occurs when parent process runs cat_val() or check_results(), the child process will not be killed and also resctrlfs is not unmounted. Also if an error occurs when child process runs cat_val() or check_results(), the parent process will wait for the pipe message from the child process which will never be sent by the child process and the parent process cannot proceed to unmount resctrlfs. Synchronize the exits between the parent and child. An error could occur whether in parent process or child process. The parent process always kills the child process and runs umount_resctrlfs(). The child process always waits to be killed by the parent process. Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2023-04-13 11:34:17 -06:00
Shaopeng Tan	a080b6e74b	selftests/resctrl: Flush stdout file buffer before executing fork() When a process has buffered output, a child process created by fork() will also copy buffered output. When using kselftest framework, the output (resctrl test result message) will be printed multiple times. Add fflush() to flush out the buffered output before executing fork(). Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2023-04-13 11:34:12 -06:00
Shaopeng Tan	1e359b6a94	selftests/resctrl: Return MBA check result and make it to output message Since MBA check result is not returned, the MBA test result message is always output as "ok" regardless of whether the MBA check result is true or false. Make output message to be "not ok" if MBA check result is failed. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2023-04-13 11:34:05 -06:00
Shaopeng Tan	42e3b093eb	selftests/resctrl: Fix set up schemata with 100% allocation on first run in MBM test There is a comment "Set up shemata with 100% allocation on the first run" in function mbm_setup(), but there is an increment bug and the condition "num_of_runs == 0" will never be met and write_schemata() will never be called to set schemata to 100%. Even if write_schemata() is called in MBM test, since it is not supported for MBM test it does not set the schemata. This is currently fine because resctrl_val_parm->mum_resctrlfs is always 1 and umount/mount will be run in each test to set the schemata to 100%. To support the usage when MBM test does not unmount/remount resctrl filesystem before the test starts, fix to call write_schemata() and set schemata properly when the function is called for the first time. Also, remove static local variable 'num_of_runs' because this is not needed as there is resctrl_val_param->num_of_runs which should be used instead like in cat_setup(). Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2023-04-13 11:33:59 -06:00
Peter Newman	c2b1790747	selftests/resctrl: Use correct exit code when tests fail Use ksft_finished() after running tests so that resctrl_tests doesn't return exit code 0 when tests fail. Consequently, report the MBA and MBM tests as skipped when running on non-Intel hardware, otherwise resctrl_tests will exit with a failure code. Signed-off-by: Peter Newman <peternewman@google.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2023-04-13 11:33:00 -06:00
Xin Long	3a0385be13	selftests: add the missing CONFIG_IP_SCTP in net config The selftest sctp_vrf needs CONFIG_IP_SCTP set in config when building the kernel, so add it. Fixes: `a61bd7b9fe` ("selftests: add a selftest for sctp vrf") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Link: https://lore.kernel.org/r/61dddebc4d2dd98fe7fb145e24d4b2430e42b572.1681312386.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-13 10:04:55 -07:00
Aaron Conole	306dc21361	selftests: openvswitch: adjust datapath NL message declaration The netlink message for creating a new datapath takes an array of ports for the PID creation. This shouldn't cause much issue but correct it for future cases where we need to do decode of datapath information that could include the per-cpu PID map. Fixes: `25f16c873f` ("selftests: add openvswitch selftest suite") Signed-off-by: Aaron Conole <aconole@redhat.com> Link: https://lore.kernel.org/r/20230412115828.3991806-1-aconole@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-13 10:01:23 -07:00
Matthieu Baerts	711ae788cb	selftests: mptcp: userspace pm: uniform verify events Simply adding a "sleep" before checking something is usually not a good idea because the time that has been picked can not be enough or too much. The best is to wait for events with a timeout. In this selftest, 'sleep 0.5' is used more than 40 times. It is always used before calling a 'verify_' function except for this verify_listener_events which has been added later. At the end, using all these 'sleep 0.5' seems to work: the slow CIs don't complain so far. Also because it doesn't take too much time, we can just add two more 'sleep 0.5' to uniform what is done before calling a 'verify_' function. For the same reasons, we can also delay a bigger refactoring to replace all these 'sleep 0.5' by functions waiting for events instead of waiting for a fix time and hope for the best. Fixes: `6c73008aa3` ("selftests: mptcp: listener test for userspace PM") Cc: stable@vger.kernel.org Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-04-13 09:58:55 -07:00
Andrii Nakryiko	4099be372f	selftests/bpf: Fix compiler warnings in bpf_testmod for kfuncs Add -Wmissing-prototypes ignore in bpf_testmod.c, similarly to what we do in kernel code proper. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/oe-kbuild-all/202304080951.l14IDv3n-lkp@intel.com/ Link: https://lore.kernel.org/bpf/20230412034647.3968143-1-andrii@kernel.org	2023-04-13 14:54:45 +02:00
Andrii Nakryiko	ee5059a64d	selftests/bpf: Remove stand-along test_verifier_log test binary test_prog's prog_tests/verifier_log.c is superseding test_verifier_log stand-alone test. It cover same checks and adds more, and is also integrated into test_progs test runner. Just remove test_verifier_log.c. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230412170655.1866831-1-andrii@kernel.org	2023-04-13 14:34:51 +02:00
Song Liu	2995f9a8d4	selftests/bpf: Keep the loop in bpf_testmod_loop_test Some compilers (for example clang-15) optimize bpf_testmod_loop_test and remove the loop: gcc version (gdb) disassemble bpf_testmod_loop_test Dump of assembler code for function bpf_testmod_loop_test: 0x0000000000000570 <+0>: callq 0x575 <bpf_testmod_loop_test+5> 0x0000000000000575 <+5>: xor %eax,%eax 0x0000000000000577 <+7>: test %edi,%edi 0x0000000000000579 <+9>: jle 0x587 <bpf_testmod_loop_test+23> 0x000000000000057b <+11>: xor %edx,%edx 0x000000000000057d <+13>: add %edx,%eax 0x000000000000057f <+15>: add $0x1,%edx 0x0000000000000582 <+18>: cmp %edx,%edi 0x0000000000000584 <+20>: jne 0x57d <bpf_testmod_loop_test+13> 0x0000000000000586 <+22>: retq 0x0000000000000587 <+23>: retq clang-15 version (gdb) disassemble bpf_testmod_loop_test Dump of assembler code for function bpf_testmod_loop_test: 0x0000000000000450 <+0>: nopl 0x0(%rax,%rax,1) 0x0000000000000455 <+5>: test %edi,%edi 0x0000000000000457 <+7>: jle 0x46b <bpf_testmod_loop_test+27> 0x0000000000000459 <+9>: lea -0x1(%rdi),%eax 0x000000000000045c <+12>: lea -0x2(%rdi),%ecx 0x000000000000045f <+15>: imul %rax,%rcx 0x0000000000000463 <+19>: shr %rcx 0x0000000000000466 <+22>: lea -0x1(%rdi,%rcx,1),%eax 0x000000000000046a <+26>: retq 0x000000000000046b <+27>: xor %eax,%eax 0x000000000000046d <+29>: retq Note: The jne instruction is removed in clang-15 version. Force the compile to keep the loop by making sum volatile. Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230412210423.900851-4-song@kernel.org	2023-04-13 14:32:05 +02:00
Song Liu	c1e07a80cf	selftests/bpf: Fix leaked bpf_link in get_stackid_cannot_attach skel->links.oncpu is leaked in one case. This causes test perf_branches fails when it runs after get_stackid_cannot_attach: ./test_progs -t get_stackid_cannot_attach,perf_branches 84 get_stackid_cannot_attach:OK test_perf_branches_common:PASS:test_perf_branches_load 0 nsec test_perf_branches_common:PASS:attach_perf_event 0 nsec test_perf_branches_common:PASS:set_affinity 0 nsec check_good_sample:FAIL:output not valid no valid sample from prog 146/1 perf_branches/perf_branches_hw:FAIL 146/2 perf_branches/perf_branches_no_hw:OK 146 perf_branches:FAIL All error logs: test_perf_branches_common:PASS:test_perf_branches_load 0 nsec test_perf_branches_common:PASS:attach_perf_event 0 nsec test_perf_branches_common:PASS:set_affinity 0 nsec check_good_sample:FAIL:output not valid no valid sample from prog 146/1 perf_branches/perf_branches_hw:FAIL 146 perf_branches:FAIL Summary: 1/1 PASSED, 0 SKIPPED, 1 FAILED Fix this by adding the missing bpf_link__destroy(). Fixes: `346938e938` ("selftests/bpf: Add get_stackid_cannot_attach") Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230412210423.900851-3-song@kernel.org	2023-04-13 14:32:05 +02:00
Song Liu	de6d014a09	selftests/bpf: Use read_perf_max_sample_freq() in perf_event_stackmap Currently, perf_event sample period in perf_event_stackmap is set too low that the test fails randomly. Fix this by using the max sample frequency, from read_perf_max_sample_freq(). Move read_perf_max_sample_freq() to testing_helpers.c. Replace the CHECK() with if-printf, as CHECK is not available in testing_helpers.c. Fixes: `1da4864c2b` ("selftests/bpf: Add callchain_stackid") Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230412210423.900851-2-song@kernel.org	2023-04-13 14:32:04 +02:00
Lorenz Bauer	5a674611d1	selftests/bpf: Fix use of uninitialized op_name in log tests One of the test assertions uses an uninitialized op_name, which leads to some headscratching if it fails. Use a string constant instead. Fixes: `b1a7a480a1` ("selftests/bpf: Add fixed vs rotating verifier log tests") Signed-off-by: Lorenz Bauer <lmb@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230413094740.18041-1-lmb@isovalent.com	2023-04-13 14:17:02 +02:00
Christian Ehrig	d9688f898c	selftests/bpf: Test FOU kfuncs for externally controlled ipip devices Add tests for FOU and GUE encapsulation via the bpf_skb_{set,get}_fou_encap kfuncs, using ipip devices in collect-metadata mode. These tests make sure that we can successfully set and obtain FOU and GUE encap parameters using ingress / egress BPF tc-hooks. Signed-off-by: Christian Ehrig <cehrig@cloudflare.com> Link: https://lore.kernel.org/r/040193566ddbdb0b53eb359f7ac7bbd316f338b5.1680874078.git.cehrig@cloudflare.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-12 16:40:39 -07:00

... 2 3 4 5 6 ...

35487 Commits