Currently, KUnit will report SKIPPED tests as having failed if one uses
--json.
Add the missing if statement to set the appropriate status ("SKIP").
See https://api.kernelci.org/schema-test-case.html:
"status": {
"type": "string",
"description": "The status of the execution of this test case",
"enum": ["PASS", "FAIL", "SKIP", "ERROR"],
"default": "PASS"
},
with this, we now can properly produce all four of the statuses.
Fixes: 5acaf6031f ("kunit: tool: Support skipped tests in kunit_tool")
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Adding tests for get_func_[arg|ret|arg_cnt] helpers.
Using these helpers in fentry/fexit/fmod_ret programs.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211208193245.172141-6-jolsa@kernel.org
Adding verifier test for accessing int pointer argument in
tracing programs.
The test program loads 2nd argument of bpf_modify_return_test
function which is int pointer and checks that verifier allows
that.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211208193245.172141-3-jolsa@kernel.org
The recent GRE selftests defined NUM_NETIFS=10. If the users copy
forwarding.config.sample to forwarding.config directly, they will get
error "Command line is not complete" when run the GRE tests, because
create_netif_veth() failed with no interface name defined.
Fix it by extending the NETIFS with p9 and p10.
Fixes: 2800f24854 ("selftests: forwarding: Test multipath hashing on inner IP pkts for GRE tunnel")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The test is currently run on a single host with private addresses,
either over veth or by setting a nic in loopback mode with macvlan.
Support running between two real devices. Allow overriding addresses.
Also cut timeout to fail faster on error and explicitly log success.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tiny fix. Option -u ("use udp") does not take an argument.
It can cause the next argument to silently be ignored.
Fixes: 5ebfb4cc30 ("selftests/net: toeplitz test")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bpf_create_map_xattr() call was reintroduced after merging bpf tree into
bpf-next tree. Convert the last instance into bpf_map_create() call.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211212191341.2529573-1-andrii@kernel.org
IPv6 allows binding a socket to a device then binding to an address
not on the device (__inet6_bind -> ipv6_chk_addr with strict flag
not set). Update the bind tests to reflect legacy behavior.
Fixes: 34d0302ab8 ("selftests: Add ipv6 address bind tests to fcnal-test")
Reported-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit referenced below added negative socket bind tests for VRF. The
socket binds should fail since the address to bind to is in a VRF yet
the socket is not bound to the VRF or a device within it. Update the
expected return code to check for 1 (bind failure) so the test passes
when the bind fails as expected. Add a 'show_hint' comment to explain
why the bind is expected to fail.
Fixes: 75b2b2b3db ("selftests: Add ipv4 address bind tests to fcnal-test")
Reported-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit referenced below added configuration in the default VRF that
duplicates a VRF to check MD5 passwords are properly used and fail
when expected. That config should not be added all the time as it
can cause tests to pass that should not (by matching on default VRF
setup when it should not). Move the duplicate setup to a function
that is only called for the MD5 tests and add a cleanup function
to remove it after the MD5 tests.
Fixes: 5cad8bce26 ("fcnal-test: Add TCP MD5 tests for VRF")
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If any sub-test in this icmp_redirect.sh is failing but not expected
to fail. The script will complain:
./icmp_redirect.sh: line 72: [: 1: unary operator expected
This is because when the sub-test is not expected to fail, we won't
pass any value for the xfail local variable in log_test() and thus
it's empty. Fix this by passing 0 as the 4th variable to log_test()
for non-xfail cases.
v2: added fixes tag
Fixes: 0a36a75c68 ("selftests: icmp_redirect: support expected failures")
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As mentined by Takashi Sakamoto, the system-wide alsa-lib configuration
may override the standard device declarations. This patch use the private
alsa-lib configuration to set the predictable environment.
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20211208095209.1772296-1-perex@perex.cz
[Restructure version test to keep the preprocessor happy -- broonie]
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211210185410.740009-4-broonie@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The volatile attribute of control element means that the hardware can
voluntarily change the state of control element independent of any
operation by software. ALSA control core necessarily sends notification
to userspace subscribers for any change from userspace application, while
it doesn't for the hardware's voluntary change.
This commit adds optimization for the attribute. Even if read value is
different from written value, the test reports success as long as the
target control element has the attribute. On the other hand, the
difference is itself reported for developers' convenience.
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Link: https://lore.kernel.org/r/Ya7TAHdMe9i41bsC@workstation
[Fix comment style as suggested by Shuah -- broonie]
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211210185410.740009-3-broonie@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Add a basic test for the mixer control interface. For every control on
every sound card in the system it checks that it can read and write the
default value where the control supports that and for writeable controls
attempts to write all valid values, restoring the default values after
each test to minimise disruption for users.
There are quite a few areas for improvement - currently no coverage of the
generation of notifications, several of the control types don't have any
coverage for the values and we don't have any testing of error handling
when we attempt to write out of range values - but this provides some basic
coverage.
This is added as a kselftest since unlike other ALSA test programs it does
not require either physical setup of the device or interactive monitoring
by users and kselftest is one of the test suites that is frequently run by
people doing general automated testing so should increase coverage. It is
written in terms of alsa-lib since tinyalsa is not generally packaged for
distributions which makes things harder for general users interested in
kselftest as a whole but it will be a barrier to people with Android.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20211210185410.740009-2-broonie@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Four test cases are added:
(1) ensure the return value is expected
(2) ensure no const string size is rejected
(3) ensure writable target is rejected
(4) ensure no null-terminated target is rejected
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211210141652.877186-5-houtao1@huawei.com
Fix checkpatch error: "ERROR: Bad function definition - void foo()
should probably be void foo(void)". Most replacements are done by
the following command:
sed -i 's#\([a-z]\)()$#\1(void)#g' testing/selftests/bpf/benchs/*.c
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211210141652.877186-3-houtao1@huawei.com
Merge misc fixes from Andrew Morton:
"21 patches.
Subsystems affected by this patch series: MAINTAINERS, mailmap, and mm
(mlock, pagecache, damon, slub, memcg, hugetlb, and pagecache)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (21 commits)
mm: bdi: initialize bdi_min_ratio when bdi is unregistered
hugetlbfs: fix issue of preallocation of gigantic pages can't work
mm/memcg: relocate mod_objcg_mlstate(), get_obj_stock() and put_obj_stock()
mm/slub: fix endianness bug for alloc/free_traces attributes
selftests/damon: split test cases
selftests/damon: test debugfs file reads/writes with huge count
selftests/damon: test wrong DAMOS condition ranges input
selftests/damon: test DAMON enabling with empty target_ids case
selftests/damon: skip test if DAMON is running
mm/damon/vaddr-test: remove unnecessary variables
mm/damon/vaddr-test: split a test function having >1024 bytes frame size
mm/damon/vaddr: remove an unnecessary warning message
mm/damon/core: remove unnecessary error messages
mm/damon/dbgfs: remove an unnecessary error message
mm/damon/core: use better timer mechanisms selection threshold
mm/damon/core: fix fake load reports due to uninterruptible sleeps
timers: implement usleep_idle_range()
filemap: remove PageHWPoison check from next_uptodate_page()
mailmap: update email address for Guo Ren
MAINTAINERS: update kdump maintainers
...
Currently, the single test program, debugfs.sh, contains all test cases
for DAMON. When one of the cases fails, finding which case is failed
from the test log is not so easy, and all remaining tests will be
skipped. To improve the situation, this commit splits the single
program into small test programs having their own names.
Link: https://lkml.kernel.org/r/20211201150440.1088-12-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
DAMON debugfs interface users were able to trigger warning by writing
some files with arbitrarily large 'count' parameter. The issue is fixed
with commit db7a347b26 ("mm/damon/dbgfs: use '__GFP_NOWARN' for
user-specified size buffer allocation"). This commit adds a test case
for the issue in DAMON selftests to avoid future regressions.
Link: https://lkml.kernel.org/r/20211201150440.1088-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
DAMON debugfs didn't check empty targets when starting monitoring, and
the issue is fixed with commit b5ca3e83dd ("mm/damon/dbgfs: add
adaptive_targets list check before enable monitor_on"). To avoid future
regression, this commit adds a test case for that in DAMON selftests.
Link: https://lkml.kernel.org/r/20211201150440.1088-9-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Testing the DAMON debugfs files while DAMON is running makes no sense,
as any write to the debugfs files will fail. This commit makes the test
be skipped in this case.
Link: https://lkml.kernel.org/r/20211201150440.1088-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ARRAY_SIZE is defined in several selftests. Remove definitions from
individual test files and include header file for the define instead.
ARRAY_SIZE define is added in a separate patch to prepare for this
change.
Remove ARRAY_SIZE from vm tests and pickup the one defined in
kselftest.h.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
ARRAY_SIZE is defined in several selftests. Remove definitions from
individual test files and include header file for the define instead.
ARRAY_SIZE define is added in a separate patch to prepare for this
change.
Remove ARRAY_SIZE from timens tests and pickup the one defined in
kselftest.h.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
ARRAY_SIZE is defined in several selftests. Remove definition from
individual test files and include header file for the define instead.
ARRAY_SIZE define is added in a separate patch to prepare for this
change.
Remove ARRAY_SIZE from adi-test and pickup the one defined in
kselftest.h.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
ARRAY_SIZE is defined in several selftests. Remove definitions from
individual test files and include header file for the define instead.
ARRAY_SIZE define is added in a separate patch to prepare for this
change.
Remove ARRAY_SIZE from seccomp_benchmark and pickup the one defined in
kselftest.h.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
ARRAY_SIZE is defined in several selftests. Remove definitions from
individual test files and include header file for the define instead.
ARRAY_SIZE define is added in a separate patch to prepare for this
change.
Remove ARRAY_SIZE from rseq tests and pickup the one defined in
kselftest.h.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
ARRAY_SIZE is defined in several selftests. Remove definitions from
individual test files and include header file for the define instead.
ARRAY_SIZE define is added in a separate patch to prepare for this
change.
Remove ARRAY_SIZE from net tests and pickup the one defined in
kselftest.h.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
ARRAY_SIZE is defined in several selftests. Remove definitions from
individual test files and include header file for the define instead.
ARRAY_SIZE define is added in a separate patch to prepare for this
change.
Remove ARRAY_SIZE from common.h and pickup the one defined in
kselftest.h.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
ARRAY_SIZE is defined in several selftests. Remove definitions from
individual test files and include header file for the define instead.
ARRAY_SIZE define is added in a separate patch to prepare for this
change.
Remove ARRAY_SIZE from ir_loopback.c and pickup the one defined
in kselftest.h.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
ARRAY_SIZE is defined in several selftests. Remove definitions from
individual test files and include header file for the define instead.
ARRAY_SIZE define is added in a separate patch to prepare for this
change.
Remove ARRAY_SIZE from close_range_test.c and pickup the one defined
in kselftest.h.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
ARRAY_SIZE is defined in several selftests. Remove definitions from
individual test files and include header file for the define instead.
ARRAY_SIZE define is added in a separate patch to prepare for this
change.
Remove ARRAY_SIZE from cgroup_util.h and pickup the one defined in
kselftest.h.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
ARRAY_SIZE is defined in several selftests. Remove definitions from
individual test files and include header file for the define instead.
ARRAY_SIZE define is added in a separate patch to prepare for this
change.
Remove ARRAY_SIZE from vec-syscfg.c and pickup the one defined in
kselftest.h.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
tools/include/linux/kernel.h and kselftest_harness.h are missing
ifndef guard around ARRAY_SIZE define. Fix them to avoid duplicate
define errors during compile when another file defines it. This
problem was found when compiling selftests that include a header
with ARRAY_SIZE define.
ARRAY_SIZE is defined in several selftests. There are about 25+
duplicate defines in various selftests source and header files.
Add ARRAY_SIZE to kselftest.h in preparation for removing duplicate
ARRAY_SIZE defines from individual test files.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Andrii Nakryiko says:
====================
bpf-next 2021-12-10 v2
We've added 115 non-merge commits during the last 26 day(s) which contain
a total of 182 files changed, 5747 insertions(+), 2564 deletions(-).
The main changes are:
1) Various samples fixes, from Alexander Lobakin.
2) BPF CO-RE support in kernel and light skeleton, from Alexei Starovoitov.
3) A batch of new unified APIs for libbpf, logging improvements, version
querying, etc. Also a batch of old deprecations for old APIs and various
bug fixes, in preparation for libbpf 1.0, from Andrii Nakryiko.
4) BPF documentation reorganization and improvements, from Christoph Hellwig
and Dave Tucker.
5) Support for declarative initialization of BPF_MAP_TYPE_PROG_ARRAY in
libbpf, from Hengqi Chen.
6) Verifier log fixes, from Hou Tao.
7) Runtime-bounded loops support with bpf_loop() helper, from Joanne Koong.
8) Extend branch record capturing to all platforms that support it,
from Kajol Jain.
9) Light skeleton codegen improvements, from Kumar Kartikeya Dwivedi.
10) bpftool doc-generating script improvements, from Quentin Monnet.
11) Two libbpf v0.6 bug fixes, from Shuyi Cheng and Vincent Minet.
12) Deprecation warning fix for perf/bpf_counter, from Song Liu.
13) MAX_TAIL_CALL_CNT unification and MIPS build fix for libbpf,
from Tiezhu Yang.
14) BTF_KING_TYPE_TAG follow-up fixes, from Yonghong Song.
15) Selftests fixes and improvements, from Ilya Leoshkevich, Jean-Philippe
Brucker, Jiri Olsa, Maxim Mikityanskiy, Tirthendu Sarkar, Yucong Sun,
and others.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (115 commits)
libbpf: Add "bool skipped" to struct bpf_map
libbpf: Fix typo in btf__dedup@LIBBPF_0.0.2 definition
bpftool: Switch bpf_object__load_xattr() to bpf_object__load()
selftests/bpf: Remove the only use of deprecated bpf_object__load_xattr()
selftests/bpf: Add test for libbpf's custom log_buf behavior
selftests/bpf: Replace all uses of bpf_load_btf() with bpf_btf_load()
libbpf: Deprecate bpf_object__load_xattr()
libbpf: Add per-program log buffer setter and getter
libbpf: Preserve kernel error code and remove kprobe prog type guessing
libbpf: Improve logging around BPF program loading
libbpf: Allow passing user log setting through bpf_object_open_opts
libbpf: Allow passing preallocated log_buf when loading BTF into kernel
libbpf: Add OPTS-based bpf_btf_load() API
libbpf: Fix bpf_prog_load() log_buf logic for log_level 0
samples/bpf: Remove unneeded variable
bpf: Remove redundant assignment to pointer t
selftests/bpf: Fix a compilation warning
perf/bpf_counter: Use bpf_map_create instead of bpf_create_map
samples: bpf: Fix 'unknown warning group' build warning on Clang
samples: bpf: Fix xdp_sample_user.o linking with Clang
...
====================
Link: https://lore.kernel.org/r/20211210234746.2100561-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When building selftests/cgroup: with clang the following error are seen:
clang -Wall -pthread test_memcontrol.c cgroup_util.c ../clone3/clone3_selftests.h -o .../builds/current/kselftest/cgroup/test_memcontrol
clang: error: cannot specify -o when generating multiple output files
make[3]: *** [../lib.mk:146: .../builds/current/kselftest/cgroup/test_memcontrol] Error 1
Rework to add the header files to LOCAL_HDRS before including ../lib.mk,
since the dependency is evaluated in '$(OUTPUT)/%:%.c $(LOCAL_HDRS)' in
file lib.mk.
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The coccinelle report
./tools/testing/selftests/move_mount_set_group/move_mount_set_group_test.c:225:18-23:
WARNING: conversion to bool not needed here
Relational and logical operators evaluate to bool,
explicit conversion is overly verbose and unneeded.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Yang Guang <yang.guang5@zte.com.cn>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The coccinelle report
./tools/testing/selftests/mount/unprivileged-remount-test.c:285:54-59:
WARNING: conversion to bool not needed here
./tools/testing/selftests/mount/unprivileged-remount-test.c:207:54-59:
WARNING: conversion to bool not needed here
Relational and logical operators evaluate to bool,
explicit conversion is overly verbose and unneeded.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Yang Guang <yang.guang5@zte.com.cn>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Test can fail either immediately when ASSERT() failed or at the
end if one or more EXPECT() was not met. The exact return code
is decided based on the number of successful ASSERT()s.
If test has no ASSERT()s, however, the return code will be 0,
as if the test did not fail. Start counting ASSERT()s from 1.
Fixes: 369130b631 ("selftests: Enhance kselftest_harness.h to print which assert failed")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Commit 32f6e5da83 ("selftests/ftrace: Add kprobe profile testcase")
added a new kprobes testcase, but has a description which does not
describe what the test case is doing and is duplicating the description
of another test case.
Therefore change the test case description, so it is unique and then
allows easily to tell which test case actually passed or failed.
Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Building selftests/clone3 with clang warns about enumeration not handled
in switch case:
clone3.c:54:10: warning: enumeration value 'CLONE3_ARGS_NO_TEST' not handled in switch [-Wswitch]
switch (test_mode) {
^
Add the missing switch case with a comment.
Fixes: 17a810699c ("selftests: add tests for clone3()")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Switch from bpf_object__load_xattr() to bpf_object__load() and
kernel_log_level in bpf_object_open_opts.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-12-andrii@kernel.org
Add a selftest that validates that per-program and per-object log_buf
overrides work as expected. Also test same logic for low-level
bpf_prog_load() and bpf_btf_load() APIs.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-11-andrii@kernel.org
Add missing "prog '%s': " prefixes in few places and use consistently
markers for beginning and end of program load logs. Here's an example of
log output:
libbpf: prog 'handler': BPF program load failed: Permission denied
libbpf: -- BEGIN PROG LOAD LOG ---
arg#0 reference type('UNKNOWN ') size cannot be determined: -22
; out1 = in1;
0: (18) r1 = 0xffffc9000cdcc000
2: (61) r1 = *(u32 *)(r1 +0)
...
81: (63) *(u32 *)(r4 +0) = r5
R1_w=map_value(id=0,off=16,ks=4,vs=20,imm=0) R4=map_value(id=0,off=400,ks=4,vs=16,imm=0)
invalid access to map value, value_size=16 off=400 size=4
R4 min value is outside of the allowed memory range
processed 63 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
-- END PROG LOAD LOG --
libbpf: failed to load program 'handler'
libbpf: failed to load object 'test_skeleton'
The entire verifier log, including BEGIN and END markers are now always
youtput during a single print callback call. This should make it much
easier to post-process or parse it, if necessary. It's not an explicit
API guarantee, but it can be reasonably expected to stay like that.
Also __bpf_object__open is renamed to bpf_object_open() as it's always
an adventure to find the exact function that implements bpf_object's
open phase, so drop the double underscored and use internal libbpf
naming convention.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-6-andrii@kernel.org
* Logic bugs in CR0 writes and Hyper-V hypercalls
* Don't use Enlightened MSR Bitmap for L3
* Remove user-triggerable WARN
Plus a few selftest fixes and a regression test for the
user-triggerable WARN.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGzZuIUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMD4Qf+Im7Q0XNRZGzK6x4Blu3ZZJSuIIkW
gEK5mDX/BWSYxoGhRN0IOkyf1Tx/A5qYwbZts87wZSvKONG2MuVzdeQ0mkDxgKc3
cYwvvIPxCKaW/dQLD2OKVlqdAv6YbeJiFURWXgszMkrcgHvw39H5Tn6ldi0B5nvg
Gvpj8LtbPDXGXab//Xrhia3+1F9TKOrcOG+obGC5G2mrGKTkG2+pi9L6LohvENhd
sOSWdpmvQTU4PeqGlhW8RCwcN+vpa+NasHT2i2tHcWZA9Lqp4P81+4ZyQLIBsRB3
psANG0c40EW+lfjFGbLL/6VR5kypxa6zy9RgX+QiRcj6C0+dJOgnwNY35A==
=cREz
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"More x86 fixes:
- Logic bugs in CR0 writes and Hyper-V hypercalls
- Don't use Enlightened MSR Bitmap for L3
- Remove user-triggerable WARN
Plus a few selftest fixes and a regression test for the
user-triggerable WARN"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
selftests: KVM: Add test to verify KVM doesn't explode on "bad" I/O
KVM: x86: Don't WARN if userspace mucks with RCX during string I/O exit
KVM: X86: Raise #GP when clearing CR0_PG in 64 bit mode
selftests: KVM: avoid failures due to reserved HyperTransport region
KVM: x86: Ignore sparse banks size for an "all CPUs", non-sparse IPI req
KVM: x86: Wait for IPIs to be delivered when handling Hyper-V TLB flush hypercall
KVM: x86: selftests: svm_int_ctl_test: fix intercept calculation
KVM: nVMX: Don't use Enlightened MSR Bitmap for L3
'sys/ioctl.h' included in 'mptcp_inq.c' is duplicated.
Reported-by: ZealRobot <zealci@zte.com.cn>
Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/r/20211210071424.425773-1-ye.guojin@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch adds tests for the verifier's tracking for spilled, <8B
registers. The first two test cases ensure the verifier doesn't
incorrectly prune states in case of <8B spill/fills. The last one simply
checks that a filled u64 register is marked unknown if the register
spilled in the same slack slot was less than 8B.
The map value access at the end of the first program is only incorrect
for the path R6=32. If the precision bit for register R8 isn't
backtracked through the u32 spill/fill, the R6=32 path is pruned at
instruction 9 and the program is incorrectly accepted. The second
program is a variation of the same with u32 spills and a u64 fill.
The additional instructions to introduce the first pruning point may be
a bit fragile as they depend on the heuristics for pruning points in the
verifier (currently at least 8 instructions and 2 jumps). If the
heuristics are changed, the pruning point may move (e.g., to the
subsequent jump) or disappear, which would cause the test to always pass.
Signed-off-by: Paul Chaignon <paul@isovalent.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add an x86 selftest to verify that KVM doesn't WARN or otherwise explode
if userspace modifies RCX during a userspace exit to handle string I/O.
This is a regression test for a user-triggerable WARN introduced by
commit 3b27de2718 ("KVM: x86: split the two parts of emulator_pio_in").
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211025201311.1881846-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
AMD proceessors define an address range that is reserved by HyperTransport
and causes a failure if used for guest physical addresses. Avoid
selftests failures by reserving those guest physical addresses; the
rules are:
- On parts with <40 bits, its fully hidden from software.
- Before Fam17h, it was always 12G just below 1T, even if there was more
RAM above this location. In this case we just not use any RAM above 1T.
- On Fam17h and later, it is variable based on SME, and is either just
below 2^48 (no encryption) or 2^43 (encryption).
Fixes: ef4c9f4f65 ("KVM: selftests: Fix 32-bit truncation of vm_get_max_gfn()")
Cc: stable@vger.kernel.org
Cc: David Matlack <dmatlack@google.com>
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210805105423.412878-1-pbonzini@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
./fcnal-test.sh -v -t ipv6_ping
TEST: ping out, VRF bind - ns-B IPv6 LLA [FAIL]
TEST: ping out, VRF bind - multicast IP [FAIL]
ping6 is failing as it should.
COMMAND: ip netns exec ns-A /bin/ping6 -c1 -w1 fe80::7c4c:bcff:fe66:a63a%red
strace of ping6 shows it is failing with '1',
so change the expected rc from 2 to 1.
Fixes: c0644e71df ("selftests: Add ipv6 ping tests to fcnal-test")
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jie2x Zhou <jie2x.zhou@intel.com>
Link: https://lore.kernel.org/r/20211209020230.37270-1-jie2x.zhou@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
INTERCEPT_x are bit positions, but the code was using the raw value of
INTERCEPT_VINTR (4) instead of BIT(INTERCEPT_VINTR).
This resulted in masking of bit 2 - that is, SMI instead of VINTR.
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <49b9571d25588870db5380b0be1a41df4bbaaf93.1638486479.git.maciej.szmigiero@oracle.com>
A mis-match between reported and actual mitigation is not restricted to the
Vulnerable case. The guest might also report the mitigation as "Software
count cache flush" and the host will still mitigate with branch cache
disabled.
So, instead of skipping depending on the detected mitigation, simply skip
whenever the detected miss_percent is the expected one for a fully
mitigated system, that is, above 95%.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211207130557.40566-1-cascardo@canonical.com
The following warning is triggered when I used clang compiler
to build the selftest.
/.../prog_tests/btf_dedup_split.c:368:6: warning: variable 'btf2' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
if (!ASSERT_OK(err, "btf_dedup"))
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
/.../prog_tests/btf_dedup_split.c:424:12: note: uninitialized use occurs here
btf__free(btf2);
^~~~
/.../prog_tests/btf_dedup_split.c:368:2: note: remove the 'if' if its condition is always false
if (!ASSERT_OK(err, "btf_dedup"))
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/.../prog_tests/btf_dedup_split.c:343:25: note: initialize the variable 'btf2' to silence this warning
struct btf *btf1, *btf2;
^
= NULL
Initialize local variable btf2 = NULL and the warning is gone.
Fixes: 9a49afe6f5 ("selftests/bpf: Add btf_dedup case with duplicated structs within CU")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211209050403.1770836-1-yhs@fb.com
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
1) Fix bogus compilter warning in nfnetlink_queue, from Florian Westphal.
2) Don't run conntrack on vrf with !dflt qdisc, from Nicolas Dichtel.
3) Fix nft_pipapo bucket load in AVX2 lookup routine for six 8-bit
groups, from Stefano Brivio.
4) Break rule evaluation on malformed TCP options.
5) Use socat instead of nc in selftests/netfilter/nft_zones_many.sh,
also from Florian
6) Fix KCSAN data-race in conntrack timeout updates, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
netfilter: conntrack: annotate data-races around ct->timeout
selftests: netfilter: switch zone stress to socat
netfilter: nft_exthdr: break evaluation if setting TCP option fails
selftests: netfilter: Add correctness test for mac,net set type
nft_set_pipapo: Fix bucket load in AVX2 lookup routine for six 8-bit groups
vrf: don't run conntrack on vrf with !dflt qdisc
netfilter: nfnetlink_queue: silence bogus compiler warning
====================
Link: https://lore.kernel.org/r/20211209000847.102598-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Daniel Borkmann says:
====================
bpf 2021-12-08
We've added 12 non-merge commits during the last 22 day(s) which contain
a total of 29 files changed, 659 insertions(+), 80 deletions(-).
The main changes are:
1) Fix an off-by-two error in packet range markings and also add a batch of
new tests for coverage of these corner cases, from Maxim Mikityanskiy.
2) Fix a compilation issue on MIPS JIT for R10000 CPUs, from Johan Almbladh.
3) Fix two functional regressions and a build warning related to BTF kfunc
for modules, from Kumar Kartikeya Dwivedi.
4) Fix outdated code and docs regarding BPF's migrate_disable() use on non-
PREEMPT_RT kernels, from Sebastian Andrzej Siewior.
5) Add missing includes in order to be able to detangle cgroup vs bpf header
dependencies, from Jakub Kicinski.
6) Fix regression in BPF sockmap tests caused by missing detachment of progs
from sockets when they are removed from the map, from John Fastabend.
7) Fix a missing "no previous prototype" warning in x86 JIT caused by BPF
dispatcher, from Björn Töpel.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Add selftests to cover packet access corner cases
bpf: Fix the off-by-two error in range markings
treewide: Add missing includes masked by cgroup -> bpf dependency
tools/resolve_btfids: Skip unresolved symbol warning for empty BTF sets
bpf: Fix bpf_check_mod_kfunc_call for built-in modules
bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
mips, bpf: Fix reference to non-existing Kconfig symbol
bpf: Make sure bpf_disable_instrumentation() is safe vs preemption.
Documentation/locking/locktypes: Update migrate_disable() bits.
bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap
bpf, sockmap: Attach map progs to psock early for feature probes
bpf, x86: Fix "no previous prototype" warning
====================
Link: https://lore.kernel.org/r/20211208155125.11826-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This commit adds BPF verifier selftests that cover all corner cases by
packet boundary checks. Specifically, 8-byte packet reads are tested at
the beginning of data and at the beginning of data_meta, using all kinds
of boundary checks (all comparison operators: <, >, <=, >=; both
permutations of operands: data + length compared to end, end compared to
data + length). For each case there are three tests:
1. Length is just enough for an 8-byte read. Length is either 7 or 8,
depending on the comparison.
2. Length is increased by 1 - should still pass the verifier. These
cases are useful, because they failed before commit 2fa7d94afc
("bpf: Fix the off-by-two error in range markings").
3. Length is decreased by 1 - should be rejected by the verifier.
Some existing tests are just renamed to avoid duplication.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211207081521.41923-1-maximmi@nvidia.com
Add tests for TLSv1.2 and TLSv1.3 with AES256-GCM cipher
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add tests for TLSv1.2 and TLSv1.3 with AES-CCM cipher.
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
With the addition of multiple callback-flood kthreads, the maximum number
of callbacks from any one of those kthreads is reported in the rcutorture
run summary. This commit changes this to report the sum of each kthread's
maximum number of callbacks in a given callback-flooding episode.
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit modifies the TASKS01 scenario to use four callback queues
and the TRACE01 scenario to use two queues, thus providing testing of
multiple queues by default.
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
centos9 has nmap-ncat which doesn't like the '-q' option, use socat.
While at it, mark test skipped if needed tools are missing.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The existing net,mac test didn't cover the issue recently reported
by Nikita Yushchenko, where MAC addresses wouldn't match if given
as first field of a concatenated set with AVX2 and 8-bit groups,
because there's a different code path covering the lookup of six
8-bit groups (MAC addresses) if that's the first field.
Add a similar mac,net test, with MAC address and IPv4 address
swapped in the set specification.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
After the below patch, the conntrack attached to skb is set to "notrack" in
the context of vrf device, for locally generated packets.
But this is true only when the default qdisc is set to the vrf device. When
changing the qdisc, notrack is not set anymore.
In fact, there is a shortcut in the vrf driver, when the default qdisc is
set, see commit dcdd43c41e ("net: vrf: performance improvements for
IPv4") for more details.
This patch ensures that the behavior is always the same, whatever the qdisc
is.
To demonstrate the difference, a new test is added in conntrack_vrf.sh.
Fixes: 8c9c296adf ("vrf: run conntrack only in context of lower/physdev for locally generated packets")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Check that getsockopt(IP_TOS) returns what setsockopt(IP_TOS) did set
right before.
Also check that socklen_t == 0 and -1 input values match those
of normal tcp sockets.
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
client & server use a unix socket connection to communicate
outside of the mptcp connection.
This allows the consumer to know in advance how many bytes have been
(or will be) sent by the peer.
This allows stricter checks on the bytecounts reported by TCP_INQ cmsg.
Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Do checks on the returned inq counter.
Fail on:
1. Huge value (> 1 kbyte, test case files are 1 kb)
2. last hint larger than returned bytes when read was short
3. erronenous indication of EOF.
3) happens when a hint of X bytes reads X-1 on next call
but next recvmsg returns more data (instead of EOF).
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
No driver is left using the external pgmap refcount, so remove the
code to support it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20211028151017.50234-1-hch@lst.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The test for bpf_iter_task_vma assumes that the output will be longer
than 1 kB, as the comment above the loop says. Due to this assumption,
the loop becomes infinite if the output turns to be shorter than 1 kB.
The return value of read_fd_into_buffer is 0 when the end of file was
reached, and len isn't being increased any more.
This commit adds a break on EOF to handle short output correctly. For
the reference, this is the contents that I get when running test_progs
under vmtest.sh, and it's shorter than 1 kB:
00400000-00401000 r--p 00000000 fe:00 25867 /root/bpf/test_progs
00401000-00674000 r-xp 00001000 fe:00 25867 /root/bpf/test_progs
00674000-0095f000 r--p 00274000 fe:00 25867 /root/bpf/test_progs
0095f000-00983000 r--p 0055e000 fe:00 25867 /root/bpf/test_progs
00983000-00a8a000 rw-p 00582000 fe:00 25867 /root/bpf/test_progs
00a8a000-0484e000 rw-p 00000000 00:00 0
7f6c64000000-7f6c64021000 rw-p 00000000 00:00 0
7f6c64021000-7f6c68000000 ---p 00000000 00:00 0
7f6c6ac8f000-7f6c6ac90000 r--s 00000000 00:0d 8032
anon_inode:bpf-map
7f6c6ac90000-7f6c6ac91000 ---p 00000000 00:00 0
7f6c6ac91000-7f6c6b491000 rw-p 00000000 00:00 0
7f6c6b491000-7f6c6b492000 r--s 00000000 00:0d 8032
anon_inode:bpf-map
7f6c6b492000-7f6c6b493000 rw-s 00000000 00:0d 8032
anon_inode:bpf-map
7ffc1e23d000-7ffc1e25e000 rw-p 00000000 00:00 0
7ffc1e3b8000-7ffc1e3bc000 r--p 00000000 00:00 0
7ffc1e3bc000-7ffc1e3bd000 r-xp 00000000 00:00 0
7fffffffe000-7ffffffff000 --xp 00000000 00:00 0
Fixes: e8168840e1 ("selftests/bpf: Add test for bpf_iter_task_vma")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211130181811.594220-1-maximmi@nvidia.com
The first commit cited below attempts to fix the off-by-one error that
appeared in some comparisons with an open range. Due to this error,
arithmetically equivalent pieces of code could get different verdicts
from the verifier, for example (pseudocode):
// 1. Passes the verifier:
if (data + 8 > data_end)
return early
read *(u64 *)data, i.e. [data; data+7]
// 2. Rejected by the verifier (should still pass):
if (data + 7 >= data_end)
return early
read *(u64 *)data, i.e. [data; data+7]
The attempted fix, however, shifts the range by one in a wrong
direction, so the bug not only remains, but also such piece of code
starts failing in the verifier:
// 3. Rejected by the verifier, but the check is stricter than in #1.
if (data + 8 >= data_end)
return early
read *(u64 *)data, i.e. [data; data+7]
The change performed by that fix converted an off-by-one bug into
off-by-two. The second commit cited below added the BPF selftests
written to ensure than code chunks like #3 are rejected, however,
they should be accepted.
This commit fixes the off-by-two error by adjusting new_range in the
right direction and fixes the tests by changing the range into the
one that should actually fail.
Fixes: fb2a311a31 ("bpf: fix off by one for range markings with L{T, E} patterns")
Fixes: b37242c773 ("bpf: add test cases to bpf selftests to cover all access tests")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211130181607.593149-1-maximmi@nvidia.com
Previously, the selftest framework always treats it as *ok* even though
some of them are failed actually. That's because the script always
returns 0.
It supports PASS/FAIL/SKIP exit code now.
CC: Philip Li <philip.li@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <zhijianx.li@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
qdiscs/fq_pie requires CONFIG_NET_SCH_FQ_PIE, otherwise tc will fail
to create a fq_pie qdisc.
It fixes following issue:
# not ok 57 83be - Create FQ-PIE with invalid number of flows
# Command exited with 2, expected 0
# Error: Specified qdisc not found.
Signed-off-by: Li Zhijian <zhijianx.li@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark the summary result as FAIL to prevent from confusing the selftest
framework if some of them are failed.
Previously, the selftest framework always treats it as *ok* even though
some of them are failed actually. That's because the script tdc.sh always
return 0.
# All test results:
#
# 1..97
# ok 1 83be - Create FQ-PIE with invalid number of flows
# ok 2 8b6e - Create RED with no flags
[...snip]
# ok 6 5f15 - Create RED with flags ECN, harddrop
# ok 7 53e8 - Create RED with flags ECN, nodrop
# ok 8 d091 - Fail to create RED with only nodrop flag
# ok 9 af8e - Create RED with flags ECN, nodrop, harddrop
# not ok 10 ce7d - Add mq Qdisc to multi-queue device (4 queues)
# Could not match regex pattern. Verify command output:
# qdisc mq 1: root
# qdisc fq_codel 0: parent 1:4 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
# qdisc fq_codel 0: parent 1:3 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
[...snip]
# ok 96 6979 - Change quantum of a strict ETS band
# ok 97 9a7d - Change ETS strict band without quantum
#
#
#
#
ok 1 selftests: tc-testing: tdc.sh <<< summary result
CC: Philip Li <philip.li@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <zhijianx.li@intel.com>
Acked-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently rp_filter tests in fib_tests.sh:fib_rp_filter_test() are
failing. ping sockets are bound to dummy1 using the "-I" option
(SO_BINDTODEVICE), but socket lookup is failing when receiving ping
replies, since the routing table thinks they belong to dummy0.
For example, suppose ping is using a SOCK_RAW socket for ICMP messages.
When receiving ping replies, in __raw_v4_lookup(), sk->sk_bound_dev_if
is 3 (dummy1), but dif (skb_rtable(skb)->rt_iif) says 2 (dummy0), so the
raw_sk_bound_dev_eq() check fails. Similar things happen in
ping_lookup() for SOCK_DGRAM sockets.
These tests used to pass due to a bug [1] in iputils, where "ping -I"
actually did not bind ICMP message sockets to device. The bug has been
fixed by iputils commit f455fee41c07 ("ping: also bind the ICMP socket
to the specific device") in 2016, which is why our rp_filter tests
started to fail. See [2] .
Fixing the tests while keeping everything in one netns turns out to be
nontrivial. Rework the tests and build the following topology:
┌─────────────────────────────┐ ┌─────────────────────────────┐
│ network namespace 1 (ns1) │ │ network namespace 2 (ns2) │
│ │ │ │
│ ┌────┐ ┌─────┐ │ │ ┌─────┐ ┌────┐ │
│ │ lo │<───>│veth1│<────────┼────┼─>│veth2│<──────────>│ lo │ │
│ └────┘ ├─────┴──────┐ │ │ ├─────┴──────┐ └────┘ │
│ │192.0.2.1/24│ │ │ │192.0.2.1/24│ │
│ └────────────┘ │ │ └────────────┘ │
└─────────────────────────────┘ └─────────────────────────────┘
Consider sending an ICMP_ECHO packet A in ns2. Both source and
destination IP addresses are 192.0.2.1, and we use strict mode rp_filter
in both ns1 and ns2:
1. A is routed to lo since its destination IP address is one of ns2's
local addresses (veth2);
2. A is redirected from lo's egress to veth2's egress using mirred;
3. A arrives at veth1's ingress in ns1;
4. A is redirected from veth1's ingress to lo's ingress, again, using
mirred;
5. In __fib_validate_source(), fib_info_nh_uses_dev() returns false,
since A was received on lo, but reverse path lookup says veth1;
6. However A is not dropped since we have relaxed this check for lo in
commit 66f8209547 ("fib: relax source validation check for loopback
packets");
Making sure A is not dropped here in this corner case is the whole point
of having this test.
7. As A reaches the ICMP layer, an ICMP_ECHOREPLY packet, B, is
generated;
8. Similarly, B is redirected from lo's egress to veth1's egress (in
ns1), then redirected once again from veth2's ingress to lo's
ingress (in ns2), using mirred.
Also test "ping 127.0.0.1" from ns2. It does not trigger the relaxed
check in __fib_validate_source(), but just to make sure the topology
works with loopback addresses.
Tested with ping from iputils 20210722-41-gf9fb573:
$ ./fib_tests.sh -t rp_filter
IPv4 rp_filter tests
TEST: rp_filter passes local packets [ OK ]
TEST: rp_filter passes loopback packets [ OK ]
[1] https://github.com/iputils/iputils/issues/55
[2] f455fee41c
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Fixes: adb701d6cf ("selftests: add a test case for rp_filter")
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Acked-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20211201004720.6357-1-yepeilin.cs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Migrate all the selftests that were still using bpf_prog_load_xattr().
Few are converted to skeleton, others will use bpf_object__open_file()
API.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211201232824.3166325-7-andrii@kernel.org
xdpxceiver.c is using AF_XDP APIs that are deprecated starting from
libbpf 0.7. Until we migrate the test to libxdp or solve this issue in
some other way, mute deprecation warnings within xdpxceiver.c.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211201232824.3166325-6-andrii@kernel.org
The test_cmpxchg() and test_xchg() functions say "test_run add".
Therefore, make them say "test_run cmpxchg" and "test_run xchg",
respectively.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201005030.GA3071525@paulmck-ThinkPad-P17-Gen-1
Add $(OUTPUT) prefix to testing_helpers.o, so it can be built out of
tree when necessary. At the moment, in addition to being built in-tree
even when out-of-tree is required, testing_helpers.o is not built with
the right recipe when cross-building.
For consistency the other helpers, cgroup_helpers and trace_helpers, can
also be passed as objects instead of source. Use *_HELPERS variable to
keep the Makefile readable.
Fixes: f87c1930ac ("selftests/bpf: Merge test_stub.c into testing_helpers.c")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201145101.823159-1-jean-philippe@linaro.org
and wireguard.
Current release - regressions:
- smc: keep smc_close_final()'s error code during active close
Current release - new code bugs:
- iwlwifi: various static checker fixes (int overflow, leaks, missing
error codes)
- rtw89: fix size of firmware header before transfer, avoid crash
- mt76: fix timestamp check in tx_status; fix pktid leak;
- mscc: ocelot: fix missing unlock on error in ocelot_hwstamp_set()
Previous releases - regressions:
- smc: fix list corruption in smc_lgr_cleanup_early
- ipv4: convert fib_num_tclassid_users to atomic_t
Previous releases - always broken:
- tls: fix authentication failure in CCM mode
- vrf: reset IPCB/IP6CB when processing outbound pkts, prevent
incorrect processing
- dsa: mv88e6xxx: fixes for various device errata
- rds: correct socket tunable error in rds_tcp_tune()
- ipv6: fix memory leak in fib6_rule_suppress
- wireguard: reset peer src endpoint when netns exits
- wireguard: improve resilience to DoS around incoming handshakes
- tcp: fix page frag corruption on page fault which involves TCP
- mpls: fix missing attributes in delete notifications
- mt7915: fix NULL pointer dereference with ad-hoc mode
Misc:
- rt2x00: be more lenient about EPROTO errors during start
- mlx4_en: update reported link modes for 1/10G
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGo6KMACgkQMUZtbf5S
Iruq+BAAhRMTcL+X4eRIL9lIEvWEKHMKLCA/pUaQWNlSxsbEeydJWRNSc37Cs3pv
z0rYIEhfieOz8+QXS1Kq+yZwJVXjA8Jvgld2qw9V9Y5w+N15Mj8RUtG8NaUw+o4E
U8PCAbaamnbzyPdlCYcVHschd8MD0BCXm5+jAGeIyCP+KQCnhEpFZv+bvHaWzQR8
FZLYrhXTR9W0DFsrKG9+haqFwFBR3+VDqTGILhaHPE+r2o6wKQQ5yJMhd8fq0SaC
nne8zDkGuFEeW3cxj0VbhdRMyrV97eMK+P4dZ2P0Z7xcrsed9/2XJkNQNJGtuRnj
GGJV6utupJRAY+lnJNUkifqS4Wt7KirfZsSsyaKKa4plyoVgtGhiqEYFTQVLagC0
CF4Qe+3qks6rESbRu6PEFN4oWSkMEhRzdcDpg7vBDURUKcrRs9fgtNUJUCi8nKFA
A/F/K+7IHBoBZyQYZbYmnGdNsNauKbF3rUY3hwMGBfQZIr/wsql9+jhtLsmZX77m
V/L7KzT2jhhNc5gDzuLps25K3P7snKuV19qQSsY2LeuGj1x3gmWZ+ibN6ynhB+Gt
KBnfHDMTI/4aciZBIbwJmwfeRhCF8tOfw0WZdUP7FRIXukbfVuDBoznWLz4BKKgf
GSYSTNDs/PHZQo5vCQ/onvTwUK5aN6zoPNy5ih7lp9YZBYtN2TI=
=r0Jh
-----END PGP SIGNATURE-----
Merge tag 'net-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from wireless, and wireguard.
Mostly scattered driver changes this week, with one big clump in
mv88e6xxx. Nothing of note, really.
Current release - regressions:
- smc: keep smc_close_final()'s error code during active close
Current release - new code bugs:
- iwlwifi: various static checker fixes (int overflow, leaks, missing
error codes)
- rtw89: fix size of firmware header before transfer, avoid crash
- mt76: fix timestamp check in tx_status; fix pktid leak;
- mscc: ocelot: fix missing unlock on error in ocelot_hwstamp_set()
Previous releases - regressions:
- smc: fix list corruption in smc_lgr_cleanup_early
- ipv4: convert fib_num_tclassid_users to atomic_t
Previous releases - always broken:
- tls: fix authentication failure in CCM mode
- vrf: reset IPCB/IP6CB when processing outbound pkts, prevent
incorrect processing
- dsa: mv88e6xxx: fixes for various device errata
- rds: correct socket tunable error in rds_tcp_tune()
- ipv6: fix memory leak in fib6_rule_suppress
- wireguard: reset peer src endpoint when netns exits
- wireguard: improve resilience to DoS around incoming handshakes
- tcp: fix page frag corruption on page fault which involves TCP
- mpls: fix missing attributes in delete notifications
- mt7915: fix NULL pointer dereference with ad-hoc mode
Misc:
- rt2x00: be more lenient about EPROTO errors during start
- mlx4_en: update reported link modes for 1/10G"
* tag 'net-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits)
net: dsa: b53: Add SPI ID table
gro: Fix inconsistent indenting
selftests: net: Correct case name
net/rds: correct socket tunable error in rds_tcp_tune()
mctp: Don't let RTM_DELROUTE delete local routes
net/smc: Keep smc_close_final rc during active close
ibmvnic: drop bad optimization in reuse_tx_pools()
ibmvnic: drop bad optimization in reuse_rx_pools()
net/smc: fix wrong list_del in smc_lgr_cleanup_early
Fix Comment of ETH_P_802_3_MIN
ethernet: aquantia: Try MAC address from device tree
ipv4: convert fib_num_tclassid_users to atomic_t
net: avoid uninit-value from tcp_conn_request
net: annotate data-races on txq->xmit_lock_owner
octeontx2-af: Fix a memleak bug in rvu_mbox_init()
net/mlx4_en: Fix an use-after-free bug in mlx4_en_try_alloc_resources()
vrf: Reset IPCB/IP6CB when processing outbound pkts in vrf dev xmit
net: qlogic: qlcnic: Fix a NULL pointer dereference in qlcnic_83xx_add_rings()
net: dsa: mv88e6xxx: Link in pcs_get_state() if AN is bypassed
net: dsa: mv88e6xxx: Fix inband AN for 2500base-x on 88E6393X family
...
The commit 087cba799c ("selftests/bpf: Add weak/typeless ksym test for light skeleton")
added test_ksyms_weak to light skeleton testing, but remove CO-RE access.
Revert that part of commit, since light skeleton can use CO-RE in the kernel.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-17-alexei.starovoitov@gmail.com
Add a test where randmap() function is appended to three different bpf
programs. That action checks struct bpf_core_relo replication logic
and offset adjustment in gen loader part of libbpf.
Fourth bpf program has 360 CO-RE relocations from vmlinux, bpf_testmod,
and non-existing type. It tests candidate cache logic.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-16-alexei.starovoitov@gmail.com
$ ./fcnal-test.sh -t help
Test names: help
Looks it intent to list the available tests but it didn't do the right
thing. I will add another option the do that in the later patch.
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ipv6_addr_bind/ipv4_addr_bind are function names. Previously, bind test
would not be run by default due to the wrong case names
Fixes: 34d0302ab8 ("selftests: Add ipv6 address bind tests to fcnal-test")
Fixes: 75b2b2b3db ("selftests: Add ipv4 address bind tests to fcnal-test")
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The echo commands following initialization of the "oldrun" variable need
to be "tee"d to $oldrun/remote-log. This commit fixes several stragglers.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The "exit 4" in kvm-remote.sh is pointlessly redirected, so this commit
removes the redirection.
Fixes: 0092eae4cb ("torture: Add kvm-remote.sh script for distributed rcutorture test runs")
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit adjusts the TRACE02 scenario to use a pair of callback-flood
kthreads. This in turn forces lock contention on the single RCU Tasks
Trace callback queue, which forces use of all CPUs' queues, thus testing
this transition. (No, there is not yet any way to transition back.
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit enables two callback-flood kthreads for the TREE02 scenario
and 28 for the TREE10 scenario.
Cc: Neeraj Upadhyay <neeraj.iitr10@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, a transient network error can kill a run if it happens while
downloading the tarball to one of the target systems. This commit
therefore does a 60-second wait and then a retry. If further experience
indicates, a more elaborate mechanism might be used later.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit makes kvm-find-errors.sh check for and report undefined
symbols that are detected at link time.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit brings the kvm.sh script's help text up to date with recently
(and some not-so-recently) added parameters.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
All of the uses of CONFIG_RCU_FAST_NO_HZ=y that I have seen involve
systems with RCU callbacks offloaded. In this situation, all that this
Kconfig option does is slow down idle entry/exit with an additional
allways-taken early exit. If this is the only use case, then this
Kconfig option nothing but an attractive nuisance that needs to go away.
This commit therefore removes the RCU_FAST_NO_HZ Kconfig option.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
All of the rcu scenarios that mentioning CONFIG_RCU_FAST_NO_HZ disable it.
But this Kconfig option is disabled by default, so this commit removes
the pointless "CONFIG_RCU_FAST_NO_HZ=n" lines from these scenarios.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
All of the rcuscale and refscale scenarios that mention the Kconfig option
CONFIG_RCU_FAST_NO_HZ disable it. But this Kconfig option is disabled by
default, so this commit removes the pointless "CONFIG_RCU_FAST_NO_HZ=n"
lines from these scenarios.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
With CONFIG_PREEMPT_DYNAMIC=y, the kernel builds with CONFIG_PREEMPTION=y
because preemption can be enabled at runtime. This prevents any tests
of Tiny RCU or Tiny SRCU from running correctly. This commit therefore
explicitly sets CONFIG_PREEMPT_DYNAMIC=n for those scenarios.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Add benchmark to measure the throughput and latency of the bpf_loop
call.
Testing this on my dev machine on 1 thread, the data is as follows:
nr_loops: 10
bpf_loop - throughput: 198.519 ± 0.155 M ops/s, latency: 5.037 ns/op
nr_loops: 100
bpf_loop - throughput: 247.448 ± 0.305 M ops/s, latency: 4.041 ns/op
nr_loops: 500
bpf_loop - throughput: 260.839 ± 0.380 M ops/s, latency: 3.834 ns/op
nr_loops: 1000
bpf_loop - throughput: 262.806 ± 0.629 M ops/s, latency: 3.805 ns/op
nr_loops: 5000
bpf_loop - throughput: 264.211 ± 1.508 M ops/s, latency: 3.785 ns/op
nr_loops: 10000
bpf_loop - throughput: 265.366 ± 3.054 M ops/s, latency: 3.768 ns/op
nr_loops: 50000
bpf_loop - throughput: 235.986 ± 20.205 M ops/s, latency: 4.238 ns/op
nr_loops: 100000
bpf_loop - throughput: 264.482 ± 0.279 M ops/s, latency: 3.781 ns/op
nr_loops: 500000
bpf_loop - throughput: 309.773 ± 87.713 M ops/s, latency: 3.228 ns/op
nr_loops: 1000000
bpf_loop - throughput: 262.818 ± 4.143 M ops/s, latency: 3.805 ns/op
>From this data, we can see that the latency per loop decreases as the
number of loops increases. On this particular machine, each loop had an
overhead of about ~4 ns, and we were able to run ~250 million loops
per second.
Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211130030622.4131246-5-joannekoong@fb.com
This patch tests bpf_loop in pyperf and strobemeta, and measures the
verifier performance of replacing the traditional for loop
with bpf_loop.
The results are as follows:
~strobemeta~
Baseline
verification time 6808200 usec
stack depth 496
processed 554252 insns (limit 1000000) max_states_per_insn 16
total_states 15878 peak_states 13489 mark_read 3110
#192 verif_scale_strobemeta:OK (unrolled loop)
Using bpf_loop
verification time 31589 usec
stack depth 96+400
processed 1513 insns (limit 1000000) max_states_per_insn 2
total_states 106 peak_states 106 mark_read 60
#193 verif_scale_strobemeta_bpf_loop:OK
~pyperf600~
Baseline
verification time 29702486 usec
stack depth 368
processed 626838 insns (limit 1000000) max_states_per_insn 7
total_states 30368 peak_states 30279 mark_read 748
#182 verif_scale_pyperf600:OK (unrolled loop)
Using bpf_loop
verification time 148488 usec
stack depth 320+40
processed 10518 insns (limit 1000000) max_states_per_insn 10
total_states 705 peak_states 517 mark_read 38
#183 verif_scale_pyperf600_bpf_loop:OK
Using the bpf_loop helper led to approximately a 99% decrease
in the verification time and in the number of instructions.
Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211130030622.4131246-4-joannekoong@fb.com
* Fix constant sign extension affecting TCR_EL2 and preventing
running on ARMv8.7 models due to spurious bits being set
* Fix use of helpers using PSTATE early on exit by always sampling
it as soon as the exit takes place
* Move pkvm's 32bit handling into a common helper
RISC-V:
* Fix incorrect KVM_MAX_VCPUS value
* Unmap stage2 mapping when deleting/moving a memslot
x86:
* Fix and downgrade BUG_ON due to uninitialized cache
* Many APICv and MOVE_ENC_CONTEXT_FROM fixes
* Correctly emulate TLB flushes around nested vmentry/vmexit
and when the nested hypervisor uses VPID
* Prevent modifications to CPUID after the VM has run
* Other smaller bugfixes
Generic:
* Memslot handling bugfixes
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGmHBEUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOkGgf/RBjt1d7H6Um7tD7oA5QiIHmNY4ko
K/90OAa8h62rilxpqxkRgLNmphBc5AzcbufVXN4J1hVhw2M+u1ouDxKeHS1GEZTA
/XdNb0dwK99TpOJkIcuV/NQVIZUxkM00VbIiCoLkX06VuIc1Gie1G4bqzLhWCP8Y
ts9l/pkfafvfEmjmcjVd7gkDOnEPbT+JPDJcuo/RA7C7Z2L4+8DsFeyfWGqBP647
J6omUUxD82QRm28OVOK4V7aNALWsAdlaqHrVFAPZywQl7QTWMO0UTcKTdCCB2B4Q
QnHejFV6pFh55q3/fhe7epy9e2Sw+NOsmWKTEGPbU5nn94R8lyW1GV4ZUQ==
=Nduu
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM64:
- Fix constant sign extension affecting TCR_EL2 and preventing
running on ARMv8.7 models due to spurious bits being set
- Fix use of helpers using PSTATE early on exit by always sampling it
as soon as the exit takes place
- Move pkvm's 32bit handling into a common helper
RISC-V:
- Fix incorrect KVM_MAX_VCPUS value
- Unmap stage2 mapping when deleting/moving a memslot
x86:
- Fix and downgrade BUG_ON due to uninitialized cache
- Many APICv and MOVE_ENC_CONTEXT_FROM fixes
- Correctly emulate TLB flushes around nested vmentry/vmexit and when
the nested hypervisor uses VPID
- Prevent modifications to CPUID after the VM has run
- Other smaller bugfixes
Generic:
- Memslot handling bugfixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (44 commits)
KVM: fix avic_set_running for preemptable kernels
KVM: VMX: clear vmx_x86_ops.sync_pir_to_irr if APICv is disabled
KVM: SEV: accept signals in sev_lock_two_vms
KVM: SEV: do not take kvm->lock when destroying
KVM: SEV: Prohibit migration of a VM that has mirrors
KVM: SEV: Do COPY_ENC_CONTEXT_FROM with both VMs locked
selftests: sev_migrate_tests: add tests for KVM_CAP_VM_COPY_ENC_CONTEXT_FROM
KVM: SEV: move mirror status to destination of KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM
KVM: SEV: initialize regions_list of a mirror VM
KVM: SEV: cleanup locking for KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM
KVM: SEV: do not use list_replace_init on an empty list
KVM: x86: Use a stable condition around all VT-d PI paths
KVM: x86: check PIR even for vCPUs with disabled APICv
KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled
KVM: selftests: page_table_test: fix calculation of guest_test_phys_mem
KVM: x86/mmu: Handle "default" period when selectively waking kthread
KVM: MMU: shadow nested paging does not have PKU
KVM: x86/mmu: Remove spurious TLB flushes in TDP MMU zap collapsible path
KVM: x86/mmu: Use yield-safe TDP MMU root iter in MMU notifier unmapping
KVM: X86: Use vcpu->arch.walk_mmu for kvm_mmu_invlpg()
...
Commit 98e1385ef2 ("include/linux/radix-tree.h: replace kernel.h with
the necessary inclusions") broke the radix tree test suite in two
different ways; first by including math.h which didn't exist in the
tools directory, and second by removing an implicit include of
spinlock.h before lockdep.h. Fix both issues.
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
VMs that mirror an encryption context rely on the owner to keep the
ASID allocated. Performing a KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM
would cause a dangling ASID:
1. copy context from A to B (gets ref to A)
2. move context from A to L (moves ASID from A to L)
3. close L (releases ASID from L, B still references it)
The right way to do the handoff instead is to create a fresh mirror VM
on the destination first:
1. copy context from A to B (gets ref to A)
[later] 2. close B (releases ref to A)
3. move context from A to L (moves ASID from A to L)
4. copy context from L to M
So, catch the situation by adding a count of how many VMs are
mirroring this one's encryption context.
Fixes: 0b020f5af0 ("KVM: SEV: Add support for SEV-ES intra host migration")
Message-Id: <20211123005036.2954379-11-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
I am putting the tests in sev_migrate_tests because the failure conditions are
very similar and some of the setup code can be reused, too.
The tests cover both successful creation of a mirror VM, and error
conditions.
Cc: Peter Gonda <pgonda@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Message-Id: <20211123005036.2954379-9-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A kvm_page_table_test run with its default settings fails on VMX due to
memory region add failure:
> ==== Test Assertion Failure ====
> lib/kvm_util.c:952: ret == 0
> pid=10538 tid=10538 errno=17 - File exists
> 1 0x00000000004057d1: vm_userspace_mem_region_add at kvm_util.c:947
> 2 0x0000000000401ee9: pre_init_before_test at kvm_page_table_test.c:302
> 3 (inlined by) run_test at kvm_page_table_test.c:374
> 4 0x0000000000409754: for_each_guest_mode at guest_modes.c:53
> 5 0x0000000000401860: main at kvm_page_table_test.c:500
> 6 0x00007f82ae2d8554: ?? ??:0
> 7 0x0000000000401894: _start at ??:?
> KVM_SET_USER_MEMORY_REGION IOCTL failed,
> rc: -1 errno: 17
> slot: 1 flags: 0x0
> guest_phys_addr: 0xc0000000 size: 0x40000000
This is because the memory range that this test is trying to add
(0x0c0000000 - 0x100000000) conflicts with LAPIC mapping at 0x0fee00000.
Looking at the code it seems that guest_test_*phys*_mem variable gets
mistakenly overwritten with guest_test_*virt*_mem while trying to adjust
the former for alignment.
With the correct variable adjusted this test runs successfully.
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <52e487458c3172923549bbcf9dfccfbe6faea60b.1637940473.git.maciej.szmigiero@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Each peer's endpoint contains a dst_cache entry that takes a reference
to another netdev. When the containing namespace exits, we take down the
socket and prevent future sockets from being created (by setting
creating_net to NULL), which removes that potential reference on the
netns. However, it doesn't release references to the netns that a netdev
cached in dst_cache might be taking, so the netns still might fail to
exit. Since the socket is gimped anyway, we can simply clear all the
dst_caches (by way of clearing the endpoint src), which will release all
references.
However, the current dst_cache_reset function only releases those
references lazily. But it turns out that all of our usages of
wg_socket_clear_peer_endpoint_src are called from contexts that are not
exactly high-speed or bottle-necked. For example, when there's
connection difficulty, or when userspace is reconfiguring the interface.
And in particular for this patch, when the netns is exiting. So for
those cases, it makes more sense to call dst_release immediately. For
that, we add a small helper function to dst_cache.
This patch also adds a test to netns.sh from Hangbin Liu to ensure this
doesn't regress.
Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Reported-by: Xiumei Mu <xmu@redhat.com>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Fixes: 900575aa33 ("wireguard: device: avoid circular netns references")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
DEBUG_PI_LIST was renamed to DEBUG_PLIST since 8e18faeac3 ("lib/plist:
rename DEBUG_PI_LIST to DEBUG_PLIST").
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Fixes: 8e18faeac3 ("lib/plist: rename DEBUG_PI_LIST to DEBUG_PLIST")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We previously removed the restriction on looping to self, and then added
a test to make sure the kernel didn't blow up during a routing loop. The
kernel didn't blow up, thankfully, but on certain architectures where
skb fragmentation is easier, such as ppc64, the skbs weren't actually
being discarded after a few rounds through. But the test wasn't catching
this. So actually test explicitly for massive increases in tx to see if
we have a routing loop. Note that the actual loop problem will need to
be addressed in a different commit.
Fixes: b673e24aad ("wireguard: socket: remove errant restriction on looping to self")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The selftests currently parse the kernel log at the end to track
potential memory leaks. With these tests now reading off the end of the
buffer, due to recent optimizations, some creation messages were lost,
making the tests think that there was a free without an alloc. Fix this
by increasing the kernel log size.
Fixes: 24b70eeeb4 ("wireguard: use synchronize_net rather than synchronize_rcu")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Prior patch:
]# TESTS=vlmc_filtering_test ./bridge_vlan_mcast.sh
TEST: Vlan multicast snooping enable [ OK ]
Device "bridge" does not exist.
TEST: Disable multicast vlan snooping when vlan filtering is disabled [FAIL]
Vlan filtering is disabled but multicast vlan snooping is still enabled
After patch:
# TESTS=vlmc_filtering_test ./bridge_vlan_mcast.sh
TEST: Vlan multicast snooping enable [ OK ]
TEST: Disable multicast vlan snooping when vlan filtering is disabled [ OK ]
Fixes: f5a9dd58f4 ("selftests: net: bridge: add test for vlan_filtering dependency")
Cc: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In BSD and abstract address cases, we store sockets in the hash table with
keys between 0 and UNIX_HASH_SIZE - 1. However, the hash saved in a socket
varies depending on its address type; sockets with BSD addresses always
have UNIX_HASH_SIZE in their unix_sk(sk)->addr->hash.
This is just for the UNIX_ABSTRACT() macro used to check the address type.
The difference of the saved hashes comes from the first byte of the address
in the first place. So, we can test it directly.
Then we can keep a real hash in each socket and replace unix_table_lock
with per-hash locks in the later patch.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a test for dependency of mcast_vlan_snooping on vlan_filtering. If
vlan_filtering gets disabled, then mcast_vlan_snooping must be
automatically disabled as well.
TEST: Disable multicast vlan snooping when vlan filtering is disabled [ OK ]
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add tests for the new per-port/vlan mcast_router option, verify that
unknown multicast packets are flooded only to router ports.
TEST: Port vlan 10 option mcast_router default value [ OK ]
TEST: Port vlan 10 mcast_router option changed to 2 [ OK ]
TEST: Flood unknown vlan multicast packets to router port only [ OK ]
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add tests which change the new per-vlan mcast_query_interval and verify
the new value is in effect, also add a test to change
mcast_query_response_interval's value.
TEST: Vlan mcast_query_interval global option default value [ OK ]
TEST: Vlan 10 mcast_query_interval option changed to 200 [ OK ]
TEST: Vlan mcast_query_response_interval global option default value [ OK ]
TEST: Vlan 10 mcast_query_response_interval option changed to 200 [ OK ]
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add tests which change the new per-vlan mcast_querier_interval and
verify that it is taken into account when an outside querier is present.
TEST: Vlan mcast_querier_interval global option default value [ OK ]
TEST: Vlan 10 mcast_querier_interval option changed to 100 [ OK ]
TEST: Vlan 10 mcast_querier_interval expire after outside query [ OK ]
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a test which changes the new per-vlan mcast_membership_interval and
verifies that a newly learned mdb entry would expire in that interval.
TEST: Vlan mcast_membership_interval global option default value [ OK ]
TEST: Vlan 10 mcast_membership_interval option changed to 200 [ OK ]
TEST: Vlan 10 mcast_membership_interval mdb entry expire [ OK ]
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add tests which change the new per-vlan startup query count/interval
options and verify the proper number of queries are sent in the expected
interval.
TEST: Vlan mcast_startup_query_interval global option default value [ OK ]
TEST: Vlan mcast_startup_query_count global option default value [ OK ]
TEST: Vlan 10 mcast_startup_query_interval option changed to 100 [ OK ]
TEST: Vlan 10 mcast_startup_query_count option changed to 3 [ OK ]
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add tests which verify the default values of mcast_last_member_count
mcast_last_member_count and also try to change them.
TEST: Vlan mcast_last_member_count global option default value [ OK ]
TEST: Vlan mcast_last_member_interval global option default value [ OK ]
TEST: Vlan 10 mcast_last_member_count option changed to 3 [ OK ]
TEST: Vlan 10 mcast_last_member_interval option changed to 200 [ OK ]
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add tests which change the new per-vlan IGMP/MLD versions and verify
that proper tagged general query packets are sent.
TEST: Vlan mcast_igmp_version global option default value [ OK ]
TEST: Vlan mcast_mld_version global option default value [ OK ]
TEST: Vlan 10 mcast_igmp_version option changed to 3 [ OK ]
TEST: Vlan 10 tagged IGMPv3 general query sent [ OK ]
TEST: Vlan 10 mcast_mld_version option changed to 2 [ OK ]
TEST: Vlan 10 tagged MLDv2 general query sent [ OK ]
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a test to try the new global vlan mcast_querier control and also
verify that tagged general query packets are properly generated when
querier is enabled for a single vlan.
TEST: Vlan mcast_querier global option default value [ OK ]
TEST: Vlan 10 multicast querier enable [ OK ]
TEST: Vlan 10 tagged IGMPv2 general query sent [ OK ]
TEST: Vlan 10 tagged MLD general query sent [ OK ]
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add the first test for bridge per-vlan multicast snooping which checks
if control of the global and per-vlan options work as expected, joins
and leaves are tested at each option value.
TEST: Vlan multicast snooping enable [ OK ]
TEST: Vlan global options existence [ OK ]
TEST: Vlan mcast_snooping global option default value [ OK ]
TEST: Vlan 10 multicast snooping control [ OK ]
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Current release - regressions:
- r8169: fix incorrect mac address assignment
- vlan: fix underflow for the real_dev refcnt when vlan creation fails
- smc: avoid warning of possible recursive locking
Current release - new code bugs:
- vsock/virtio: suppress used length validation
- neigh: fix crash in v6 module initialization error path
Previous releases - regressions:
- af_unix: fix change in behavior in read after shutdown
- igb: fix netpoll exit with traffic, avoid warning
- tls: fix splice_read() when starting mid-record
- lan743x: fix deadlock in lan743x_phy_link_status_change()
- marvell: prestera: fix bridge port operation
Previous releases - always broken:
- tcp_cubic: fix spurious Hystart ACK train detections for
not-cwnd-limited flows
- nexthop: fix refcount issues when replacing IPv6 groups
- nexthop: fix null pointer dereference when IPv6 is not enabled
- phylink: force link down and retrigger resolve on interface change
- mptcp: fix delack timer length calculation and incorrect early
clearing
- ieee802154: handle iftypes as u32, prevent shift-out-of-bounds
- nfc: virtual_ncidev: change default device permissions
- netfilter: ctnetlink: fix error codes and flags used for kernel side
filtering of dumps
- netfilter: flowtable: fix IPv6 tunnel addr match
- ncsi: align payload to 32-bit to fix dropped packets
- iavf: fix deadlock and loss of config during VF interface reset
- ice: avoid bpf_prog refcount underflow
- ocelot: fix broken PTP over IP and PTP API violations
Misc:
- marvell: mvpp2: increase MTU limit when XDP enabled
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGhSCwACgkQMUZtbf5S
IrvAdw//aR54mgc9rc0mkvS5sbDeKzDscmTzav5ANGjl+2ooKTOe8Qd07s59z6TJ
H9IJlTu0Uc9Psbb2RvRo1T1HDohSpWy7SEN/Qlo6N+z1WzDHWbuXyC/KTQDM+8I1
coMYBBTwBGkblBosuoMUi60GWLbBslLv9gR7HUZj7gbtxMfk36BrX5UYz1ONy+tx
HiVshtOmzOgumBi+/j0tkI4lpI/ajf9eYaG6Vvd0A6F3idcbhWKNKfLPgw9qQF36
sQrbz1SYwL5Ucgk47EG+Lpk7oSzbkdNoO6Ro9ncsebB8OMoLUhddclmG/fbgPG0o
SWJ4kK3kmaRSTvSi6q4e5BM89oIhtFWhGRB6vURokrAQU1Ds+sq5F+8IwCaMqEYb
GNyEZ8cdJhLc50RU+/Im3lN6IrRHvQiirE1BN+ZuCMjeSTrsqX18ZYMh1pSJhxkZ
wRC03sSd2ZcaooFrSNJ5Scr3ndacrWNtVr78IQYCNrTjqJn1QUK7ZegTjP04FUfD
JLB7+en8Hd6EKosJLKyoAPRwoFPZN6mDAPC6RfF45B3OoZAHbvXmJrOT6PatcqHe
i0YwDkAJKPRijfcepN1IQYlY2Za5HwNWzCV6v0bf4tUCluDsSkczTKS02dZ1hegR
oYW1Ra1BIyYK4cbG4H0lD7iBQGLGgwt38U1NlFpawbJa/fECUSs=
=LtZ7
-----END PGP SIGNATURE-----
Merge tag 'net-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes, including fixes from netfilter.
Current release - regressions:
- r8169: fix incorrect mac address assignment
- vlan: fix underflow for the real_dev refcnt when vlan creation
fails
- smc: avoid warning of possible recursive locking
Current release - new code bugs:
- vsock/virtio: suppress used length validation
- neigh: fix crash in v6 module initialization error path
Previous releases - regressions:
- af_unix: fix change in behavior in read after shutdown
- igb: fix netpoll exit with traffic, avoid warning
- tls: fix splice_read() when starting mid-record
- lan743x: fix deadlock in lan743x_phy_link_status_change()
- marvell: prestera: fix bridge port operation
Previous releases - always broken:
- tcp_cubic: fix spurious Hystart ACK train detections for
not-cwnd-limited flows
- nexthop: fix refcount issues when replacing IPv6 groups
- nexthop: fix null pointer dereference when IPv6 is not enabled
- phylink: force link down and retrigger resolve on interface change
- mptcp: fix delack timer length calculation and incorrect early
clearing
- ieee802154: handle iftypes as u32, prevent shift-out-of-bounds
- nfc: virtual_ncidev: change default device permissions
- netfilter: ctnetlink: fix error codes and flags used for kernel
side filtering of dumps
- netfilter: flowtable: fix IPv6 tunnel addr match
- ncsi: align payload to 32-bit to fix dropped packets
- iavf: fix deadlock and loss of config during VF interface reset
- ice: avoid bpf_prog refcount underflow
- ocelot: fix broken PTP over IP and PTP API violations
Misc:
- marvell: mvpp2: increase MTU limit when XDP enabled"
* tag 'net-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (94 commits)
net: dsa: microchip: implement multi-bridge support
net: mscc: ocelot: correctly report the timestamping RX filters in ethtool
net: mscc: ocelot: set up traps for PTP packets
net: ptp: add a definition for the UDP port for IEEE 1588 general messages
net: mscc: ocelot: create a function that replaces an existing VCAP filter
net: mscc: ocelot: don't downgrade timestamping RX filters in SIOCSHWTSTAMP
net: hns3: fix incorrect components info of ethtool --reset command
net: hns3: fix one incorrect value of page pool info when queried by debugfs
net: hns3: add check NULL address for page pool
net: hns3: fix VF RSS failed problem after PF enable multi-TCs
net: qed: fix the array may be out of bound
net/smc: Don't call clcsock shutdown twice when smc shutdown
net: vlan: fix underflow for the real_dev refcnt
ptp: fix filter names in the documentation
ethtool: ioctl: fix potential NULL deref in ethtool_set_coalesce()
nfc: virtual_ncidev: change default device permissions
net/sched: sch_ets: don't peek at classes beyond 'nbands'
net: stmmac: Disable Tx queues when reconfiguring the interface
selftests: tls: test for correct proto_ops
tls: fix replacing proto_ops
...
With the elevated 'KVM_CAP_MAX_VCPUS' value kvm_create_max_vcpus test
may hit RLIMIT_NOFILE limits:
# ./kvm_create_max_vcpus
KVM_CAP_MAX_VCPU_ID: 4096
KVM_CAP_MAX_VCPUS: 1024
Testing creating 1024 vCPUs, with IDs 0...1023.
/dev/kvm not available (errno: 24), skipping test
Adjust RLIMIT_NOFILE limits to make sure KVM_CAP_MAX_VCPUS fds can be
opened. Note, raising hard limit ('rlim_max') requires CAP_SYS_RESOURCE
capability which is generally not needed to run kvm selftests (but without
raising the limit the test is doomed to fail anyway).
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211123135953.667434-1-vkuznets@redhat.com>
[Skip the test if the hard limit can be raised. - Paolo]
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
hyperv_features's sole purpose is to test access to various Hyper-V MSRs
and hypercalls with different CPUID data. As KVM_SET_CPUID2 after KVM_RUN
is deprecated and soon-to-be forbidden, avoid it by re-creating test VM
for each sub-test.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211122175818.608220-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ensure that the ASID are freed promptly, which becomes more important
when more tests are added to this file.
Cc: Peter Gonda <pgonda@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM leaves the source VM in a dead state,
so migrating back to the original source VM fails the ioctl. Adjust
the test.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Previous patch fixes overriding callbacks incorrectly. Triggering
the crash in sendpage_locked would be more spectacular but it's
hard to get to, so take the easier path of proving this is broken
and call getname. We're currently getting IPv4 socket info on an
IPv6 socket.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Similar to previous patch, just copy over necessary struct into local
stack variable before checking its fields.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211124002325.1737739-14-andrii@kernel.org
Buf can be not zero-terminated leading to strstr() to access data beyond
the intended buf[] array. Fix by forcing zero termination.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211124002325.1737739-12-andrii@kernel.org
Perfbuf doesn't guarantee 8-byte alignment of the data like BPF ringbuf
does, so struct get_stack_trace_t can arrive not properly aligned for
subsequent u64 accesses. Easiest fix is to just copy data locally.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211124002325.1737739-10-andrii@kernel.org
Prevent sanitizer from complaining about passing NULL into memcpy(),
even if it happens with zero size.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211124002325.1737739-9-andrii@kernel.org
Test is using __int128 variable as unsigned and highest order bit can be
set to 1 after bit shift. Use unsigned __int128 explicitly and prevent
UBSan from complaining.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211124002325.1737739-8-andrii@kernel.org
Conversion is straightforward for most cases. In few cases tests are
using mutable map_flags and attribute structs, but bpf_map_create_opts
can be used in the similar fashion, so there were no problems. Just lots
of repetitive conversions.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211124193233.3115996-5-andrii@kernel.org
Add selftest that combines two BPF programs within single BPF object
file such that one of the programs is using global variables, but can be
skipped at runtime on old kernels that don't support global data.
Another BPF program is written with the goal to be runnable on very old
kernels and only relies on explicitly accessed BPF maps.
Such test, run against old kernels (e.g., libbpf CI will run it against 4.9
kernel that doesn't support global data), allows to test the approach
and ensure that libbpf doesn't make unnecessary assumption about
necessary kernel features.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211123200105.387855-2-andrii@kernel.org
The /sys/class/dax compatibility option has shipped in the kernel for 4
years now which should be sufficient time for tools to abandon the old
ABI in favor of the /sys/bus/dax device-model. Delete it now and see if
anyone screams.
Since this compatibility option shipped there has been more reports of
users being surprised by the compat ABI than surprised by the "new", so
the compat infrastructure has outlived its usefulness. Recall that
/sys/bus/dax device-model is required for the dax kmem driver which
allows PMEM to be used as "System RAM".
The following projects were known to have a dependency on /sys/class/dax
and have dropped their dependency as of the listed version:
- ndctl (including libndctl, daxctl, and libdaxctl): v64+
- fio: v3.13+
- pmdk: v1.5.2+
As further evidence this option is no longer needed some distributions
have already stopped enabling CONFIG_DEV_DAX_PMEM_COMPAT.
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Jane Chu <jane.chu@oracle.com>
Link: https://lore.kernel.org/r/163701116195.3784476.726128179293466337.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The commit 1295e2cf30 ("inet: minor optimization for backlog setting in
listen(2)") added change so that sk_max_ack_backlog is initialised earlier
in inet_dccp_listen() and inet_listen(). Since then, we no longer use
backlog in inet_csk_listen_start(), so let's remove it.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Richard Sailer <richard_siegfried@systemli.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix trivial typo in comment from 'oveflow' to 'overflow'.
Reported-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Drew Fustini <dfustini@baylibre.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211122070528.837806-1-dfustini@baylibre.com
The new selftest runs a sequence which causes circular refcount
dependency between deleted objects which cannot be released and results
in a netdevice refcount imbalance.
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
old tc(iproute2-5.9.0) output:
action order 1: bpf action.o:[action-ok] id 60 tag bcf7977d3b93787c jited default-action pipe
newer tc(iproute2-5.14.0) output:
action order 1: bpf action.o:[action-ok] id 64 name tag bcf7977d3b93787c jited default-action pipe
It can fix below errors:
# ok 260 f84a - Add cBPF action with invalid bytecode
# not ok 261 e939 - Add eBPF action with valid object-file
# Could not match regex pattern. Verify command output:
# total acts 0
#
# action order 1: bpf action.o:[action-ok] id 42 name tag bcf7977d3b93787c jited default-action pipe
# index 667 ref 1 bind 0
Signed-off-by: Li Zhijian <zhijianx.li@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We should not always presume all kernels use pfifo_fast as the default qdisc.
For example, a fq_codel qdisk could have below output:
qdisc fq_codel 0: parent 1:4 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Li Zhijian <zhijianx.li@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No hard dependencies here, just skip if test environ lacks
nft binary or the needed kernel config options.
The test case spawns listener in ns2 but ns1 will connect
to the ip address of ns4.
policy routing + tproxy rule will redirect packets to ns2 instead
of forward.
v3: - update mptcp/config (Mat Martineau)
- more verbose SKIP messages in mptcp_connect.sh
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bpf_program__set_extra_flags has just been introduced so we can still
change it without breaking users.
This new interface is a bit more flexible (for example if someone wants
to clear a flag).
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211119180035.1396139-1-revest@chromium.org
Add an artificial minimal example simulating compilers producing two
different types within a single CU that correspond to identical struct
definitions.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211117194114.347675-2-andrii@kernel.org
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter/IPVS fixes for net:
1) Add selftest for vrf+conntrack, from Florian Westphal.
2) Extend nfqueue selftest to cover nfqueue, also from Florian.
3) Remove duplicated include in nft_payload, from Wan Jiabing.
4) Several improvements to the nat port shadowing selftest,
from Phil Sutter.
5) Fix filtering of reply tuple in ctnetlink, from Florent Fourcot.
6) Do not override error with -EINVAL in filter setup path, also
from Florent.
7) Honor sysctl_expire_nodest_conn regardless conn_reuse_mode for
reused connections, from yangxingwu.
8) Replace snprintf() by sysfs_emit() in xt_IDLETIMER as reported
by Coccinelle, from Jing Yao.
9) Incorrect IPv6 tunnel match in flowtable offload, from Will
Mortensen.
10) Switch port shadow selftest to use socat, from Florian Westphal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Current release - regressions:
- devlink: don't throw an error if flash notification sent before
devlink visible
- page_pool: Revert "page_pool: disable dma mapping support...",
turns out there are active arches who need it
Current release - new code bugs:
- amt: cancel delayed_work synchronously in amt_fini()
Previous releases - regressions:
- xsk: fix crash on double free in buffer pool
- bpf: fix inner map state pruning regression causing program
rejections
- mac80211: drop check for DONT_REORDER in __ieee80211_select_queue,
preventing mis-selecting the best effort queue
- mac80211: do not access the IV when it was stripped
- mac80211: fix radiotap header generation, off-by-one
- nl80211: fix getting radio statistics in survey dump
- e100: fix device suspend/resume
Previous releases - always broken:
- tcp: fix uninitialized access in skb frags array for Rx 0cp
- bpf: fix toctou on read-only map's constant scalar tracking
- bpf: forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing progs
- tipc: only accept encrypted MSG_CRYPTO msgs
- smc: transfer remaining wait queue entries during fallback,
fix missing wake ups
- udp: validate checksum in udp_read_sock() (when sockmap is used)
- sched: act_mirred: drop dst for the direction from egress to ingress
- virtio_net_hdr_to_skb: count transport header in UFO, prevent
allowing bad skbs into the stack
- nfc: reorder the logic in nfc_{un,}register_device, fix unregister
- ipsec: check return value of ipv6_skip_exthdr
- usb: r8152: add MAC passthrough support for more Lenovo Docks
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGWf08ACgkQMUZtbf5S
Irt+lxAAj8FAoLoSmQKUK3LttLLh0ZQQXu8Riey+wrP8Z9Yp8xWXIaVRF1c0vCE6
clbrF+mLfk6Wvv/RzOgwyBMHvK+djr/oVDNSmjlRvss4MLDfOQZhUV8V4XpvF4Up
hI7wyKfHtd7niosNqil6wklJFpLU8WyIAWrPSIPE6JlPkJmcm3GUGsliwEPwdLY1
yl7z4zsxigjA+hKxYqNQX6tixF3xnbDUbAnWshrSPL89melwz4GMao45qmcxJEVr
EipPhKifk0hT067jG08FMXcKBFKt6rKk7SVNo4mtq8Tl6HleJBj8fdaJAjSdFahB
+rYJ0sDZwGoDL5CxZ5mD3fM1cDgh4WFEM0z//0b/bZhoPDRKEpLr9LPuv+N6+/rA
8D98EHsvyNjlFgdyd8celMstiGtBn4YLEoLNYYh9Qibgm0XsCuv0yox7g0AOLMmQ
QiBmh2EnaXNPQ8PRZNMK3VH5ol2KoYWL6yrpJYV+wOWVLfezwlSsjkPSfW5pF9FG
hU0iQBp/YTCdCadR9YLj8qfDWDUAkCN7WpqIu9EA9FXJcYjJVaix0MA/tAVlzXyR
xlB7cU6O5NABcs/+04zPkKLwKbVYNMqgvKE+FVDVm+BKxo0UMxcmz/Np/ZYxfhkF
bwKplaiPb2H4D6t0sdxqaeYirPwt1BcleLilae6vHG1jO90H9Vw=
=tlqV
-----END PGP SIGNATURE-----
Merge tag 'net-5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf, mac80211.
Current release - regressions:
- devlink: don't throw an error if flash notification sent before
devlink visible
- page_pool: Revert "page_pool: disable dma mapping support...",
turns out there are active arches who need it
Current release - new code bugs:
- amt: cancel delayed_work synchronously in amt_fini()
Previous releases - regressions:
- xsk: fix crash on double free in buffer pool
- bpf: fix inner map state pruning regression causing program
rejections
- mac80211: drop check for DONT_REORDER in __ieee80211_select_queue,
preventing mis-selecting the best effort queue
- mac80211: do not access the IV when it was stripped
- mac80211: fix radiotap header generation, off-by-one
- nl80211: fix getting radio statistics in survey dump
- e100: fix device suspend/resume
Previous releases - always broken:
- tcp: fix uninitialized access in skb frags array for Rx 0cp
- bpf: fix toctou on read-only map's constant scalar tracking
- bpf: forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing
progs
- tipc: only accept encrypted MSG_CRYPTO msgs
- smc: transfer remaining wait queue entries during fallback, fix
missing wake ups
- udp: validate checksum in udp_read_sock() (when sockmap is used)
- sched: act_mirred: drop dst for the direction from egress to
ingress
- virtio_net_hdr_to_skb: count transport header in UFO, prevent
allowing bad skbs into the stack
- nfc: reorder the logic in nfc_{un,}register_device, fix unregister
- ipsec: check return value of ipv6_skip_exthdr
- usb: r8152: add MAC passthrough support for more Lenovo Docks"
* tag 'net-5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (96 commits)
ptp: ocp: Fix a couple NULL vs IS_ERR() checks
net: ethernet: dec: tulip: de4x5: fix possible array overflows in type3_infoblock()
net: tulip: de4x5: fix the problem that the array 'lp->phy[8]' may be out of bound
ipv6: check return value of ipv6_skip_exthdr
e100: fix device suspend/resume
devlink: Don't throw an error if flash notification sent before devlink visible
page_pool: Revert "page_pool: disable dma mapping support..."
ethernet: hisilicon: hns: hns_dsaf_misc: fix a possible array overflow in hns_dsaf_ge_srst_by_port()
octeontx2-af: debugfs: don't corrupt user memory
NFC: add NCI_UNREG flag to eliminate the race
NFC: reorder the logic in nfc_{un,}register_device
NFC: reorganize the functions in nci_request
tipc: check for null after calling kmemdup
i40e: Fix display error code in dmesg
i40e: Fix creation of first queue by omitting it if is not power of two
i40e: Fix warning message and call stack during rmmod i40e driver
i40e: Fix ping is lost after configuring ADq on VF
i40e: Fix changing previously set num_queue_pairs for PFs
i40e: Fix NULL ptr dereference on VSI filter sync
i40e: Fix correct max_pkt_size on VF RX queue
...
* Cleanups for the perf test infrastructure and mapping hugepages
* Avoid contention on mmap_sem when the guests start to run
* Add event channel upcall support to xen_shinfo_test
x86 changes:
* Fixes for Xen emulation
* Kill kvm_map_gfn() / kvm_unmap_gfn() and broken gfn_to_pfn_cache
* Fixes for migration of 32-bit nested guests on 64-bit hypervisor
* Compilation fixes
* More SEV cleanups
Generic:
* Cap the return value of KVM_CAP_NR_VCPUS to both KVM_CAP_MAX_VCPUS
and num_online_cpus(). Most architectures were only using one of the two.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGV/PAUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMrogf/eAyilGRQL7lLETn3DTVlgLVv82+z
giX11HlUhUmATHIDluj/wVQUjVcY6AO4SnvFaudX7B+mibndkw4L19IubP/koQZu
xnKSJTn+mVANdzz3UdsHl0ujbPdQJaFCIPW6iewbn2GRRZMwA5F3vMK/H09XRApL
I7kq8CPA6sC0I3TPzPN3ROxigexzYunZmGQ4qQe0GUdtxHrJOYQN++ddmWbQoEIC
gdFTyF7CUQ+lmJe0b/Y88yhISFAJCEBuKFlg9tOTuxSfwvPX6lUu+pi+utEx9M+O
ckTSQli/apZ4RVcSzxMIwX/BciYqhqOz5uMG+w4DRlJixtGSHtjiEVxGxw==
=Iij4
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"Selftest changes:
- Cleanups for the perf test infrastructure and mapping hugepages
- Avoid contention on mmap_sem when the guests start to run
- Add event channel upcall support to xen_shinfo_test
x86 changes:
- Fixes for Xen emulation
- Kill kvm_map_gfn() / kvm_unmap_gfn() and broken gfn_to_pfn_cache
- Fixes for migration of 32-bit nested guests on 64-bit hypervisor
- Compilation fixes
- More SEV cleanups
Generic:
- Cap the return value of KVM_CAP_NR_VCPUS to both KVM_CAP_MAX_VCPUS
and num_online_cpus(). Most architectures were only using one of
the two"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits)
KVM: x86: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
KVM: s390: Cap KVM_CAP_NR_VCPUS by num_online_cpus()
KVM: RISC-V: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
KVM: PPC: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
KVM: arm64: Cap KVM_CAP_NR_VCPUS by kvm_arm_default_max_vcpus()
KVM: x86: Assume a 64-bit hypercall for guests with protected state
selftests: KVM: Add /x86_64/sev_migrate_tests to .gitignore
riscv: kvm: fix non-kernel-doc comment block
KVM: SEV: Fix typo in and tweak name of cmd_allowed_from_miror()
KVM: SEV: Drop a redundant setting of sev->asid during initialization
KVM: SEV: WARN if SEV-ES is marked active but SEV is not
KVM: SEV: Set sev_info.active after initial checks in sev_guest_init()
KVM: SEV: Disallow COPY_ENC_CONTEXT_FROM if target has created vCPUs
KVM: Kill kvm_map_gfn() / kvm_unmap_gfn() and gfn_to_pfn_cache
KVM: nVMX: Use a gfn_to_hva_cache for vmptrld
KVM: nVMX: Use kvm_read_guest_offset_cached() for nested VMCS check
KVM: x86/xen: Use sizeof_field() instead of open-coding it
KVM: nVMX: Use kvm_{read,write}_guest_cached() for shadow_vmcs12
KVM: x86/xen: Fix get_attr of KVM_XEN_ATTR_TYPE_SHARED_INFO
...
[1] added s390 support to libbpf CI and added an ${ARCH} prefix to a
number of paths and identifiers in libbpf GitHub repo, which vmtest.sh
relies upon. Update these and make use of the new s390 support.
[1] https://github.com/libbpf/libbpf/pull/204
Co-developed-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211118115225.1349726-1-iii@linux.ibm.com
$ git status
nothing to commit, working tree clean
$
$ make -C tools/testing/selftests/kvm/ > /dev/null 2>&1
$ git status
Untracked files:
(use "git add <file>..." to include in what will be committed)
tools/testing/selftests/kvm/x86_64/sev_migrate_tests
nothing added to commit but untracked files present (use "git add" to track)
$
Fixes: 6a58150859 ("selftest: KVM: Add intra host migration tests")
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Marc Orr <marcorr@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Gonda <pgonda@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Message-Id: <YZPIPfvYgRDCZi/w@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add support to inet v4 raw sockets for binding to nonlocal addresses
through the IP_FREEBIND and IP_TRANSPARENT socket options, as well as
the ipv4.ip_nonlocal_bind kernel parameter.
Add helper function to inet_sock.h to check for bind address validity on
the base of the address type and whether nonlocal address are enabled
for the socket via any of the sockopts/sysctl, deduplicating checks in
ipv4/ping.c, ipv4/af_inet.c, ipv6/af_inet6.c (for mapped v4->v6
addresses), and ipv4/raw.c.
Add test cases with IP[V6]_FREEBIND verifying that both v4 and v6 raw
sockets support binding to nonlocal addresses after the change. Add
necessary support for the test cases to nettest.
Signed-off-by: Riccardo Paolo Bestetti <pbl@bestov.io>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20211117090010.125393-1-pbl@bestov.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
xsk_configure_umem() needs hugepages to work in unaligned mode. So when
hugepages are not configured, 'unaligned' tests should be skipped which
is determined by the helper function hugepages_present(). This function
erroneously returns true with MAP_NORESERVE flag even when no hugepages
are configured. The removal of this flag fixes the issue.
The test TEST_TYPE_UNALIGNED_INV_DESC also needs to be skipped when
there are no hugepages. However, this was not skipped as there was no
check for presence of hugepages and hence was failing. The check to skip
the test has now been added.
Fixes: a4ba98dd0c (selftests: xsk: Add test for unaligned mode)
Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211117123613.22288-1-tirthendu.sarkar@intel.com
add a selftest that verifies the correct behavior of TC act_mirred egress
to ingress: in particular, it checks if the dst_entry is removed from skb
before redirect egress -> ingress. The correct behavior is: an ICMP 'echo
request' generated by ping will be received and generate a reply the same
way as the one generated by mausezahn.
Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <cong.wang@bytedance.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Daniel Borkmann says:
====================
pull-request: bpf 2021-11-16
We've added 12 non-merge commits during the last 5 day(s) which contain
a total of 23 files changed, 573 insertions(+), 73 deletions(-).
The main changes are:
1) Fix pruning regression where verifier went overly conservative rejecting
previsouly accepted programs, from Alexei Starovoitov and Lorenz Bauer.
2) Fix verifier TOCTOU bug when using read-only map's values as constant
scalars during verification, from Daniel Borkmann.
3) Fix a crash due to a double free in XSK's buffer pool, from Magnus Karlsson.
4) Fix libbpf regression when cross-building runqslower, from Jean-Philippe Brucker.
5) Forbid use of bpf_ktime_get_coarse_ns() and bpf_timer_*() helpers in tracing
programs due to deadlock possibilities, from Dmitrii Banshchikov.
6) Fix checksum validation in sockmap's udp_read_sock() callback, from Cong Wang.
7) Various BPF sample fixes such as XDP stats in xdp_sample_user, from Alexander Lobakin.
8) Fix libbpf gen_loader error handling wrt fd cleanup, from Kumar Kartikeya Dwivedi.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
udp: Validate checksum in udp_read_sock()
bpf: Fix toctou on read-only map's constant scalar tracking
samples/bpf: Fix build error due to -isystem removal
selftests/bpf: Add tests for restricted helpers
bpf: Forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing progs
libbpf: Perform map fd cleanup for gen_loader in case of error
samples/bpf: Fix incorrect use of strlen in xdp_redirect_cpu
tools/runqslower: Fix cross-build
samples/bpf: Fix summary per-sec stats in xdp_sample_user
selftests/bpf: Check map in map pruning
bpf: Fix inner map state pruning regression.
xsk: Fix crash on double free in buffer pool
====================
Link: https://lore.kernel.org/r/20211116141134.6490-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- Cleanups for the perf test infrastructure and mapping hugepages
- Avoid contention on mmap_sem when the guests start to run
- Add event channel upcall support to xen_shinfo_test
Add benchmark to measure overhead of uprobes and uretprobes. Also have
a baseline (no uprobe attached) benchmark.
On my dev machine, baseline benchmark can trigger 130M user_target()
invocations. When uprobe is attached, this falls to just 700K. With
uretprobe, we get down to 520K:
$ sudo ./bench trig-uprobe-base -a
Summary: hits 131.289 ± 2.872M/s
# UPROBE
$ sudo ./bench -a trig-uprobe-without-nop
Summary: hits 0.729 ± 0.007M/s
$ sudo ./bench -a trig-uprobe-with-nop
Summary: hits 1.798 ± 0.017M/s
# URETPROBE
$ sudo ./bench -a trig-uretprobe-without-nop
Summary: hits 0.508 ± 0.012M/s
$ sudo ./bench -a trig-uretprobe-with-nop
Summary: hits 0.883 ± 0.008M/s
So there is almost 2.5x performance difference between probing nop vs
non-nop instruction for entry uprobe. And 1.7x difference for uretprobe.
This means that non-nop uprobe overhead is around 1.4 microseconds for uprobe
and 2 microseconds for non-nop uretprobe.
For nop variants, uprobe and uretprobe overhead is down to 0.556 and
1.13 microseconds, respectively.
For comparison, just doing a very low-overhead syscall (with no BPF
programs attached anywhere) gives:
$ sudo ./bench trig-base -a
Summary: hits 4.830 ± 0.036M/s
So uprobes are about 2.67x slower than pure context switch.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211116013041.4072571-1-andrii@kernel.org
Script test_bpftool_synctypes.py parses a number of files in the bpftool
directory (or even elsewhere in the repo) to make sure that the list of
types or options in those different files are consistent. Instead of
having fixed paths, let's make the directories configurable through
environment variable. This should make easier in the future to run the
script in a different setup, for example on an out-of-tree bpftool
mirror with a different layout.
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211115225844.33943-4-quentin@isovalent.com
test_bpftool_synctypes.py helps detecting inconsistencies in bpftool
between the different list of types and options scattered in the
sources, the documentation, and the bash completion. For options that
apply to all bpftool commands, the script had a hardcoded list of
values, and would use them to check whether the man pages are
up-to-date. When writing the script, it felt acceptable to have this
list in order to avoid to open and parse bpftool's main.h every time,
and because the list of global options in bpftool doesn't change so
often.
However, this is prone to omissions, and we recently added a new
-l|--legacy option which was described in common_options.rst, but not
listed in the options summary of each manual page. The script did not
complain, because it keeps comparing the hardcoded list to the (now)
outdated list in the header file.
To address the issue, this commit brings the following changes:
- Options that are common to all bpftool commands (--json, --pretty, and
--debug) are moved to a dedicated file, and used in the definition of
a RST substitution. This substitution is used in the sources of all
the man pages.
- This list of common options is updated, with the addition of the new
-l|--legacy option.
- The script test_bpftool_synctypes.py is updated to compare:
- Options specific to a command, found in C files, for the
interactive help messages, with the same specific options from the
relevant man page for that command.
- Common options, checked just once: the list in main.h is
compared with the new list in substitutions.rst.
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211115225844.33943-3-quentin@isovalent.com
Change memslot_modification_stress_test to use perf_test_destroy_vm
instead of manually calling ucall_uninit and kvm_vm_free.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Message-Id: <20211111001257.1446428-5-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Thread creation requires taking the mmap_sem in write mode, which causes
vCPU threads running in guest mode to block while they are populating
memory. Fix this by waiting for all vCPU threads to be created and start
running before entering guest mode on any one vCPU thread.
This substantially improves the "Populate memory time" when using 1GiB
pages since it allows all vCPUs to zero pages in parallel rather than
blocking because a writer is waiting (which is waiting for another vCPU
that is busy zeroing a 1GiB page).
Before:
$ ./dirty_log_perf_test -v256 -s anonymous_hugetlb_1gb
...
Populate memory time: 52.811184013s
After:
$ ./dirty_log_perf_test -v256 -s anonymous_hugetlb_1gb
...
Populate memory time: 10.204573342s
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20211111001257.1446428-4-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move vCPU thread creation and joining to common helper functions. This
is in preparation for the next commit which ensures that all vCPU
threads are fully created before entering guest mode on any one
vCPU.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Message-Id: <20211111001257.1446428-3-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Start at iteration 0 instead of -1 to avoid having to initialize
vcpu_last_completed_iteration when setting up vCPU threads. This
simplifies the next commit where we move vCPU thread initialization
out to a common helper.
No functional change intended.
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20211111001257.1446428-2-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Copy perf_test_args to the guest during VM creation instead of relying on
the caller to do so at their leisure. Ideally, tests wouldn't even be
able to modify perf_test_args, i.e. they would have no motivation to do
the sync, but enforcing that is arguably a net negative for readability.
No functional change intended.
[Set wr_fract=1 by default and add helper to override it since the new
access_tracking_perf_test needs to set it dynamically.]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Message-Id: <20211111000310.1435032-13-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fill the per-vCPU args when creating the perf_test VM instead of having
the caller do so. This helps ensure that any adjustments to the number
of pages (and thus vcpu_memory_bytes) are reflected in the per-VM args.
Automatically filling the per-vCPU args will also allow a future patch
to do the sync to the guest during creation.
Signed-off-by: Sean Christopherson <seanjc@google.com>
[Updated access_tracking_perf_test as well.]
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Message-Id: <20211111000310.1435032-12-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use the already computed guest_num_pages when creating the so called
extra VM pages for a perf test, and add a comment explaining why the
pages are allocated as extra pages.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20211111000310.1435032-11-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove perf_test_args.host_page_size and instead use getpagesize() so
that it's somewhat obvious that, for tests that care about the host page
size, they care about the system page size, not the hardware page size,
e.g. that the logic is unchanged if hugepages are in play.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20211111000310.1435032-10-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the per-VM GPA into perf_test_args instead of storing it as a
separate global variable. It's not obvious that guest_test_phys_mem
holds a GPA, nor that it's connected/coupled with per_vcpu->gpa.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20211111000310.1435032-9-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Grab the per-vCPU GPA and number of pages from perf_util in the demand
paging test instead of duplicating perf_util's calculations.
Note, this may or may not result in a functional change. It's not clear
that the test's calculations are guaranteed to yield the same value as
perf_util, e.g. if guest_percpu_mem_size != vcpu_args->pages.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20211111000310.1435032-8-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Capture the per-vCPU GPA in perf_test_vcpu_args so that tests can get
the GPA without having to calculate the GPA on their own.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20211111000310.1435032-7-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use 'pta' as a local pointer to the global perf_tests_args in order to
shorten line lengths and make the code borderline readable.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20211111000310.1435032-6-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Assert that the GPA for a memslot backed by a hugepage is aligned to
the hugepage size and fix perf_test_util accordingly. Lack of GPA
alignment prevents KVM from backing the guest with hugepages, e.g. x86's
write-protection of hugepages when dirty logging is activated is
otherwise not exercised.
Add a comment explaining that guest_page_size is for non-huge pages to
try and avoid confusion about what it actually tracks.
Cc: Ben Gardon <bgardon@google.com>
Cc: Yanan Wang <wangyanan55@huawei.com>
Cc: Andrew Jones <drjones@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Aaron Lewis <aaronlewis@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
[Used get_backing_src_pagesz() to determine alignment dynamically.]
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20211111000310.1435032-5-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Manually padding and aligning the mmap region is only needed when using
THP. When using HugeTLB, mmap will always return an address aligned to
the HugeTLB page size. Add a comment to clarify this and assert the mmap
behavior for HugeTLB.
[Removed requirement that HugeTLB mmaps must be padded per Yanan's
feedback and added assertion that mmap returns aligned addresses
when using HugeTLB.]
Cc: Ben Gardon <bgardon@google.com>
Cc: Yanan Wang <wangyanan55@huawei.com>
Cc: Andrew Jones <drjones@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Aaron Lewis <aaronlewis@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20211111000310.1435032-4-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Refactor align() to work with non-pointers and split into separate
helpers for aligning up vs. down. Add align_ptr_up() for use with
pointers. Expose all helpers so that they can be used by tests and/or
other utilities. The align_down() helper in particular will be used to
ensure gpa alignment for hugepages.
No functional change intended.
[Added sepearate up/down helpers and replaced open-coded alignment
bit math throughout the KVM selftests.]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Message-Id: <20211111000310.1435032-3-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Explicitly state the indices when populating vm_guest_mode_params to
make it marginally easier to visualize what's going on.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
[Added indices for new guest modes.]
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20211111000310.1435032-2-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When I first looked at this, there was no support for guest exception
handling in the KVM selftests. In fact it was merged into 5.10 before
the Xen support got merged in 5.11, and I could have used it from the
start.
Hook it up now, to exercise the Xen upcall delivery. I'm about to make
things a bit more interesting by handling the full 2level event channel
stuff in-kernel on top of the basic vector injection that we already
have, and I'll want to build more tests on top.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20211115165030.7422-3-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Without previous libbpf patch, the following error will occur:
$ ./test_progs -t btf
...
do_test_dedup:FAIL:check btf_dedup failed errno:-22#13/205 btf/dedup: btf_type_tag #5, struct:FAIL
And the previous libbpf patch fixed the issue.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211115163943.3922547-1-yhs@fb.com
This patch adds tests that bpf_ktime_get_coarse_ns(), bpf_timer_* and
bpf_spin_lock()/bpf_spin_unlock() helpers are forbidden in tracing progs
as their use there may result in various locking issues.
Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211113142227.566439-3-me@ubique.spb.ru
Each thread executing in an enclave is associated with a Thread Control
Structure (TCS). The SGX test enclave contains two hardcoded TCS, thus
supporting two threads in the enclave.
Add a test to ensure it is possible to enter enclave at both entrypoints.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/7be151a57b4c7959a2364753b995e0006efa3da1.1636997631.git.reinette.chatre@intel.com
Each thread executing in an enclave is associated with a Thread Control
Structure (TCS). The test enclave contains two hardcoded TCS. Each TCS
contains meta-data used by the hardware to save and restore thread specific
information when entering/exiting the enclave.
The two TCS structures within the test enclave share their SSA (State Save
Area) resulting in the threads clobbering each other's data. Fix this by
providing each TCS their own SSA area.
Additionally, there is an 8K stack space and its address is
computed from the enclave entry point which is correctly done for
TCS #1 that starts on the first address inside the enclave but
results in out of bounds memory when entering as TCS #2. Split 8K
stack space into two separate pages with offset symbol between to ensure
the current enclave entry calculation can continue to be used for both
threads.
While using the enclave with multiple threads requires these fixes the
impact is not apparent because every test up to this point enters the
enclave from the first TCS.
More detail about the stack fix:
-------------------------------
Before this change the test enclave (test_encl) looks as follows:
.tcs (2 pages):
(page 1) TCS #1
(page 2) TCS #2
.text (1 page)
One page of code
.data (5 pages)
(page 1) encl_buffer
(page 2) encl_buffer
(page 3) SSA
(page 4 and 5) STACK
encl_stack:
As shown above there is a symbol, encl_stack, that points to the end of the
.data segment (pointing to the end of page 5 in .data) which is also the
end of the enclave.
The enclave entry code computes the stack address by adding encl_stack to
the pointer to the TCS that entered the enclave. When entering at TCS #1
the stack is computed correctly but when entering at TCS #2 the stack
pointer would point to one page beyond the end of the enclave and a #PF
would result when TCS #2 attempts to enter the enclave.
The fix involves moving the encl_stack symbol between the two stack pages.
Doing so enables the stack address computation in the entry code to compute
the correct stack address for each TCS.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/a49dc0d85401db788a0a3f0d795e848abf3b1f44.1636997631.git.reinette.chatre@intel.com
The Enclave Page Cache Map (EPCM) is a secure structure used by the
processor to track the contents of the enclave page cache. The EPCM
contains permissions with which enclave pages can be accessed. SGX
support allows EPCM and PTE page permissions to differ - as long as
the PTE permissions do not exceed the EPCM permissions.
Add a test that:
(1) Creates an SGX enclave page with writable EPCM permission.
(2) Changes the PTE permission on the page to read-only. This should
be permitted because the permission does not exceed the EPCM
permission.
(3) Attempts a write to the page. This should generate a page fault
(#PF) because of the read-only PTE even though the EPCM
permissions allow the page to be written to.
This introduces the first test of SGX exception handling. In this test
the issue that caused the exception (PTE page permissions) can be fixed
from outside the enclave and after doing so it is possible to re-enter
enclave at original entrypoint with ERESUME.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/3bcc73a4b9fe8780bdb40571805e7ced59e01df7.1636997631.git.reinette.chatre@intel.com
SGX selftests prepares a data structure outside of the enclave with
the type of and data for the operation that needs to be run within
the enclave. At this time only two complementary operations are supported
by the enclave: copying a value from outside the enclave into a default
buffer within the enclave and reading a value from the enclave's default
buffer into a variable accessible outside the enclave.
In preparation for more operations supported by the enclave the names of the
current enclave operations are changed to more accurately reflect the
operations and more easily distinguish it from future operations:
* The enums ENCL_OP_PUT and ENCL_OP_GET are renamed to ENCL_OP_PUT_TO_BUFFER
and ENCL_OP_GET_FROM_BUFFER respectively.
* The structs encl_op_put and encl_op_get are renamed to encl_op_put_to_buf
and encl_op_get_from_buf respectively.
* The enclave functions do_encl_op_put and do_encl_op_get are renamed to
do_encl_op_put_to_buf and do_encl_op_get_from_buf respectively.
No functional changes.
Suggested-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/023fda047c787cf330b88ed9337705edae6a0078.1636997631.git.reinette.chatre@intel.com
To add more operations to the test enclave, the protocol needs to allow
to have operations with varying parameters. Create a separate parameter
struct for each existing operation, with the shared parameters in struct
encl_op_header.
[reinette: rebased to apply on top of oversubscription test series]
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/f9a4a8c436b538003b8ebddaa66083992053cef1.1636997631.git.reinette.chatre@intel.com
Add a variation of the unclobbered_vdso test.
In the new test, create a heap for the test enclave, which has the same
size as all available Enclave Page Cache (EPC) pages in the system. This
will guarantee that all test_encl.elf pages *and* SGX Enclave Control
Structure (SECS) have been swapped out by the page reclaimer during the
load time.
This test will trigger both the page reclaimer and the page fault handler.
The page reclaimer triggered, while the heap is being created during the
load time. The page fault handler is triggered for all the required pages,
while the test case is executing.
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/41f7c508eea79a3198b5014d7691903be08f9ff1.1636997631.git.reinette.chatre@intel.com
Create the test enclave inside each TEST_F(), instead of FIXTURE_SETUP(),
so that the heap size can be defined per test.
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/70ca264535d2ca0dc8dcaf2281e7d6965f8d4a24.1636997631.git.reinette.chatre@intel.com
Introduce setup_test_encl() so that the enclave creation can be moved to
TEST_F()'s. This is required for a reclaimer test where the heap size needs
to be set large enough to triger the page reclaimer.
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/bee0ca867a95828a569c1ba2a8e443a44047dc71.1636997631.git.reinette.chatre@intel.com
Logging is always a compromise between clarity and detail. The main use
case for dumping VMA's is when FIXTURE_SETUP() fails, and is less important
for enclaves that do initialize correctly. Therefore, print the segments
and /proc/self/maps only in the error case.
Finally, if a single test ever creates multiple enclaves, the amount of
log lines would become enormous.
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/23cef0ae1de3a8a74cbfbbe74eca48ca3f300fde.1636997631.git.reinette.chatre@intel.com
Create a heap for the test enclave, which is allocated from /dev/null,
and left unmeasured. This is beneficial by its own because it verifies
that an enclave built from multiple choices, works properly. If LSM
hooks are added for SGX some day, a multi source enclave has higher
probability to trigger bugs on access control checks.
The immediate need comes from the need to implement page reclaim tests.
In order to trigger the page reclaimer, one can just set the size of
the heap to high enough.
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/e070c5f23578c29608051cab879b1d276963a27a.1636997631.git.reinette.chatre@intel.com
For a heap makes sense to leave its contents "unmeasured" in the SGX
enclave build process, meaning that they won't contribute to the
cryptographic signature (a RSA-3072 signed SHA56 hash) of the enclave.
Enclaves are signed blobs where the signature is calculated both from
page data and also from "structural properties" of the pages. For
instance a page offset of *every* page added to the enclave is hashed.
For data, this is optional, not least because hashing a page has a
significant contribution to the enclave load time. Thus, where there is
no reason to hash, do not. The SGX ioctl interface supports this with
SGX_PAGE_MEASURE flag. Only when the flag is *set*, data is measured.
Add seg->measure boolean flag to struct encl_segment. Only when the
flag is set, include the segment data to the signature (represented
by SIGSTRUCT architectural structure).
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/625b6fe28fed76275e9238ec4e15ec3c0d87de81.1636997631.git.reinette.chatre@intel.com
Define source per segment so that enclave pages can be added from different
sources, e.g. anonymous VMA for zero pages. In other words, add 'src' field
to struct encl_segment, and assign it to 'encl->src' for pages inherited
from the enclave binary.
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/7850709c3089fe20e4bcecb8295ba87c54cc2b4a.1636997631.git.reinette.chatre@intel.com
The enclave binary (test_encl.elf) is built with only three sections (tcs,
text, and data) as controlled by its custom linker script.
If gcc is built with "--enable-linker-build-id" (this appears to be a
common configuration even if it is by default off) then gcc
will pass "--build-id" to the linker that will prompt it (the linker) to
write unique bits identifying the linked file to a ".note.gnu.build-id"
section.
The section ".note.gnu.build-id" does not exist in the test enclave
resulting in the following warning emitted by the linker:
/usr/bin/ld: warning: .note.gnu.build-id section discarded, --build-id
ignored
The test enclave does not use the build id within the binary so fix the
warning by passing a build id of "none" to the linker that will disable the
setting from any earlier "--build-id" options and thus disable the attempt
to write the build id to a ".note.gnu.build-id" section that does not
exist.
Link: https://lore.kernel.org/linux-sgx/20191017030340.18301-2-sean.j.christopherson@intel.com/
Suggested-by: Cedric Xing <cedric.xing@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/ca0f8a81fc1e78af9bdbc6a88e0f9c37d82e53f2.1636997631.git.reinette.chatre@intel.com
Now that cxl_acpi has been converted to use the core ACPI CEDT sub-table
parser, update cxl_test to inject CFMWS and CHBS data directly into
cxl_acpi's handlers.
Cc: Alison Schofield <alison.schofield@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/163553711363.2509508.17428994087868269952.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Add mocked health information for cxl_test memdevs. This allows
cxl-cli's 'list' command to display the canned health_info fields.
Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20211018051251.2289112-1-vishal.l.verma@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The 'struct cxl_mem' object actually represents the state of a CXL
device within the driver. Comments indicating that 'struct cxl_mem' is a
device itself are incorrect. It is data layered on top of a CXL Memory
Expander class device. Rename it 'struct cxl_dev_state'. The 'struct'
cxl_memdev' structure represents a Linux CXL memory device object, and
it uses services and information provided by 'struct cxl_dev_state'.
Update the structure name, function names, and the kdocs to reflect the
real uses of this structure.
Some helper functions that were previously prefixed "cxl_mem_" are
renamed to just "cxl_".
Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20211102202901.3675568-3-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Daniel Borkmann says:
====================
pull-request: bpf-next 2021-11-15
We've added 72 non-merge commits during the last 13 day(s) which contain
a total of 171 files changed, 2728 insertions(+), 1143 deletions(-).
The main changes are:
1) Add btf_type_tag attributes to bring kernel annotations like __user/__rcu to
BTF such that BPF verifier will be able to detect misuse, from Yonghong Song.
2) Big batch of libbpf improvements including various fixes, future proofing APIs,
and adding a unified, OPTS-based bpf_prog_load() low-level API, from Andrii Nakryiko.
3) Add ingress_ifindex to BPF_SK_LOOKUP program type for selectively applying the
programmable socket lookup logic to packets from a given netdev, from Mark Pashmfouroush.
4) Remove the 128M upper JIT limit for BPF programs on arm64 and add selftest to
ensure exception handling still works, from Russell King and Alan Maguire.
5) Add a new bpf_find_vma() helper for tracing to map an address to the backing
file such as shared library, from Song Liu.
6) Batch of various misc fixes to bpftool, fixing a memory leak in BPF program dump,
updating documentation and bash-completion among others, from Quentin Monnet.
7) Deprecate libbpf bpf_program__get_prog_info_linear() API and migrate its users as
the API is heavily tailored around perf and is non-generic, from Dave Marchevsky.
8) Enable libbpf's strict mode by default in bpftool and add a --legacy option as an
opt-out for more relaxed BPF program requirements, from Stanislav Fomichev.
9) Fix bpftool to use libbpf_get_error() to check for errors, from Hengqi Chen.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (72 commits)
bpftool: Use libbpf_get_error() to check error
bpftool: Fix mixed indentation in documentation
bpftool: Update the lists of names for maps and prog-attach types
bpftool: Fix indent in option lists in the documentation
bpftool: Remove inclusion of utilities.mak from Makefiles
bpftool: Fix memory leak in prog_dump()
selftests/bpf: Fix a tautological-constant-out-of-range-compare compiler warning
selftests/bpf: Fix an unused-but-set-variable compiler warning
bpf: Introduce btf_tracing_ids
bpf: Extend BTF_ID_LIST_GLOBAL with parameter for number of IDs
bpftool: Enable libbpf's strict mode by default
docs/bpf: Update documentation for BTF_KIND_TYPE_TAG support
selftests/bpf: Clarify llvm dependency with btf_tag selftest
selftests/bpf: Add a C test for btf_type_tag
selftests/bpf: Rename progs/tag.c to progs/btf_decl_tag.c
selftests/bpf: Test BTF_KIND_DECL_TAG for deduplication
selftests/bpf: Add BTF_KIND_TYPE_TAG unit tests
selftests/bpf: Test libbpf API function btf__add_type_tag()
bpftool: Support BTF_KIND_TYPE_TAG
libbpf: Support BTF_KIND_TYPE_TAG
...
====================
Link: https://lore.kernel.org/r/20211115162008.25916-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
All the CFLAGS options were incorrectly removed in the recent rework
of the GPIO selftests. While some of the flags were specific to the old
implementation the remainder are still relevant. Restore those options.
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
When compiled with -Wall gpio-mockup-cdev.c reports an uninitialised
variable warning. This is a false positive, as the variable is ignored
in the case it is uninitialised, but initialise the variable anyway
to remove the warning.
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
The gpio selftests build against the system includes rather than the
headers from the linux tree. This results in the compile failing if
the system includes are outdated.
Prefer the headers from the linux tree, as per other selftests.
Fixes: 8bc395a6a2 ("selftests: gpio: rework and simplify test implementation")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
[Kent: reworded commit comment and added Fixes:]
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
There are now at least three distinct flavours of netcat/nc tool:
'original' version, one version ported from openbsd and nmap-ncat.
The script only works with original because it sets SOREUSEPORT option.
Other nc versions return 'port already in use' error and port shadow test fails:
PASS: inet IPv6 redirection for ns2-hMHcaRvx
nc: bind failed: Address already in use
ERROR: portshadow test default: got reply from "ROUTER", not CLIENT as intended
Switch to socat instead.
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Guest API and guest kernel support for SEV live migration
* SEV and SEV-ES intra-host migration
Bugfixes and cleanups for x86:
* Fix misuse of gfn-to-pfn cache when recording guest steal time / preempted status
* Fix selftests on APICv machines
* Fix sparse warnings
* Fix detection of KVM features in CPUID
* Cleanups for bogus writes to MSR_KVM_PV_EOI_EN
* Fixes and cleanups for MSR bitmap handling
* Cleanups for INVPCID
* Make x86 KVM_SOFT_MAX_VCPUS consistent with other architectures
Bugfixes for ARM:
* Fix finalization of host stage2 mappings
* Tighten the return value of kvm_vcpu_preferred_target()
* Make sure the extraction of ESR_ELx.EC is limited to architected bits
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGO1usUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOsXgf/d2CgZmTxZii/pOt0JKGSDKwRdoti
pfbmtXwTkagqg2tIJtEsxwvcc26Q2VLyg9mlEelX1I8KVlexSa8mDtbGWHLFilsc
JIZY8lE96wtlr9AyuN06K44QingDMIbXzlxcwO+muS3zTlSPsNkPcaVDk5PL35sN
Wy2GA6GCLWv6iTLCtlX5EzbcgkrR+Mypj4lGSdXGRqfBKVFOQjeVt+Q3YzOPuacS
03NBlYOmRxTnAGXfeAVs0/zEa69u/dYjBmwDPe6ERma4u8thHv0U/fDAh5C/ES7q
42Thes/JOYnEZiBQ1xZLsHqRII0achbJKZhmxqPwjRf/u6eXsfz0KY9FBg==
=kN8U
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more kvm updates from Paolo Bonzini:
"New x86 features:
- Guest API and guest kernel support for SEV live migration
- SEV and SEV-ES intra-host migration
Bugfixes and cleanups for x86:
- Fix misuse of gfn-to-pfn cache when recording guest steal time /
preempted status
- Fix selftests on APICv machines
- Fix sparse warnings
- Fix detection of KVM features in CPUID
- Cleanups for bogus writes to MSR_KVM_PV_EOI_EN
- Fixes and cleanups for MSR bitmap handling
- Cleanups for INVPCID
- Make x86 KVM_SOFT_MAX_VCPUS consistent with other architectures
Bugfixes for ARM:
- Fix finalization of host stage2 mappings
- Tighten the return value of kvm_vcpu_preferred_target()
- Make sure the extraction of ESR_ELx.EC is limited to architected
bits"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (34 commits)
KVM: SEV: unify cgroup cleanup code for svm_vm_migrate_from
KVM: x86: move guest_pv_has out of user_access section
KVM: x86: Drop arbitrary KVM_SOFT_MAX_VCPUS
KVM: Move INVPCID type check from vmx and svm to the common kvm_handle_invpcid()
KVM: VMX: Add a helper function to retrieve the GPR index for INVPCID, INVVPID, and INVEPT
KVM: nVMX: Clean up x2APIC MSR handling for L2
KVM: VMX: Macrofy the MSR bitmap getters and setters
KVM: nVMX: Handle dynamic MSR intercept toggling
KVM: nVMX: Query current VMCS when determining if MSR bitmaps are in use
KVM: x86: Don't update vcpu->arch.pv_eoi.msr_val when a bogus value was written to MSR_KVM_PV_EOI_EN
KVM: x86: Rename kvm_lapic_enable_pv_eoi()
KVM: x86: Make sure KVM_CPUID_FEATURES really are KVM_CPUID_FEATURES
KVM: x86: Add helper to consolidate core logic of SET_CPUID{2} flows
kvm: mmu: Use fast PF path for access tracking of huge pages when possible
KVM: x86/mmu: Properly dereference rcu-protected TDP MMU sptep iterator
KVM: x86: inhibit APICv when KVM_GUESTDBG_BLOCKIRQ active
kvm: x86: Convert return type of *is_valid_rdpmc_ecx() to bool
KVM: x86: Fix recording of guest steal time / preempted status
selftest: KVM: Add intra host migration tests
selftest: KVM: Add open sev dev helper
...
Commit a985442fde ("selftests: net: properly support IPv6 in GSO GRE test")
is not compatible with:
Ncat: Version 7.80 ( https://nmap.org/ncat )
(which is distributed with Fedora/Red Hat), tests fail with:
nc: invalid option -- 'N'
Let's switch to socat which is far more dependable.
Fixes: 025efa0a82 ("selftests: add simple GSO GRE test")
Fixes: a985442fde ("selftests: net: properly support IPv6 in GSO GRE test")
Tested-by: Andrea Righi <andrea.righi@canonical.com>
Link: https://lore.kernel.org/r/20211111162929.530470-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit be79505caf ("tools/runqslower: Install libbpf headers when
building") uses the target libbpf to build the host bpftool, which
doesn't work when cross-building:
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -C tools/bpf/runqslower O=/tmp/runqslower
...
LINK /tmp/runqslower/bpftool/bpftool
/usr/bin/ld: /tmp/runqslower/libbpf/libbpf.a(libbpf-in.o): Relocations in generic ELF (EM: 183)
/usr/bin/ld: /tmp/runqslower/libbpf/libbpf.a: error adding symbols: file in wrong format
collect2: error: ld returned 1 exit status
When cross-building, the target architecture differs from the host. The
bpftool used for building runqslower is executed on the host, and thus
must use a different libbpf than that used for runqslower itself.
Remove the LIBBPF_OUTPUT and LIBBPF_DESTDIR parameters, so the bpftool
build makes its own library if necessary.
In the selftests, pass the host bpftool, already a prerequisite for the
runqslower recipe, as BPFTOOL_OUTPUT. The runqslower Makefile will use
the bpftool that's already built for selftests instead of making a new
one.
Fixes: be79505caf ("tools/runqslower: Install libbpf headers when building")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20211112155128.565680-1-jean-philippe@linaro.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Ensure that two registers with a map_value loaded from a nested
map are considered equivalent for the purpose of state pruning
and don't cause the verifier to revisit a pruning point.
This uses a rather crude match on the number of insns visited by
the verifier, which might change in the future. I've therefore
tried to keep the code as "unpruneable" as possible by having
the code paths only converge on the second to last instruction.
Should you require to adjust the test in the future, reducing the
number of processed instructions should always be safe. Increasing
them could cause another regression, so proceed with caution.
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/CACAyw99hVEJFoiBH_ZGyy=+oO-jyydoz6v1DeKPKs2HVsUH28w@mail.gmail.com
Link: https://lore.kernel.org/bpf/20211111161452.86864-1-lmb@cloudflare.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When using clang to build selftests with LLVM=1 in make commandline,
I hit the following compiler warning:
benchs/bench_bloom_filter_map.c:84:46: warning: result of comparison of constant 256
with expression of type '__u8' (aka 'unsigned char') is always false
[-Wtautological-constant-out-of-range-compare]
if (args.value_size < 2 || args.value_size > 256) {
~~~~~~~~~~~~~~~ ^ ~~~
The reason is arg.vaue_size has type __u8, so comparison "args.value_size > 256"
is always false.
This patch fixed the issue by doing proper comparison before assigning the
value to args.value_size. The patch also fixed the same issue in two
other places.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211112204838.3579953-1-yhs@fb.com
When using clang to build selftests with LLVM=1 in make commandline,
I hit the following compiler warning:
xdpxceiver.c:747:6: warning: variable 'total' set but not used [-Wunused-but-set-variable]
u32 total = 0;
^
This patch fixed the issue by removing that declaration and its
assocatied unused operation.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211112204833.3579457-1-yhs@fb.com
btf_tag selftest needs certain llvm versions (>= llvm14).
Make it clear in the selftests README.rst file.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012651.1508549-1-yhs@fb.com
The following is the main btf_type_tag usage in the
C test:
#define __tag1 __attribute__((btf_type_tag("tag1")))
#define __tag2 __attribute__((btf_type_tag("tag2")))
struct btf_type_tag_test {
int __tag1 * __tag1 __tag2 *p;
} g;
The bpftool raw dump with related types:
[4] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[11] STRUCT 'btf_type_tag_test' size=8 vlen=1
'p' type_id=14 bits_offset=0
[12] TYPE_TAG 'tag1' type_id=16
[13] TYPE_TAG 'tag2' type_id=12
[14] PTR '(anon)' type_id=13
[15] TYPE_TAG 'tag1' type_id=4
[16] PTR '(anon)' type_id=15
[17] VAR 'g' type_id=11, linkage=global
With format C dump, we have
struct btf_type_tag_test {
int __attribute__((btf_type_tag("tag1"))) * __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag2"))) *p;
};
The result C code is identical to the original definition except macro's are gone.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012646.1508231-1-yhs@fb.com
Rename progs/tag.c to progs/btf_decl_tag.c so we can introduce
progs/btf_type_tag.c in the next patch.
Also create a subtest for btf_decl_tag in prog_tests/btf_tag.c
so we can introduce btf_type_tag subtest in the next patch.
I also took opportunity to remove the check whether __has_attribute
is defined or not in progs/btf_decl_tag.c since all recent
clangs should already support this macro.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012641.1507144-1-yhs@fb.com
btf__dedup() and struct btf_dedup_opts were added before we figured out
OPTS mechanism. As such, btf_dedup_opts is non-extensible without
breaking an ABI and potentially crashing user application.
Unfortunately, btf__dedup() and btf_dedup_opts are short and succinct
names that would be great to preserve and use going forward. So we use
___libbpf_override() macro approach, used previously for bpf_prog_load()
API, to define a new btf__dedup() variant that accepts only struct btf *
and struct btf_dedup_opts * arguments, and rename the old btf__dedup()
implementation into btf__dedup_deprecated(). This keeps both source and
binary compatibility with old and new applications.
The biggest problem was struct btf_dedup_opts, which wasn't OPTS-based,
and as such doesn't have `size_t sz;` as a first field. But btf__dedup()
is a pretty rarely used API and I believe that the only currently known
users (besides selftests) are libbpf's own bpf_linker and pahole.
Neither use case actually uses options and just passes NULL. So instead
of doing extra hacks, just rewrite struct btf_dedup_opts into OPTS-based
one, move btf_ext argument into those opts (only bpf_linker needs to
dedup btf_ext, so it's not a typical thing to specify), and drop never
used `dont_resolve_fwds` option (it was never used anywhere, AFAIK, it
makes BTF dedup much less useful and efficient).
Just in case, for old implementation, btf__dedup_deprecated(), detect
non-NULL options and error out with helpful message, to help users
migrate, if there are any user playing with btf__dedup().
The last remaining piece is dedup_table_size, which is another
anachronism from very early days of BTF dedup. Since then it has been
reduced to the only valid value, 1, to request forced hash collisions.
This is only used during testing. So instead introduce a bool flag to
force collisions explicitly.
This patch also adapts selftests to new btf__dedup() and btf_dedup_opts
use to avoid selftests breakage.
[0] Closes: https://github.com/libbpf/libbpf/issues/281
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-4-andrii@kernel.org
Few clean ups and single-line simplifications. Also split CLEAN command
into multiple $(RM) invocations as it gets dangerously close to too long
argument list. Make sure that -o <output.o> is used always as the last
argument for saner verbose make output.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-3-andrii@kernel.org
After recent refactoring bpf_prog_test_load(), used across multiple
selftests, lost ability to specify extra log_level 1 or 2 (for -vv and
-vvv, respectively). Fix that problem by using bpf_object__load_xattr()
API that supports extra log_level flags. Also restore
BPF_F_TEST_RND_HI32 prog_flags by utilizing new bpf_program__set_extra_flags()
API.
Fixes: f87c1930ac ("selftests/bpf: Merge test_stub.c into testing_helpers.c")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111051758.92283-3-andrii@kernel.org
and netfilter.
Current release - regressions:
- bpf: do not reject when the stack read size is different
from the tracked scalar size
- net: fix premature exit from NAPI state polling in napi_disable()
- riscv, bpf: fix RV32 broken build, and silence RV64 warning
Current release - new code bugs:
- net: fix possible NULL deref in sock_reserve_memory
- amt: fix error return code in amt_init(); fix stopping the workqueue
- ax88796c: use the correct ioctl callback
Previous releases - always broken:
- bpf: stop caching subprog index in the bpf_pseudo_func insn
- security: fixups for the security hooks in sctp
- nfc: add necessary privilege flags in netlink layer, limit operations
to admin only
- vsock: prevent unnecessary refcnt inc for non-blocking connect
- net/smc: fix sk_refcnt underflow on link down and fallback
- nfnetlink_queue: fix OOB when mac header was cleared
- can: j1939: ignore invalid messages per standard
- bpf, sockmap:
- fix race in ingress receive verdict with redirect to self
- fix incorrect sk_skb data_end access when src_reg = dst_reg
- strparser, and tls are reusing qdisc_skb_cb and colliding
- ethtool: fix ethtool msg len calculation for pause stats
- vlan: fix a UAF in vlan_dev_real_dev() when ref-holder tries
to access an unregistering real_dev
- udp6: make encap_rcv() bump the v6 not v4 stats
- drv: prestera: add explicit padding to fix m68k build
- drv: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge
- drv: mvpp2: fix wrong SerDes reconfiguration order
Misc & small latecomers:
- ipvs: auto-load ipvs on genl access
- mctp: sanity check the struct sockaddr_mctp padding fields
- libfs: support RENAME_EXCHANGE in simple_rename()
- avoid double accounting for pure zerocopy skbs
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGNQdwACgkQMUZtbf5S
IrsiMQ//f66lTJ8PJ5Qj70hX9dC897olx7uGHB9eiKoyOcJI459hFlfXwRU2T4Tf
fPNwPNUQ9Mynw9tX/jWEi+7zd6r6TSHGXK49U9/rIbQ95QjKY4LHowIE63x+vPl2
5Cpf+80zXC3DUX1fijgyG1ujnU3kBaqopTxDLmlsHw2PGkwT5Ox1DUwkhc370eEL
xlpq3PYGWA8/AQNyhSVBkG/UmoLaq0jYNP5yVcOj4jGjgcgLe1SLrqczENr35QHZ
cRkuBsFBMBZF7wSX2f9qQIB/+b1pcLlD9IO+K3S7Ruq+rUd7qfL/tmwNxEh0axYK
AyIun1Bxcy7QJGjtpGAz+Ku7jS9T3HxzyxhqilQo3co8jAW0WJ1YwHl+XPgQXyjV
DLG6Vxt4syiwsoSXGn8MQugs4nlBT+0qWl8YamIR+o7KkAYPc2QWkXlzEDfNeIW8
JNCZA3sy7VGi1ytorZGx16sQsEWnyRG9a6/WV20Dr+HVs1SKPcFzIfG6mVngR07T
mQMHnbAF6Z5d8VTcPQfMxd7UH48s1bHtk5lcSTa3j0Cw+GkA6ytTmjPdJ1qRcdkH
dl9jAfADe4O6frG+9XH7FEFqhmkghVI7bOCA4ZOhClVaIcDGgEZc2y7sY9/oZ7P4
KXBD2R5X1caCUM0UtzwL7/8ddOtPtHIrFnhY+7+I6ijt9qmI0BY=
=Ttgq
-----END PGP SIGNATURE-----
Merge tag 'net-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf, can and netfilter.
Current release - regressions:
- bpf: do not reject when the stack read size is different from the
tracked scalar size
- net: fix premature exit from NAPI state polling in napi_disable()
- riscv, bpf: fix RV32 broken build, and silence RV64 warning
Current release - new code bugs:
- net: fix possible NULL deref in sock_reserve_memory
- amt: fix error return code in amt_init(); fix stopping the
workqueue
- ax88796c: use the correct ioctl callback
Previous releases - always broken:
- bpf: stop caching subprog index in the bpf_pseudo_func insn
- security: fixups for the security hooks in sctp
- nfc: add necessary privilege flags in netlink layer, limit
operations to admin only
- vsock: prevent unnecessary refcnt inc for non-blocking connect
- net/smc: fix sk_refcnt underflow on link down and fallback
- nfnetlink_queue: fix OOB when mac header was cleared
- can: j1939: ignore invalid messages per standard
- bpf, sockmap:
- fix race in ingress receive verdict with redirect to self
- fix incorrect sk_skb data_end access when src_reg = dst_reg
- strparser, and tls are reusing qdisc_skb_cb and colliding
- ethtool: fix ethtool msg len calculation for pause stats
- vlan: fix a UAF in vlan_dev_real_dev() when ref-holder tries to
access an unregistering real_dev
- udp6: make encap_rcv() bump the v6 not v4 stats
- drv: prestera: add explicit padding to fix m68k build
- drv: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge
- drv: mvpp2: fix wrong SerDes reconfiguration order
Misc & small latecomers:
- ipvs: auto-load ipvs on genl access
- mctp: sanity check the struct sockaddr_mctp padding fields
- libfs: support RENAME_EXCHANGE in simple_rename()
- avoid double accounting for pure zerocopy skbs"
* tag 'net-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (123 commits)
selftests/net: udpgso_bench_rx: fix port argument
net: wwan: iosm: fix compilation warning
cxgb4: fix eeprom len when diagnostics not implemented
net: fix premature exit from NAPI state polling in napi_disable()
net/smc: fix sk_refcnt underflow on linkdown and fallback
net/mlx5: Lag, fix a potential Oops with mlx5_lag_create_definer()
gve: fix unmatched u64_stats_update_end()
net: ethernet: lantiq_etop: Fix compilation error
selftests: forwarding: Fix packet matching in mirroring selftests
vsock: prevent unnecessary refcnt inc for nonblocking connect
net: marvell: mvpp2: Fix wrong SerDes reconfiguration order
net: ethernet: ti: cpsw_ale: Fix access to un-initialized memory
net: stmmac: allow a tc-taprio base-time of zero
selftests: net: test_vxlan_under_vrf: fix HV connectivity test
net: hns3: allow configure ETS bandwidth of all TCs
net: hns3: remove check VF uc mac exist when set by PF
net: hns3: fix some mac statistics is always 0 in device version V2
net: hns3: fix kernel crash when unload VF while it is being reset
net: hns3: sync rx ring head in echo common pull
net: hns3: fix pfc packet number incorrect after querying pfc parameters
...
Add support for AMD SEV and SEV-ES intra-host migration support. Intra
host migration provides a low-cost mechanism for userspace VMM upgrades.
In the common case for intra host migration, we can rely on the normal
ioctls for passing data from one VMM to the next. SEV, SEV-ES, and other
confidential compute environments make most of this information opaque, and
render KVM ioctls such as "KVM_GET_REGS" irrelevant. As a result, we need
the ability to pass this opaque metadata from one VMM to the next. The
easiest way to do this is to leave this data in the kernel, and transfer
ownership of the metadata from one KVM VM (or vCPU) to the next. In-kernel
hand off makes it possible to move any data that would be
unsafe/impossible for the kernel to hand directly to userspace, and
cannot be reproduced using data that can be handed to userspace.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Adds testcases for intra host migration for SEV and SEV-ES. Also adds
locking test to confirm no deadlock exists.
Signed-off-by: Peter Gonda <pgonda@google.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Cc: Marc Orr <marcorr@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Message-Id: <20211021174303.385706-6-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Refactors out open path support from open_kvm_dev_path_or_exit() and
adds new helper for SEV device path.
Signed-off-by: Peter Gonda <pgonda@google.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Cc: Marc Orr <marcorr@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Message-Id: <20211021174303.385706-5-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The below commit added optional support for passing a bind address.
It configures the sockaddr bind arguments before parsing options and
reconfigures on options -b and -4.
This broke support for passing port (-p) on its own.
Configure sockaddr after parsing all arguments.
Fixes: 3327a9c463 ("selftests: add functionals test for UDP GRO")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A new field was added to the bpf_sk_lookup data that users can access.
Add tests that validate that the new ingress_ifindex field contains the
right data.
Signed-off-by: Mark Pashmfouroush <markpash@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211110111016.5670-3-markpash@cloudflare.com
In commit 6de6e46d27 ("cls_flower: Fix inability to match GRE/IPIP
packets"), cls_flower was fixed to match an outer packet of a tunneled
packet as would be expected, rather than dissecting to the inner packet and
matching on that.
This fix uncovered several issues in packet matching in mirroring
selftests:
- in mirror_gre_bridge_1d_vlan.sh and mirror_gre_vlan_bridge_1q.sh, the
vlan_ethtype match is copied around as "ip", even as some of the tests
are running over ip6gretap. This is fixed by using an "ipv6" for
vlan_ethtype in the ip6gretap tests.
- in mirror_gre_changes.sh, a filter to count GRE packets is set up to
match TTL of 50. This used to trigger in the offloaded datapath, where
the envelope TTL was matched, but not in the software datapath, which
considered TTL of the inner packet. Now that both match consistently, all
the packets were double-counted. This is fixed by marking the filter as
skip_hw, leaving only the SW datapath component active.
Fixes: 6de6e46d27 ("cls_flower: Fix inability to match GRE/IPIP packets")
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It looks like test_vxlan_under_vrf.sh is always failing to verify the
connectivity test during the ping between the two simulated VMs.
This is due to the fact that veth-hv in each VM should have a distinct
MAC address.
Fix by setting a unique MAC address on each simulated VM interface.
Without this fix:
$ sudo ./tools/testing/selftests/net/test_vxlan_under_vrf.sh
Checking HV connectivity [ OK ]
Check VM connectivity through VXLAN (underlay in the default VRF) [FAIL]
With this fix applied:
$ sudo ./tools/testing/selftests/net/test_vxlan_under_vrf.sh
Checking HV connectivity [ OK ]
Check VM connectivity through VXLAN (underlay in the default VRF) [ OK ]
Check VM connectivity through VXLAN (underlay in a VRF) [FAIL]
NOTE: the connectivity test with the underlay VRF is still failing; it
seems that ARP requests are blocked at the simulated hypervisor level,
probably due to some missing ARP forwarding rules. This requires more
investigation (in the meantime we may consider to set that test as
expected failure - XFAIL).
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
pull-request: bpf 2021-11-09
We've added 7 non-merge commits during the last 3 day(s) which contain
a total of 10 files changed, 174 insertions(+), 48 deletions(-).
The main changes are:
1) Various sockmap fixes, from John and Jussi.
2) Fix out-of-bound issue with bpf_pseudo_func, from Martin.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf, sockmap: sk_skb data_end access incorrect when src_reg = dst_reg
bpf: sockmap, strparser, and tls are reusing qdisc_skb_cb and colliding
bpf, sockmap: Fix race in ingress receive verdict with redirect to self
bpf, sockmap: Remove unhash handler for BPF sockmap usage
bpf, sockmap: Use stricter sk state checks in sk_lookup_assign
bpf: selftest: Trigger a DCE on the whole subprog
bpf: Stop caching subprog index in the bpf_pseudo_func insn
====================
Link: https://lore.kernel.org/r/20211109215702.38350-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Merge more updates from Andrew Morton:
"87 patches.
Subsystems affected by this patch series: mm (pagecache and hugetlb),
procfs, misc, MAINTAINERS, lib, checkpatch, binfmt, kallsyms, ramfs,
init, codafs, nilfs2, hfs, crash_dump, signals, seq_file, fork,
sysvfs, kcov, gdb, resource, selftests, and ipc"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (87 commits)
ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL
ipc: check checkpoint_restore_ns_capable() to modify C/R proc files
selftests/kselftest/runner/run_one(): allow running non-executable files
virtio-mem: disallow mapping virtio-mem memory via /dev/mem
kernel/resource: disallow access to exclusive system RAM regions
kernel/resource: clean up and optimize iomem_is_exclusive()
scripts/gdb: handle split debug for vmlinux
kcov: replace local_irq_save() with a local_lock_t
kcov: avoid enable+disable interrupts if !in_task()
kcov: allocate per-CPU memory on the relevant node
Documentation/kcov: define `ip' in the example
Documentation/kcov: include types.h in the example
sysv: use BUILD_BUG_ON instead of runtime check
kernel/fork.c: unshare(): use swap() to make code cleaner
seq_file: fix passing wrong private data
seq_file: move seq_escape() to a header
signal: remove duplicate include in signal.h
crash_dump: remove duplicate include in crash_dump.h
crash_dump: fix boolreturn.cocci warning
hfs/hfsplus: use WARN_ON for sanity check
...
When running a test program, 'run_one()' checks if the program has the
execution permission and fails if it doesn't. However, it's easy to
mistakenly lose the permissions, as some common tools like 'diff' don't
support the permission change well[1]. Compared to that, making mistakes
in the test program's path would only rare, as those are explicitly listed
in 'TEST_PROGS'. Therefore, it might make more sense to resolve the
situation on our own and run the program.
For this reason, this commit makes the test program runner function still
print the warning message but to try parsing the interpreter of the
program and to explicitly run it with the interpreter, in this case.
[1] https://lore.kernel.org/mm-commits/YRJisBs9AunccCD4@kroah.com/
Link: https://lkml.kernel.org/r/20210810164534.25902-1-sj38.park@gmail.com
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Exception handling is triggered in BPF tracing programs when a NULL pointer
is dereferenced; the exception handler zeroes the target register and
execution of the BPF program progresses.
To test exception handling then, we need to trigger a NULL pointer dereference
for a field which should never be zero; if it is, the only explanation is the
exception handler ran. task->task_works is the NULL pointer chosen (for a new
task from fork() no work is associated), and the task_works->func field should
not be zero if task_works is non-NULL. The test verifies that task_works and
task_works->func are 0.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1636131046-5982-3-git-send-email-alan.maguire@oracle.com