Parsing of USDT arguments is architecture-specific. On aarch64 it is
relatively easy since registers used are x[0-31] and sp. Format is
slightly different compared to x86_64. Possible forms are:
- "size@[reg[,offset]]" for dereferences, e.g. "-8@[sp,76]" and "-4@[sp]";
- "size@reg" for register values, e.g. "-4@x0";
- "size@value" for raw values, e.g. "-8@1".
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649690496-1902-2-git-send-email-alan.maguire@oracle.com
The seq argument is assigned to meta.seq twice, the second one is
redundant, remove it.
This patch also removes a redundant space in bpf_iter_link_attach().
Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220410060020.307283-1-ytcoode@gmail.com
This patch implement more BPF atomic operations for RV64. The newly
added operations are shown below:
atomic[64]_[fetch_]add
atomic[64]_[fetch_]and
atomic[64]_[fetch_]or
atomic[64]_xchg
atomic[64]_cmpxchg
Since riscv specification does not provide AMO instruction for CAS
operation, we use lr/sc instruction for cmpxchg operation, and AMO
instructions for the rest ops.
Tests "test_bpf.ko" and "test_progs -t atomic" have passed, as well
as "test_verifier" with no new failure cases.
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/bpf/20220410101246.232875-1-pulehui@huawei.com
Yafang Shao says:
====================
We have switched to memcg-based memory accouting and thus the rlimit is
not needed any more. LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK was introduced in
libbpf for backward compatibility, so we can use it instead now.
This patchset cleanups the usage of RLIMIT_MEMLOCK in tools/bpf/,
tools/testing/selftests/bpf and samples/bpf. The file
tools/testing/selftests/bpf/bpf_rlimit.h is removed. The included header
sys/resource.h is removed from many files as it is useless in these files.
- v4: Squash patches and use customary subject prefixes. (Andrii)
- v3: Get rid of bpf_rlimit.h and fix some typos (Andrii)
- v2: Use libbpf_set_strict_mode instead. (Andrii)
- v1: https://lore.kernel.org/bpf/20220320060815.7716-2-laoar.shao@gmail.com/
====================
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
We have switched to memcg-based memory accouting and thus the rlimit is
not needed any more. LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK was introduced in
libbpf for backward compatibility, so we can use it instead now.
libbpf_set_strict_mode always return 0, so we don't need to check whether
the return value is 0 or not.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409125958.92629-4-laoar.shao@gmail.com
We have switched to memcg-based memory accouting and thus the rlimit is
not needed any more. LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK was introduced in
libbpf for backward compatibility, so we can use it instead now. After
this change, the header tools/testing/selftests/bpf/bpf_rlimit.h can be
removed.
This patch also removes the useless header sys/resource.h from many files
in tools/testing/selftests/bpf/.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409125958.92629-3-laoar.shao@gmail.com
We have switched to memcg-based memory accouting and thus the rlimit is
not needed any more. LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK was introduced in
libbpf for backward compatibility, so we can use it instead now.
This patch also removes the useless header sys/resource.h from many files
in samples/bpf.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409125958.92629-2-laoar.shao@gmail.com
Background:
Libbpf automatically replaces calls to BPF bpf_probe_read_{kernel,user}
[_str]() helpers with bpf_probe_read[_str](), if libbpf detects that
kernel doesn't support new APIs. Specifically, libbpf invokes the
probe_kern_probe_read_kernel function to load a small eBPF program into
the kernel in which bpf_probe_read_kernel API is invoked and lets the
kernel checks whether the new API is valid. If the loading fails, libbpf
considers the new API invalid and replaces it with the old API.
static int probe_kern_probe_read_kernel(void)
{
struct bpf_insn insns[] = {
BPF_MOV64_REG(BPF_REG_1, BPF_REG_10), /* r1 = r10 (fp) */
BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -8), /* r1 += -8 */
BPF_MOV64_IMM(BPF_REG_2, 8), /* r2 = 8 */
BPF_MOV64_IMM(BPF_REG_3, 0), /* r3 = 0 */
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_probe_read_kernel),
BPF_EXIT_INSN(),
};
int fd, insn_cnt = ARRAY_SIZE(insns);
fd = bpf_prog_load(BPF_PROG_TYPE_KPROBE, NULL,
"GPL", insns, insn_cnt, NULL);
return probe_fd(fd);
}
Bug:
On older kernel versions [0], the kernel checks whether the version
number provided in the bpf syscall, matches the LINUX_VERSION_CODE.
If not matched, the bpf syscall fails. eBPF However, the
probe_kern_probe_read_kernel code does not set the kernel version
number provided to the bpf syscall, which causes the loading process
alwasys fails for old versions. It means that libbpf will replace the
new API with the old one even the kernel supports the new one.
Solution:
After a discussion in [1], the solution is using BPF_PROG_TYPE_TRACEPOINT
program type instead of BPF_PROG_TYPE_KPROBE because kernel does not
enfoce version check for tracepoint programs. I test the patch in old
kernels (4.18 and 4.19) and it works well.
[0] https://elixir.bootlin.com/linux/v4.19/source/kernel/bpf/syscall.c#L1360
[1] Closes: https://github.com/libbpf/libbpf/issues/473
Signed-off-by: Runqing Yang <rainkin1993@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409144928.27499-1-rainkin1993@gmail.com
Improve subtest selection logic when using -t/-a/-d parameters.
In particular, more than one subtest can be specified or a
combination of tests / subtests.
-a send_signal -d send_signal/send_signal_nmi* - runs send_signal
test without nmi tests
-a send_signal/send_signal_nmi*,find_vma - runs two send_signal
subtests and find_vma test
-a 'send_signal*' -a find_vma -d send_signal/send_signal_nmi* -
runs 2 send_signal test and find_vma test. Disables two send_signal
nmi subtests
-t send_signal -t find_vma - runs two *send_signal* tests and one
*find_vma* test
This will allow us to have granular control over which subtests
to disable in the CI system instead of disabling whole tests.
Also, add new selftest to avoid possible regression when
changing prog_test test name selection logic.
Signed-off-by: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409001750.529930-1-mykolal@fb.com
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-04-09
We've added 63 non-merge commits during the last 9 day(s) which contain
a total of 68 files changed, 4852 insertions(+), 619 deletions(-).
The main changes are:
1) Add libbpf support for USDT (User Statically-Defined Tracing) probes.
USDTs are an abstraction built on top of uprobes, critical for tracing
and BPF, and widely used in production applications, from Andrii Nakryiko.
2) While Andrii was adding support for x86{-64}-specific logic of parsing
USDT argument specification, Ilya followed-up with USDT support for s390
architecture, from Ilya Leoshkevich.
3) Support name-based attaching for uprobe BPF programs in libbpf. The format
supported is `u[ret]probe/binary_path:[raw_offset|function[+offset]]`, e.g.
attaching to libc malloc can be done in BPF via SEC("uprobe/libc.so.6:malloc")
now, from Alan Maguire.
4) Various load/store optimizations for the arm64 JIT to shrink the image
size by using arm64 str/ldr immediate instructions. Also enable pointer
authentication to verify return address for JITed code, from Xu Kuohai.
5) BPF verifier fixes for write access checks to helper functions, e.g.
rd-only memory from bpf_*_cpu_ptr() must not be passed to helpers that
write into passed buffers, from Kumar Kartikeya Dwivedi.
6) Fix overly excessive stack map allocation for its base map structure and
buckets which slipped-in from cleanups during the rlimit accounting removal
back then, from Yuntao Wang.
7) Extend the unstable CT lookup helpers for XDP and tc/BPF to report netfilter
connection tracking tuple direction, from Lorenzo Bianconi.
8) Improve bpftool dump to show BPF program/link type names, Milan Landaverde.
9) Minor cleanups all over the place from various others.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (63 commits)
bpf: Fix excessive memory allocation in stack_map_alloc()
selftests/bpf: Fix return value checks in perf_event_stackmap test
selftests/bpf: Add CO-RE relos into linked_funcs selftests
libbpf: Use weak hidden modifier for USDT BPF-side API functions
libbpf: Don't error out on CO-RE relos for overriden weak subprogs
samples, bpf: Move routes monitor in xdp_router_ipv4 in a dedicated thread
libbpf: Allow WEAK and GLOBAL bindings during BTF fixup
libbpf: Use strlcpy() in path resolution fallback logic
libbpf: Add s390-specific USDT arg spec parsing logic
libbpf: Make BPF-side of USDT support work on big-endian machines
libbpf: Minor style improvements in USDT code
libbpf: Fix use #ifdef instead of #if to avoid compiler warning
libbpf: Potential NULL dereference in usdt_manager_attach_usdt()
selftests/bpf: Uprobe tests should verify param/return values
libbpf: Improve string parsing for uprobe auto-attach
libbpf: Improve library identification for uprobe binary path resolution
selftests/bpf: Test for writes to map key from BPF helpers
selftests/bpf: Test passing rdonly mem to global func
bpf: Reject writes for PTR_TO_MAP_KEY in check_helper_mem_access
bpf: Check PTR_TO_MEM | MEM_RDONLY in check_helper_mem_access
...
====================
Link: https://lore.kernel.org/r/20220408231741.19116-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The 'n_buckets * (value_size + sizeof(struct stack_map_bucket))' part of the
allocated memory for 'smap' is never used after the memlock accounting was
removed, thus get rid of it.
[ Note, Daniel:
Commit b936ca643a ("bpf: rework memlock-based memory accounting for maps")
moved `cost += n_buckets * (value_size + sizeof(struct stack_map_bucket))`
up and therefore before the bpf_map_area_alloc() allocation, sigh. In a later
step commit c85d69135a ("bpf: move memory size checks to bpf_map_charge_init()"),
and the overflow checks of `cost >= U32_MAX - PAGE_SIZE` moved into
bpf_map_charge_init(). And then 370868107b ("bpf: Eliminate rlimit-based
memory accounting for stackmap maps") finally removed the bpf_map_charge_init().
Anyway, the original code did the allocation same way as /after/ this fix. ]
Fixes: b936ca643a ("bpf: rework memlock-based memory accounting for maps")
Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220407130423.798386-1-ytcoode@gmail.com
The 8000 series and newer NICs all get hardware timestamps from the MAC
and can provide timestamps on a normal TX queue, rather than via a slow
path through the MC. As such we can use this path for any packet where a
hardware timestamp is requested.
This also enables support for PTP over transports other than IPv4+UDP.
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: Edward Cree <ecree@xilinx.com>
Link: https://lore.kernel.org/r/510652dc-54b4-0e11-657e-e37ee3ca26a9@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add cable test support for Micrel KSZ9x31 PHYs.
Tested on i.MX8M Mini with KSZ9131RNX in 100/Full mode with pairs shuffled
before magnetics:
(note: Cable test started/completed messages are omitted)
mx8mm-ksz9131-a-d-connected$ ethtool --cable-test eth0
Pair A code OK
Pair B code Short within Pair
Pair B, fault length: 0.80m
Pair C code Short within Pair
Pair C, fault length: 0.80m
Pair D code OK
mx8mm-ksz9131-a-b-connected$ ethtool --cable-test eth0
Pair A code OK
Pair B code OK
Pair C code Short within Pair
Pair C, fault length: 0.00m
Pair D code Short within Pair
Pair D, fault length: 0.00m
Tested on R8A77951 Salvator-XS with KSZ9031RNX and all four pairs connected:
(note: Cable test started/completed messages are omitted)
r8a7795-ksz9031-all-connected$ ethtool --cable-test eth0
Pair A code OK
Pair B code OK
Pair C code OK
Pair D code OK
The CTRL1000 CTL1000_ENABLE_MASTER and CTL1000_AS_MASTER bits are not
restored by calling phy_init_hw(), they must be manually cached in
ksz9x31_cable_test_start() and restored at the end of
ksz9x31_cable_test_get_status().
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Oleksij Rempel <linux@rempel-privat.de>
Cc: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20220407105534.85833-1-marex@denx.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The bpf_get_stackid() function may also return 0 on success as per UAPI BPF
helper documentation. Therefore, correct checks from 'val > 0' to 'val >= 0'
to ensure that they cover all possible success return values.
Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408041452.933944-1-ytcoode@gmail.com
Add CO-RE relocations into __weak subprogs for multi-file linked_funcs
selftest to make sure libbpf handles such combination well.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-4-andrii@kernel.org
Use __weak __hidden for bpf_usdt_xxx() APIs instead of much more
confusing `static inline __noinline`. This was previously impossible due
to libbpf erroring out on CO-RE relocations pointing to eliminated weak
subprogs. Now that previous patch fixed this issue, switch back to
__weak __hidden as it's a more direct way of specifying the desired
behavior.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-3-andrii@kernel.org
During BPF static linking, all the ELF relocations and .BTF.ext
information (including CO-RE relocations) are preserved for __weak
subprograms that were logically overriden by either previous weak
subprogram instance or by corresponding "strong" (non-weak) subprogram.
This is just how native user-space linkers work, nothing new.
But libbpf is over-zealous when processing CO-RE relocation to error out
when CO-RE relocation belonging to such eliminated weak subprogram is
encountered. Instead of erroring out on this expected situation, log
debug-level message and skip the relocation.
Fixes: db2b8b0642 ("libbpf: Support CO-RE relocations for multi-prog sections")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-2-andrii@kernel.org
During BTF fix up for global variables, global variable can be global
weak and will have STB_WEAK binding in ELF. Support such global
variables in addition to non-weak ones.
This is not the problem when using BPF static linking, as BPF static
linker "fixes up" BTF during generation so that libbpf doesn't have to
do it anymore during bpf_object__open(), which led to this not being
noticed for a while, along with a pretty rare (currently) use of __weak
variables and maps.
Reported-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220407230446.3980075-2-andrii@kernel.org
Coverity static analyzer complains that strcpy() can cause buffer
overflow. Use libbpf_strlcpy() instead to be 100% sure this doesn't
happen.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220407230446.3980075-1-andrii@kernel.org
Ilya Leoshkevich says:
====================
This series adds USDT support for s390, making the "usdt" test pass
there. Patch 1 is a collection of minor cleanups, patch 2 adds
BPF-side support, patch 3 adds userspace-side support.
====================
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
The logic is superficially similar to that of x86, but the small
differences (no need for register table and dynamic allocation of
register names, no $ sign before constants) make maintaining a common
implementation too burdensome. Therefore simply add a s390x-specific
version of parse_usdt_arg().
Note that while bcc supports index registers, this patch does not. This
should not be a problem in most cases, since s390 uses a default value
"nor" for STAP_SDT_ARG_CONSTRAINT.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220407214411.257260-4-iii@linux.ibm.com
Ido Schimmel says:
====================
net/sched: Better error reporting for offload failures
This patchset improves error reporting to user space when offload fails
during the flow action setup phase. That is, when failures occur in the
actions themselves, even before calling device drivers. Requested /
reported in [1].
This is done by passing extack to the offload_act_setup() callback and
making use of it in the various actions.
Patches #1-#2 change matchall and flower to log error messages to user
space in accordance with the verbose flag.
Patch #3 passes extack to the offload_act_setup() callback from the
various call sites, including matchall and flower.
Patches #4-#11 make use of extack in the various actions to report
offload failures.
Patch #12 adds an error message when the action does not support offload
at all.
Patches #13-#14 change matchall and flower to stop overwriting more
specific error messages.
[1] https://lore.kernel.org/netdev/20220317185249.5mff5u2x624pjewv@skbuf/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The various error paths of tc_setup_offload_action() now report specific
error messages. Remove the generic messages to avoid overwriting the
more specific ones.
Before:
# tc filter add dev dummy0 ingress pref 1 proto ip flower skip_sw dst_ip 198.51.100.1 action police rate 100Mbit burst 10000
Error: cls_flower: Failed to setup flow action.
We have an error talking to the kernel
After:
# tc filter add dev dummy0 ingress pref 1 proto ip flower skip_sw dst_ip 198.51.100.1 action police rate 100Mbit burst 10000
Error: act_police: Offload not supported when conform/exceed action is "reclassify".
We have an error talking to the kernel
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The various error paths of tc_setup_offload_action() now report specific
error messages. Remove the generic messages to avoid overwriting the
more specific ones.
Before:
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action police rate 100Mbit burst 10000
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
After:
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action police rate 100Mbit burst 10000
Error: act_police: Offload not supported when conform/exceed action is "reclassify".
We have an error talking to the kernel
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add an extack message when the
requested action does not support offload.
Example:
# echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action nat ingress 192.0.2.1 198.51.100.1
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-181 [000] b..1. 88.406093: netlink_extack: msg=Action does not support offload
tc-181 [000] ..... 88.406108: netlink_extack: msg=cls_matchall: Failed to setup flow action
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add an extack message when
vlan action offload fails.
Currently, the failure cannot be triggered, but add a message in case
the action is extended in the future to support more than the current
set of modes.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add an extack message when
tunnel_key action offload fails.
Currently, the failure cannot be triggered, but add a message in case
the action is extended in the future to support more than set/release
modes.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add extack messages when
skbedit action offload fails.
Example:
# echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action skbedit queue_mapping 1234
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-185 [002] b..1. 31.802414: netlink_extack: msg=act_skbedit: Offload not supported when "queue_mapping" option is used
tc-185 [002] ..... 31.802418: netlink_extack: msg=cls_matchall: Failed to setup flow action
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action skbedit inheritdsfield
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-187 [002] b..1. 45.985145: netlink_extack: msg=act_skbedit: Offload not supported when "inheritdsfield" option is used
tc-187 [002] ..... 45.985160: netlink_extack: msg=cls_matchall: Failed to setup flow action
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add extack messages when
police action offload fails.
Example:
# echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action police rate 100Mbit burst 10000
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-182 [000] b..1. 21.592969: netlink_extack: msg=act_police: Offload not supported when conform/exceed action is "reclassify"
tc-182 [000] ..... 21.592982: netlink_extack: msg=cls_matchall: Failed to setup flow action
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action police rate 100Mbit burst 10000 conform-exceed drop/continue
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-184 [000] b..1. 38.882579: netlink_extack: msg=act_police: Offload not supported when conform/exceed action is "continue"
tc-184 [000] ..... 38.882593: netlink_extack: msg=cls_matchall: Failed to setup flow action
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add an extack message when
pedit action offload fails.
Currently, the failure cannot be triggered, but add a message in case
the action is extended in the future to support more than set/add
commands.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add extack messages when mpls
action offload fails.
Example:
# echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action mpls dec_ttl
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-182 [000] b..1. 18.693915: netlink_extack: msg=act_mpls: Offload not supported when "dec_ttl" option is used
tc-182 [000] ..... 18.693921: netlink_extack: msg=cls_matchall: Failed to setup flow action
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add an extack message when
mirred action offload fails.
Currently, the failure cannot be triggered, but add a message in case
the action is extended in the future to support more than ingress/egress
mirror/redirect.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add extack messages when gact
action offload fails.
Example:
# echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action continue
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-181 [002] b..1. 105.493450: netlink_extack: msg=act_gact: Offload of "continue" action is not supported
tc-181 [002] ..... 105.493466: netlink_extack: msg=cls_matchall: Failed to setup flow action
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action reclassify
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-183 [002] b..1. 124.126477: netlink_extack: msg=act_gact: Offload of "reclassify" action is not supported
tc-183 [002] ..... 124.126489: netlink_extack: msg=cls_matchall: Failed to setup flow action
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action pipe action drop
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-185 [002] b..1. 137.097791: netlink_extack: msg=act_gact: Offload of "pipe" action is not supported
tc-185 [002] ..... 137.097804: netlink_extack: msg=cls_matchall: Failed to setup flow action
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The callback is used by various actions to populate the flow action
structure prior to offload. Pass extack to this callback so that the
various actions will be able to report accurate error messages to user
space.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The verbose flag was added in commit 81c7288b17 ("sched: cls: enable
verbose logging") to avoid suppressing logging of error messages that
occur "when the rule is not to be exclusively executed by the hardware".
However, such error messages are currently suppressed when setup of flow
action fails. Take the verbose flag into account to avoid suppressing
error messages. This is done by using the extack pointer initialized by
tc_cls_common_offload_init(), which performs the necessary checks.
Before:
# tc filter add dev dummy0 ingress pref 1 proto ip flower dst_ip 198.51.100.1 action police rate 100Mbit burst 10000
# tc filter add dev dummy0 ingress pref 2 proto ip flower verbose dst_ip 198.51.100.1 action police rate 100Mbit burst 10000
After:
# tc filter add dev dummy0 ingress pref 1 proto ip flower dst_ip 198.51.100.1 action police rate 100Mbit burst 10000
# tc filter add dev dummy0 ingress pref 2 proto ip flower verbose dst_ip 198.51.100.1 action police rate 100Mbit burst 10000
Warning: cls_flower: Failed to setup flow action.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The verbose flag was added in commit 81c7288b17 ("sched: cls: enable
verbose logging") to avoid suppressing logging of error messages that
occur "when the rule is not to be exclusively executed by the hardware".
However, such error messages are currently suppressed when setup of flow
action fails. Take the verbose flag into account to avoid suppressing
error messages. This is done by using the extack pointer initialized by
tc_cls_common_offload_init(), which performs the necessary checks.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
t-queue
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2022-04-07
Alexander Lobakin says:
This hunts down several places around packet templates/dummies for
switch rules which are either repetitive, fragile or just not
really readable code.
It's a common need to add new packet templates and to review such
changes as well, try to simplify both with the help of a pair
macros and aliases.
ice_find_dummy_packet() became very complex at this point with tons
of nested if-elses. It clearly showed this approach does not scale,
so convert its logics to the simple mask-match + static const array.
bloat-o-meter is happy about that (built w/ LLVM 13):
add/remove: 0/1 grow/shrink: 1/1 up/down: 2/-1058 (-1056)
Function old new delta
ice_fill_adv_dummy_packet 289 291 +2
ice_adv_add_update_vsi_list 201 - -201
ice_add_adv_rule 2950 2093 -857
Total: Before=414512, After=413456, chg -0.25%
add/remove: 53/52 grow/shrink: 0/0 up/down: 4660/-3988 (672)
RO Data old new delta
ice_dummy_pkt_profiles - 672 +672
Total: Before=37895, After=38567, chg +1.77%
Diffstat also looks nice, and adding new packet templates now takes
less lines.
We'll probably come out with dynamic template crafting in a while,
but for now let's improve what we have currently.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Potin Lai says:
====================
mdio: aspeed: Add Clause 45 support for Aspeed MDIO
This patch series add Clause 45 support for Aspeed MDIO driver, and
separate c22 and c45 implementation into different functions.
LINK: [v1] https://lore.kernel.org/all/20220329161949.19762-1-potin.lai@quantatw.com/
LINK: [v2] https://lore.kernel.org/all/20220406170055.28516-1-potin.lai@quantatw.com/
Changes v2 --> v3:
- sort local variable sequence in reverse Christmas tree format.
Changes v1 --> v2:
- add C45 to probe_capabilities
- break one patch into 3 small patches
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add Clause 45 support for Aspeed mdio driver.
Signed-off-by: Potin Lai <potin.lai@quantatw.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add following additional functions to move out the implementation from
aspeed_mdio_read() and aspeed_mdio_write().
c22:
- aspeed_mdio_read_c22()
- aspeed_mdio_write_c22()
c45:
- aspeed_mdio_read_c45()
- aspeed_mdio_write_c45()
Signed-off-by: Potin Lai <potin.lai@quantatw.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add aspeed_mdio_op() and aseed_mdio_get_data() for register accessing.
aspeed_mdio_op() handles operations, write command to control register,
then check and wait operations is finished (bit 31 is cleared).
aseed_mdio_get_data() fetchs the result value of operation from data
register.
Signed-off-by: Potin Lai <potin.lai@quantatw.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver for ATM Ambassador devices spews build warnings on
microblaze. The virt_to_bus() calls discard the volatile keyword.
The right thing to do would be to migrate this driver to a modern
DMA API but it seems unlikely anyone is actually using it.
There had been no fixes or functional changes here since
the git era begun.
In fact it sounds like the FW loading was broken from 2008
'til 2012 - see commit fcdc90b025 ("atm: forever loop loading
ambassador firmware").
Let's remove this driver, there isn't much changing in the APIs,
if users come forward we can apologize and revert.
Link: https://lore.kernel.org/all/20220321144013.440d7fc0@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan says:
====================
bnxt: Support XDP multi buffer
This series adds XDP multi buffer support, allowing MTU to go beyond
the page size limit.
v4: Rebase with latest net-next
v3: Simplify page mode buffer size calculation
Check to make sure XDP program supports multipage packets
v2: Fix uninitialized variable warnings in patch 1 and 10.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow aggregation buffers to be in place in the receive path and
allow XDP programs to be attached when using a larger than 4k MTU.
v3: Add a check to sure XDP program supports multipage packets.
Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the following features:
- Support for XDP_TX and XDP_DROP action when using xdp_buff
with frags
- Support for freeing all frags attached to an xdp_buff
- Cleanup of TX ring buffers after transmits complete
- Slight change in definition of bnxt_sw_tx_bd since nr_frags
and RX producer may both need to be used
- Clear out skb_shared_info at the end of the buffer
v2: Fix uninitialized variable warning in bnxt_xdp_buff_frags_free().
Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we have an xdp_buff with frags there needs to be a way to
convert that into a valid sk_buff in the event that XDP_PASS is
the resulting operation. This adds a new rx_skb_func when the
netdev has an MTU that prevents the packets from sitting in a
single page.
This also make sure that GRO/LRO stay disabled even when using
the aggregation ring for large buffers.
v3: Use BNXT_PAGE_MODE_BUF_SIZE for build_skb
Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>