This patch added the id argument for setting the address flags in
pm_nl_ctl.
Usage:
pm_nl_ctl set id 1 flags backup
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch implemented a new function named pm_nl_set_endpoint(), wrapped
the PM netlink commands 'ip mptcp endpoint change flags' and 'pm_nl_ctl
set flags' in it, and used a new argument 'ip_mptcp' to choose which one
to use to set the flags of the PM endpoint.
'ip mptcp' used the ID number argument to find out the address to change
flags, while 'pm_nl_ctl' used the address and port number arguments. So
we need to parse the address ID from the PM dump output as well as the
address and port number.
Used this wrapper in do_transfer() instead of using the pm_nl_ctl command
directly.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch implemented a new function named pm_nl_show_endpoints(), wrapped
the PM netlink commands 'ip mptcp endpoint show' and 'pm_nl_ctl dump' in
it, used a new argument 'ip_mptcp' to choose which one to use to show all
the PM endpoints.
Used this wrapper in do_transfer() instead of using the pm_nl_ctl commands
directly.
The original 'pos+=5' in the remoing tests only works for the output of
'pm_nl_ctl show':
id 1 flags subflow 10.0.1.1
It doesn't work for the output of 'ip mptcp endpoint show':
10.0.1.1 id 1 subflow
So implemented a more flexible approach to get the address ID from the PM
dump output to fit for both commands.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch added four basic 'ip mptcp' wrappers:
pm_nl_set_limits()
pm_nl_add_endpoint()
pm_nl_del_endpoint()
pm_nl_flush_endpoint().
Wrapped the PM netlink commands 'ip mptcp' and 'pm_nl_ctl' in them, and
used a new argument 'ip_mptcp' to choose which one to use for setting the
PM limits, adding or deleting the PM endpoint.
Used the wrappers in all the selftests in mptcp_join.sh instead of using
the pm_nl_ctl commands directly.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch added the backup testcase using an address with a port number.
The original backup tests only work for the output of 'pm_nl_ctl dump'
without the port number. It chooses the last item in the dump to parse
the address in it, and in this case, the address is showed at the end
of the item.
But it doesn't work for the dump with the port number, in this case, the
port number is showed at the end of the item, not the address.
So implemented a more flexible approach to get the address and the port
number from the dump to fit for the port number case.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch added the port argument for setting the address flags in
pm_nl_ctl.
Usage:
pm_nl_ctl set 10.0.2.1 flags backup port 10100
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
It's illegal to use both port and non-signal flags for adding address.
But it's legal to use both of them for setting flags, which always uses
non-signal flags, backup or fullmesh.
This patch moves this non-signal flag with port check from
mptcp_pm_parse_addr() to mptcp_nl_cmd_add_addr(). Do the check only when
adding addresses, not setting flags or deleting addresses.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Iurman says:
====================
Support for the IOAM insertion frequency
The insertion frequency is represented as "k/n", meaning IOAM will be
added to {k} packets over {n} packets, with 0 < k <= n and 1 <= {k,n} <=
1000000. Therefore, it provides the following percentages of insertion
frequency: [0.0001% (min) ... 100% (max)].
Not only this solution allows an operator to apply dynamic frequencies
based on the current traffic load, but it also provides some
flexibility, i.e., by distinguishing similar cases (e.g., "1/2" and
"2/4").
"1/2" = Y N Y N Y N Y N ...
"2/4" = Y Y N N Y Y N N ...
====================
Link: https://lore.kernel.org/r/20220202142554.9691-1-justin.iurman@uliege.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add support for the IOAM insertion frequency inside its lwtunnel output
function. This patch introduces a new (atomic) counter for packets,
based on which the algorithm will decide if IOAM should be added or not.
Default frequency is "1/1" (i.e., applied to all packets) for backward
compatibility. The iproute2 patch is ready and will be submitted as soon
as this one is accepted.
Previous iproute2 command:
ip -6 ro ad fc00::1/128 encap ioam6 [ mode ... ] ...
New iproute2 command:
ip -6 ro ad fc00::1/128 encap ioam6 [ freq k/n ] [ mode ... ] ...
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add the insertion frequency uapi for IOAM lwtunnels.
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There are cases where clang compiler is packaged in a way
readelf is a symbolic link to llvm-readelf. In such cases,
llvm-readelf will be used instead of default binutils readelf,
and the following error will appear during libbpf build:
Warning: Num of global symbols in
/home/yhs/work/bpf-next/tools/testing/selftests/bpf/tools/build/libbpf/sharedobjs/libbpf-in.o (367)
does NOT match with num of versioned symbols in
/home/yhs/work/bpf-next/tools/testing/selftests/bpf/tools/build/libbpf/libbpf.so libbpf.map (383).
Please make sure all LIBBPF_API symbols are versioned in libbpf.map.
--- /home/yhs/work/bpf-next/tools/testing/selftests/bpf/tools/build/libbpf/libbpf_global_syms.tmp ...
+++ /home/yhs/work/bpf-next/tools/testing/selftests/bpf/tools/build/libbpf/libbpf_versioned_syms.tmp ...
@@ -324,6 +324,22 @@
btf__str_by_offset
btf__type_by_id
btf__type_cnt
+LIBBPF_0.0.1
+LIBBPF_0.0.2
+LIBBPF_0.0.3
+LIBBPF_0.0.4
+LIBBPF_0.0.5
+LIBBPF_0.0.6
+LIBBPF_0.0.7
+LIBBPF_0.0.8
+LIBBPF_0.0.9
+LIBBPF_0.1.0
+LIBBPF_0.2.0
+LIBBPF_0.3.0
+LIBBPF_0.4.0
+LIBBPF_0.5.0
+LIBBPF_0.6.0
+LIBBPF_0.7.0
libbpf_attach_type_by_name
libbpf_find_kernel_btf
libbpf_find_vmlinux_btf_id
make[2]: *** [Makefile:184: check_abi] Error 1
make[1]: *** [Makefile:140: all] Error 2
The above failure is due to different printouts for some ABS
versioned symbols. For example, with the same libbpf.so,
$ /bin/readelf --dyn-syms --wide tools/lib/bpf/libbpf.so | grep "LIBBPF" | grep ABS
134: 0000000000000000 0 OBJECT GLOBAL DEFAULT ABS LIBBPF_0.5.0
202: 0000000000000000 0 OBJECT GLOBAL DEFAULT ABS LIBBPF_0.6.0
...
$ /opt/llvm/bin/readelf --dyn-syms --wide tools/lib/bpf/libbpf.so | grep "LIBBPF" | grep ABS
134: 0000000000000000 0 OBJECT GLOBAL DEFAULT ABS LIBBPF_0.5.0@@LIBBPF_0.5.0
202: 0000000000000000 0 OBJECT GLOBAL DEFAULT ABS LIBBPF_0.6.0@@LIBBPF_0.6.0
...
The binutils readelf doesn't print out the symbol LIBBPF_* version and llvm-readelf does.
Such a difference caused libbpf build failure with llvm-readelf.
The proposed fix filters out all ABS symbols as they are not part of the comparison.
This works for both binutils readelf and llvm-readelf.
Reported-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220204214355.502108-1-yhs@fb.com
Add several tests to check bpf_core_types_are_compat() functionality:
- candidate type name exists and types match
- candidate type name exists but types don't match
- nested func protos at kernel recursion limit
- nested func protos above kernel recursion limit. Such bpf prog
is rejected during the load.
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220204005519.60361-3-mcroce@linux.microsoft.com
Adopt libbpf's bpf_core_types_are_compat() for kernel duty by adding
explicit recursion limit of 2 which is enough to handle 2 levels of
function prototypes.
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220204005519.60361-2-mcroce@linux.microsoft.com
Since commit b2eed9b588 ("arm64/kernel: kaslr: reduce module
randomization range to 2 GB"), for arm64 whether KASLR is enabled
or not, the module is placed within 2GB of the kernel region, so
s32 in bpf_kfunc_desc is sufficient to represente the offset of
module function relative to __bpf_call_base. The only thing needed
is to override bpf_jit_supports_kfunc_call().
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220130092917.14544-2-hotforest@gmail.com
Alex Elder says:
====================
net: ipa: improve RX buffer replenishing
This series revises the algorithm used for replenishing receive
buffers on RX endpoints. Currently there are two atomic variables
that track how many receive buffers can be sent to the hardware.
The new algorithm obviates the need for those, by just assuming we
always want to provide the hardware with buffers until it can hold
no more.
The first patch eliminates an atomic variable that's not required.
The next moves some code into the main replenish function's caller,
making one of the called function's arguments unnecessary. The
next six refactor things a bit more, adding a new helper function
that allows us to eliminate an additional atomic variable. And the
final two implement two more minor improvements.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Rather than tracking the number of receive buffer transactions that
have been submitted without a doorbell, just track the total number
of transactions that have been issued. Then ring the doorbell when
that number modulo the replenish batch size is 0.
The effect is roughly the same, but the new count is slightly more
interesting, and this approach will someday allow the replenish
batch size to be tuned at runtime.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replenishing is now solely driven by whether transactions are
available for a channel, and it doesn't really matter whether
we replenish before or after we deliver received packets to the
network stack.
Replenishing before delivering the payload adds a little latency.
Eliminate that by requesting a replenish after the payload is
delivered.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We no longer use the replenish_backlog atomic variable to decide
when we've got work to do providing receive buffers to hardware.
Basically, we try to keep the hardware as full as possible, all the
time. We keep supplying buffers until the hardware has no more
space for them.
As a result, we can get rid of the replenish_backlog field and the
atomic operations performed on it.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create a new function that returns true if all transactions for a
channel are available for use.
Use it in ipa_endpoint_replenish_enable() to see whether to start
replenishing, and in ipa_endpoint_replenish() to determine whether
it's necessary after a failure to schedule delayed work to ensure a
future replenish attempt occurs.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rather than determining when to stop replenishing using the
replenish backlog, just stop when we have exhausted all available
transactions.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When replenishing, have ipa_endpoint_replenish() allocate a
transaction, and pass that to ipa_endpoint_replenish_one() to fill.
Then, if that produces no error, commit the transaction within the
replenish loop as well. In this way we can distinguish between
transaction failures and buffer allocation/mapping failures.
Failure to allocate a transaction simply means the hardware already
has as many receive buffers as it can hold. In that case we can
break out of the replenish loop because there's nothing more to do.
If we fail to allocate or map pages for the receive buffer, just
try again later.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Decide whether the doorbell should be signaled when committing a
replenish transaction in the main replenish loop, rather than in
ipa_endpoint_replenish_one(). This is a step to facilitate the
next patch.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Three spots call ipa_endpoint_replenish(), and just one of those
requests that the backlog be incremented after completing the
replenish operation.
Instead, have the caller increment the backlog, and get rid of the
add_one argument to ipa_endpoint_replenish().
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
A transaction failure only occurs if no more transactions are
available for an endpoint. It's a very cheap test.
When replenishing an RX endpoint buffer, there's no point in
allocating pages if transactions are exhausted. So don't bother
doing so unless the transaction allocation succeeds.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The replenish_saved field keeps track of the number of times a new
buffer is added to the backlog when replenishing is disabled. We
don't really use it though, so there's no need for us to track it
separately. Whether replenishing is enabled or not, we can simply
increment the backlog.
Get rid of replenish_saved, and initialize and increment the backlog
where it would have otherwise been used.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
TLS recvmsg() passes user pages as destination for decrypt.
The decrypt operation is repeated record by record, each
record being 16kB, max. TLS allocates an sg_table and uses
iov_iter_get_pages() to populate it with enough pages to
fit the decrypted record.
Even though we decrypt a single message at a time we size
the sg_table based on the entire length of the iovec.
This leads to unnecessarily large allocations, risking
triggering OOM conditions.
Use iov_iter_truncate() / iov_iter_reexpand() to construct
a "capped" version of iov_iter_npages(). Alternatively we
could parametrize iov_iter_npages() to take the size as
arg instead of using i->count, or do something else..
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Populate the supported interfaces and MAC capabilities for the Realtek
rtl8365 DSA switch and remove the old validate implementation to allow
DSA to use phylink_generic_validate() for this switch driver.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Nguyen says:
====================
40GbE Intel Wired LAN Driver Updates 2022-02-03
This series contains updates to the i40e client header file and driver.
Mateusz disables HW TC offload by default.
Joe Damato removes a no longer used statistic.
Jakub Kicinski removes an unused enum from the client header file.
Jedrzej changes some admin queue commands to occur under atomic context
and adds new functions for admin queue MAC VLAN filters to avoid a
potential race that could occur due storing results in a structure that
could be overwritten by the next admin queue call.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to have the datapath call the always-true comment match stub.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This allows to replace a tcp option with nop padding to selectively disable
a particular tcp option.
Optstrip mode is chosen when userspace passes the exthdr expression with
neither a source nor a destination register attribute.
This is identical to xtables TCPOPTSTRIP extension.
The only difference is that TCPOPTSTRIP allows to pass in a bitmap
of options to remove rather than a single number.
Unlike TCPOPTSTRIP this expression can be used multiple times
in the same rule to get the same effect.
We could add a new nested attribute later on in case there is a
use case for single-expression-multi-remove.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Instead of exposing the four hooks individually use a sinle hook ops
structure.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
These no longer register/unregister a meaningful structure so remove it.
Cc: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The nat module already exposes a few functions to the conntrack core.
Move the nat extension destroy hook to it.
After this, no conntrack extension needs a destroy hook.
'struct nf_ct_ext_type' and the register/unregister api can be removed
in a followup patch.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
No need to specify this in the registration modules, we already
collect all sizes for build-time checks on the maximum combined size.
After this change, all extensions except nat have no meaningful content
in their nf_ct_ext_type struct definition.
Next patch handles nat, this will then allow to remove the dynamic
register api completely.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
All extensions except one need 8 byte alignment, so just make that the
default.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This info could be useful to improve traffic analysis.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The udp_error function verifies the checksum of incoming UDP packets if
one is set. This has the desirable side effect of setting skb->ip_summed
to CHECKSUM_COMPLETE, signalling that this verification need not be
repeated further up the stack.
Conversely, when the UDP checksum is empty, which is perfectly legal (at least
inside IPv4), udp_error previously left no trace that the checksum had been
deemed acceptable.
This was a problem in particular for nf_reject_ipv4, which verifies the
checksum in nf_send_unreach() before sending ICMP_DEST_UNREACH. It makes
no accommodation for zero UDP checksums unless they are already marked
as CHECKSUM_UNNECESSARY.
This commit ensures packets with empty UDP checksum are marked as
CHECKSUM_UNNECESSARY, which is explicitly recommended in skbuff.h.
Signed-off-by: Kevin Mitchell <kevmitch@arista.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Convert lan966x to use the mac_select_interface instead of
phylink_set_pcs.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20220202114949.833075-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Using tos 0x1 with 'ip route get <IPv4 address> ...' doesn't test much
of the tos option handling: 0x1 just sets an ECN bit, which is cleared
by inet_rtm_getroute() before doing the fib lookup. Let's use 0x10
instead, which is actually taken into account in the route lookup (and
is less surprising for the reader).
For consistency, use 0x10 for the IPv6 route lookup too (IPv6 currently
doesn't clear ECN bits, but might do so in the future).
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Link: https://lore.kernel.org/r/d61119e68d01ba7ef3ba50c1345a5123a11de123.1643815297.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Although both iproute2 and the kernel accept 1 and 2 as tos values for
new routes, those are invalid. These values only set ECN bits, which
are ignored during IPv4 fib lookups. Therefore, no packet can actually
match such routes. This selftest therefore only succeeds because it
doesn't verify that the new routes do actually work in practice (it
just checks if the routes are offloaded or not).
It makes more sense to use tos values that don't conflict with ECN.
This way, the selftest won't be affected if we later decide to warn or
even reject invalid tos configurations for new routes.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/5e43b343720360a1c0e4f5947d9e917b26f30fbf.1643826556.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
__dev_alloc_name() allocates a private zeroed page,
then sets bits in it while iterating through net devices.
It can use __set_bit() to avoid unnecessary locked operations.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220203064609.3242863-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
While the stackleak plugin was already using notrace, objtool is now a
bit more picky. Update the notrace uses to noinstr. Silences the
following objtool warnings when building with:
CONFIG_DEBUG_ENTRY=y
CONFIG_STACK_VALIDATION=y
CONFIG_VMLINUX_VALIDATION=y
CONFIG_GCC_PLUGIN_STACKLEAK=y
vmlinux.o: warning: objtool: do_syscall_64()+0x9: call to stackleak_track_stack() leaves .noinstr.text section
vmlinux.o: warning: objtool: do_int80_syscall_32()+0x9: call to stackleak_track_stack() leaves .noinstr.text section
vmlinux.o: warning: objtool: exc_general_protection()+0x22: call to stackleak_track_stack() leaves .noinstr.text section
vmlinux.o: warning: objtool: fixup_bad_iret()+0x20: call to stackleak_track_stack() leaves .noinstr.text section
vmlinux.o: warning: objtool: do_machine_check()+0x27: call to stackleak_track_stack() leaves .noinstr.text section
vmlinux.o: warning: objtool: .text+0x5346e: call to stackleak_erase() leaves .noinstr.text section
vmlinux.o: warning: objtool: .entry.text+0x143: call to stackleak_erase() leaves .noinstr.text section
vmlinux.o: warning: objtool: .entry.text+0x10eb: call to stackleak_erase() leaves .noinstr.text section
vmlinux.o: warning: objtool: .entry.text+0x17f9: call to stackleak_erase() leaves .noinstr.text section
Note that the plugin's addition of calls to stackleak_track_stack() from
noinstr functions is expected to be safe, as it isn't runtime
instrumentation and is self-contained.
Cc: Alexander Popov <alex.popov@linux.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
and ieee802154.
Current release - regressions:
- Partially revert "net/smc: Add netlink net namespace support",
fix uABI breakage
- netfilter:
- nft_ct: fix use after free when attaching zone template
- nft_byteorder: track register operations
Previous releases - regressions:
- ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback
- phy: qca8081: fix speeds lower than 2.5Gb/s
- sched: fix use-after-free in tc_new_tfilter()
Previous releases - always broken:
- tcp: fix mem under-charging with zerocopy sendmsg()
- tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data()
- neigh: do not trigger immediate probes on NUD_FAILED from
neigh_managed_work, avoid a deadlock
- bpf: use VM_MAP instead of VM_ALLOC for ringbuf, avoid KASAN
false-positives
- netfilter: nft_reject_bridge: fix for missing reply from prerouting
- smc: forward wakeup to smc socket waitqueue after fallback
- ieee802154:
- return meaningful error codes from the netlink helpers
- mcr20a: fix lifs/sifs periods
- at86rf230, ca8210: stop leaking skbs on error paths
- macsec: add missing un-offload call for NETDEV_UNREGISTER of parent
- ax25: add refcount in ax25_dev to avoid UAF bugs
- eth: mlx5e:
- fix SFP module EEPROM query
- fix broken SKB allocation in HW-GRO
- IPsec offload: fix tunnel mode crypto for non-TCP/UDP flows
- eth: amd-xgbe:
- fix skb data length underflow
- ensure reset of the tx_timer_active flag, avoid Tx timeouts
- eth: stmmac: fix runtime pm use in stmmac_dvr_remove()
- eth: e1000e: handshake with CSME starts from Alder Lake platforms
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmH8X9UACgkQMUZtbf5S
IrsxuhAAlAvFHGL6y5Y2gAmhKvVUvCYjiIJBcvk7R66CwYVRxofvlhmxi6GM/Czs
9SrVSaN4RXu3p3d7UtAl1gAQwHqzLIHH3m2g5dSKVvHZWQgkm/+n74x0aZQ9Fll7
mWs9uu5fWsQr/qZBnnjoQTvUxRUNVd4trBy7nXGzkNqJL5j0+2TT4BhH4qalhE28
iPc9YFCyKPdjoWFksteZqD3hAQbXxK/xRRr6xuvFHENlZdEHM6ARftHnJthTG/fY
32rdn9YUkQ9lNtOBJNMN9yP2z1B7TcxASBqjjk55I7XtT1QAI9/PskszavHC0hOk
BCSMX779bLNW4+G0wiSKVB4tq4tvswtawq8Hxa6zdU4TKIzfQ84ZL/Nf66GtH+4W
C0mbZohmyJV9hQFkNT0ZLeihljd7i4BkDttlbK3uz2IL9tHeX3uSo5V7AgS/Xaf6
frXgbGgjQTaR6IL9AUhfN3GTCx60mzpH/aRpFho8A5xAl3EtHWCJcRhbY/CEhQBR
zyCndcLcG5mUzbhx/TxlKrrpRCLxqCUG/Tsb2wCh5jMxO1zonW9Hhv4P1ie6EFuI
h+XiJT2WWObS/KTze9S86WOR0zcqrtRqaOGJlNB+/+K8ClZU8UsDTFXLQ0dqpVZF
Mvp7VchBzyFFJrrvO8WkkJgLTKdaPJmM9wuWUZb4J6d2MWlmDkE=
=qKvf
-----END PGP SIGNATURE-----
Merge tag 'net-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf, netfilter, and ieee802154.
Current release - regressions:
- Partially revert "net/smc: Add netlink net namespace support", fix
uABI breakage
- netfilter:
- nft_ct: fix use after free when attaching zone template
- nft_byteorder: track register operations
Previous releases - regressions:
- ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback
- phy: qca8081: fix speeds lower than 2.5Gb/s
- sched: fix use-after-free in tc_new_tfilter()
Previous releases - always broken:
- tcp: fix mem under-charging with zerocopy sendmsg()
- tcp: add missing tcp_skb_can_collapse() test in
tcp_shift_skb_data()
- neigh: do not trigger immediate probes on NUD_FAILED from
neigh_managed_work, avoid a deadlock
- bpf: use VM_MAP instead of VM_ALLOC for ringbuf, avoid KASAN
false-positives
- netfilter: nft_reject_bridge: fix for missing reply from prerouting
- smc: forward wakeup to smc socket waitqueue after fallback
- ieee802154:
- return meaningful error codes from the netlink helpers
- mcr20a: fix lifs/sifs periods
- at86rf230, ca8210: stop leaking skbs on error paths
- macsec: add missing un-offload call for NETDEV_UNREGISTER of parent
- ax25: add refcount in ax25_dev to avoid UAF bugs
- eth: mlx5e:
- fix SFP module EEPROM query
- fix broken SKB allocation in HW-GRO
- IPsec offload: fix tunnel mode crypto for non-TCP/UDP flows
- eth: amd-xgbe:
- fix skb data length underflow
- ensure reset of the tx_timer_active flag, avoid Tx timeouts
- eth: stmmac: fix runtime pm use in stmmac_dvr_remove()
- eth: e1000e: handshake with CSME starts from Alder Lake platforms"
* tag 'net-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
ax25: fix reference count leaks of ax25_dev
net: stmmac: ensure PTP time register reads are consistent
net: ipa: request IPA register values be retained
dt-bindings: net: qcom,ipa: add optional qcom,qmp property
tools/resolve_btfids: Do not print any commands when building silently
bpf: Use VM_MAP instead of VM_ALLOC for ringbuf
net, neigh: Do not trigger immediate probes on NUD_FAILED from neigh_managed_work
tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data()
net: sparx5: do not refer to skb after passing it on
Partially revert "net/smc: Add netlink net namespace support"
net/mlx5e: Avoid field-overflowing memcpy()
net/mlx5e: Use struct_group() for memcpy() region
net/mlx5e: Avoid implicit modify hdr for decap drop rule
net/mlx5e: IPsec: Fix tunnel mode crypto offload for non TCP/UDP traffic
net/mlx5e: IPsec: Fix crypto offload for non TCP/UDP encapsulated traffic
net/mlx5e: Don't treat small ceil values as unlimited in HTB offload
net/mlx5: E-Switch, Fix uninitialized variable modact
net/mlx5e: Fix handling of wrong devices during bond netevent
net/mlx5e: Fix broken SKB allocation in HW-GRO
net/mlx5e: Fix wrong calculation of header index in HW_GRO
...
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmH8VkAUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNyLQ/9GAsvvB7PYeUpj0CGLMaAT9Hys5l+
WPjP+NU+HF+r+AUsSCJwKkK4yKnpEDK9nidOdkwiYjAO/83yl7kkBPGRgisQep1A
tEbuJ5vqZnR59jLxNKCmQE0gY+gByjk3jZIVFLSWwG/ho7s1LQyNoYpm7rbFIgAz
6qe7IR1nsATzxRhDoJI3RIPlQjzhM1qEX9PBEtwW+LLieShtvMc+ijdiUw7bqNl9
RTM6hRf4fTX4jLHtxfZYZ99bHEjIseksFbSAnjKxxkt0W5EFha73VX8hjwnG24J/
XZQAhsyvpQmcZKJGZPWUSa+UFcytoauMnNdgJOQw7TcMT4Y2mMuvcoZ/KkFtDjdr
30qhp46/gml2yqnByXRfzshGQm9E4ZoqSCn+lFWAfjlrhcqdgZFKpILpwMixbdin
NgTA/pbwXovrlho8UflB0sbDMrbyV3qNGZXD/4hRg66Vm3F7ipgqPBbM89qoDniG
CXiQnmRQ+rwcftyeE7me7+kD6djYTWOfEY5HRNiCf9NhnQG8GP7YzZ4KACxJ2PwQ
R9+Egc9nAl4UG6PrEjZeud81rLzc+ws2SJLokxOcIGnid8lZidf83HfWekAmRloA
J5+tmpx5q26ug/j2uXV/rp36xaQWhjJrrnhEKamIYYAVioXa9srRhtz3qRI8r/13
mrZ5hu4le8aC/5s=
=AyIM
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20220203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fix from Paul Moore:
"One small SELinux patch to ensure that a policy structure field is
properly reset after freeing so that we don't inadvertently do a
double-free on certain error conditions"
* tag 'selinux-pr-20220203' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: fix double free of cond_list on error paths
This Kselftest fixes update for Linux 5.17-rc3 consists of important
fixes to several tests and documentation clarification on running
mainline kselftest on stable releases. A few notable fixes:
- fix kselftest run hang due to child processes that haven't been
terminated. Fix signals all child processes
- fix false pass/fail results from vdso_test_abi, openat2, mincore
- build failures when using -j (multiple jobs) option
- exec test build failure due to incorrect build rule for a run-time
created "pipe"
- zram test fixes related to interaction with zram-generator to
make sure zram test to coordinate deleted with zram-generator
- zram test compression ratio calculation fix and skipping
max_comp_streams.
- increasing rtc test timeout
- cpufreq test to write test results to stdout which will necessary on
automated test systems
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmH8OcAACgkQCwJExA0N
QxzTkg//YF+iSc1aao8nmUOvsK6oc1RBwIj3hkLUHjP3H1qFkm9OYxzTcLYGcnyo
JahxNjeVoDuVESYx/AyLZ568aCCxRXJEDzNm5eIVNfBrGtVTfFPM19HwC/R3I1Ew
KJUruxRx++8AvI1RYMEzsDumKpLVe3bor7sj3CcO1E9/qkOoUAukxt7FVmSNMZlW
qYCDgc3yBa/XrImHCbJdZc4CUhbmh+l05sZgG3V3fxQSlgfIClY0Qg8W7Ucu+r4S
6W5nwoEJIG32Zl2avaZ2VTF4T+CTQB70g/n4OBEX8TAxuIIi9W12N6zMZ76q8qbp
iRs7UqgUSqPWdz/3ZHiQ5gy0WsCJ/W1379TtiG0doEeU2vwZ6fR8NMn+2FrEH18W
xHBPOWeN+PkVAFjUeoyt1c5OGNprK6EEE2kQ6CLoBTwlKWLDQ87ZNPf84uAsez1x
G0m7AX/T5adeTLoZfEXNXVY4OROs0nxbAkGC5ghVtKQu1giMcUKj+KHUowgj5OIJ
Zaj+uSPiN3hnwj5L2fk+orOEC+3bZxVoQzqSB2Bs6stQOQFZLP18xHIjQIDZoQuC
O512ZY+dwMzSyTi2KoQmb/M0Ft3gSfhRVXc7gfEFfOvC3ZbqRGFfeES1ZILup3rZ
izMTJBDOe+BGSm/GCFPHxu36YfdPqiAyHBTVSYy5EhLGpxafZvI=
=rH2m
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-fixes-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fixes from Shuah Khan:
"Important fixes to several tests and documentation clarification on
running mainline kselftest on stable releases. A few notable fixes:
- fix kselftest run hang due to child processes that haven't been
terminated. Fix signals all child processes
- fix false pass/fail results from vdso_test_abi, openat2, mincore
- build failures when using -j (multiple jobs) option
- exec test build failure due to incorrect build rule for a run-time
created "pipe"
- zram test fixes related to interaction with zram-generator to make
sure zram test to coordinate deleted with zram-generator
- zram test compression ratio calculation fix and skipping
max_comp_streams.
- increasing rtc test timeout
- cpufreq test to write test results to stdout which will necessary
on automated test systems"
* tag 'linux-kselftest-fixes-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kselftest: Fix vdso_test_abi return status
selftests: skip mincore.check_file_mmap when fs lacks needed support
selftests: openat2: Skip testcases that fail with EOPNOTSUPP
selftests: openat2: Add missing dependency in Makefile
selftests: openat2: Print also errno in failure messages
selftests: futex: Use variable MAKE instead of make
selftests/exec: Remove pipe from TEST_GEN_FILES
selftests/zram: Adapt the situation that /dev/zram0 is being used
selftests/zram01.sh: Fix compression ratio calculation
selftests/zram: Skip max_comp_streams interface on newer kernel
docs/kselftest: clarify running mainline tests on stables
kselftest: signal all child processes
selftests: cpufreq: Write test output to stdout as well
selftests: rtc: Increase test timeout so that all tests run
The ice driver provides QoS information to auxiliary drivers
through the exported function ice_get_qos_params. This function
doesn't currently support L3 DSCP QoS.
Add the necessary defines, structure elements and code to support
DSCP QoS through the IIDC functions.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The previous commit d01ffb9eee ("ax25: add refcount in ax25_dev
to avoid UAF bugs") introduces refcount into ax25_dev, but there
are reference leak paths in ax25_ctl_ioctl(), ax25_fwd_ioctl(),
ax25_rt_add(), ax25_rt_del() and ax25_rt_opt().
This patch uses ax25_dev_put() and adjusts the position of
ax25_addr_ax25dev() to fix reference cout leaks of ax25_dev.
Fixes: d01ffb9eee ("ax25: add refcount in ax25_dev to avoid UAF bugs")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20220203150811.42256-1-duoming@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>