rpc_clnt_test_and_add_xprt() invokes rpc_call_null_helper(), which
return the value of rpc_run_task() to "task". Since rpc_run_task() is
impossible to return an ERR pointer, there is no need to add the
IS_ERR() condition on "task" here. So we need to remove it.
Fixes: 7f55489058 ("SUNRPC: Allow addition of new transports to a struct rpc_clnt")
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Commit 018d26fcd1 ("cgroup, netclassid: periodically release file_lock
on classid") added a second cond_resched to write_classid indirectly by
update_classid_task. Remove the one in write_classid.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of linux/vermagic.h includes, so that MODULE_ARCH_VERMAGIC from
the arch header arch/x86/include/asm/module.h won't be redefined.
In file included from ./include/linux/module.h:30,
from drivers/net/ethernet/3com/3c515.c:56:
./arch/x86/include/asm/module.h:73: warning: "MODULE_ARCH_VERMAGIC"
redefined
73 | # define MODULE_ARCH_VERMAGIC MODULE_PROC_FAMILY
|
In file included from drivers/net/ethernet/3com/3c515.c:25:
./include/linux/vermagic.h:28: note: this is the location of the
previous definition
28 | #define MODULE_ARCH_VERMAGIC ""
|
Fixes: 6bba2e89a8 ("net/3com: Delete driver and module versions from 3com drivers")
Co-developed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Shannon Nelson <snelson@pensando.io> # ionic
Acked-by: Sebastian Reichel <sre@kernel.org> # power
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) flow_block_cb memleak in nf_flow_table_offload_del_cb(), from Roi Dayan.
2) Fix error path handling in nf_nat_inet_register_fn(), from Hillf Danton.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
batadv_v_ogm_process() invokes batadv_hardif_neigh_get(), which returns
a reference of the neighbor object to "hardif_neigh" with increased
refcount.
When batadv_v_ogm_process() returns, "hardif_neigh" becomes invalid, so
the refcount should be decreased to keep refcount balanced.
The reference counting issue happens in one exception handling paths of
batadv_v_ogm_process(). When batadv_v_ogm_orig_get() fails to get the
orig node and returns NULL, the refcnt increased by
batadv_hardif_neigh_get() is not decreased, causing a refcnt leak.
Fix this issue by jumping to "out" label when batadv_v_ogm_orig_get()
fails to get the orig node.
Fixes: 9323158ef9 ("batman-adv: OGMv2 - implement originators logic")
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
batadv_show_throughput_override() invokes batadv_hardif_get_by_netdev(),
which gets a batadv_hard_iface object from net_dev with increased refcnt
and its reference is assigned to a local pointer 'hard_iface'.
When batadv_store_throughput_override() returns, "hard_iface" becomes
invalid, so the refcount should be decreased to keep refcount balanced.
The issue happens in one error path of
batadv_store_throughput_override(). When batadv_parse_throughput()
returns NULL, the refcnt increased by batadv_hardif_get_by_netdev() is
not decreased, causing a refcnt leak.
Fix this issue by jumping to "out" label when batadv_parse_throughput()
returns NULL.
Fixes: 0b5ecc6811 ("batman-adv: add throughput override attribute to hard_ifaces")
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
batadv_show_throughput_override() invokes batadv_hardif_get_by_netdev(),
which gets a batadv_hard_iface object from net_dev with increased refcnt
and its reference is assigned to a local pointer 'hard_iface'.
When batadv_show_throughput_override() returns, "hard_iface" becomes
invalid, so the refcount should be decreased to keep refcount balanced.
The issue happens in the normal path of
batadv_show_throughput_override(), which forgets to decrease the refcnt
increased by batadv_hardif_get_by_netdev() before the function returns,
causing a refcnt leak.
Fix this issue by calling batadv_hardif_put() before the
batadv_show_throughput_override() returns in the normal path.
Fixes: 0b5ecc6811 ("batman-adv: add throughput override attribute to hard_ifaces")
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
and change to pseudorandom numbers, as this is a traffic dithering
operation that doesn't need crypto-grade.
The previous code operated in 4 steps:
1. Generate a random byte 0 <= rand_tq <= 255
2. Multiply it by BATADV_TQ_MAX_VALUE - tq
3. Divide by 255 (= BATADV_TQ_MAX_VALUE)
4. Return BATADV_TQ_MAX_VALUE - rand_tq
This would apperar to scale (BATADV_TQ_MAX_VALUE - tq) by a random
value between 0/255 and 255/255.
But! The intermediate value between steps 3 and 4 is stored in a u8
variable. So it's truncated, and most of the time, is less than 255, after
which the division produces 0. Specifically, if tq is odd, the product is
always even, and can never be 255. If tq is even, there's exactly one
random byte value that will produce a product byte of 255.
Thus, the return value is 255 (511/512 of the time) or 254 (1/512
of the time).
If we assume that the truncation is a bug, and the code is meant to scale
the input, a simpler way of looking at it is that it's returning a random
value between tq and BATADV_TQ_MAX_VALUE, inclusive.
Well, we have an optimized function for doing just that.
Fixes: 3c12de9a5c ("batman-adv: network coding - code and transmit packets if possible")
Signed-off-by: George Spelvin <lkml@sdf.org>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
The kernel provides a function to create random values from 0 - (max-1)
since commit f337db64af ("random32: add prandom_u32_max and convert open
coded users"). Simply use this function to replace code sections which use
prandom_u32 and a handcrafted method to map it to the correct range.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
The commit 04ae87a520 ("ftrace: Rework event_create_dir()") restructured
various macros in the ftrace framework. These changes also had the nice
side effect that the linux/types.h include is no longer necessary to define
some of the types used by these macros.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
checkpatch warns about a typo in the word bufFer which was introduced in
commit 2191c1bcbc64 ("batman-adv: kernel doc for types.h").
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
An use-after-free crash can be triggered when sending big packets over
vxlan over esp with esp offload enabled:
[] BUG: KASAN: use-after-free in ipv6_gso_pull_exthdrs.part.8+0x32c/0x4e0
[] Call Trace:
[] dump_stack+0x75/0xa0
[] kasan_report+0x37/0x50
[] ipv6_gso_pull_exthdrs.part.8+0x32c/0x4e0
[] ipv6_gso_segment+0x2c8/0x13c0
[] skb_mac_gso_segment+0x1cb/0x420
[] skb_udp_tunnel_segment+0x6b5/0x1c90
[] inet_gso_segment+0x440/0x1380
[] skb_mac_gso_segment+0x1cb/0x420
[] esp4_gso_segment+0xae8/0x1709 [esp4_offload]
[] inet_gso_segment+0x440/0x1380
[] skb_mac_gso_segment+0x1cb/0x420
[] __skb_gso_segment+0x2d7/0x5f0
[] validate_xmit_skb+0x527/0xb10
[] __dev_queue_xmit+0x10f8/0x2320 <---
[] ip_finish_output2+0xa2e/0x1b50
[] ip_output+0x1a8/0x2f0
[] xfrm_output_resume+0x110e/0x15f0
[] __xfrm4_output+0xe1/0x1b0
[] xfrm4_output+0xa0/0x200
[] iptunnel_xmit+0x5a7/0x920
[] vxlan_xmit_one+0x1658/0x37a0 [vxlan]
[] vxlan_xmit+0x5e4/0x3ec8 [vxlan]
[] dev_hard_start_xmit+0x125/0x540
[] __dev_queue_xmit+0x17bd/0x2320 <---
[] ip6_finish_output2+0xb20/0x1b80
[] ip6_output+0x1b3/0x390
[] ip6_xmit+0xb82/0x17e0
[] inet6_csk_xmit+0x225/0x3d0
[] __tcp_transmit_skb+0x1763/0x3520
[] tcp_write_xmit+0xd64/0x5fe0
[] __tcp_push_pending_frames+0x8c/0x320
[] tcp_sendmsg_locked+0x2245/0x3500
[] tcp_sendmsg+0x27/0x40
As on the tx path of vxlan over esp, skb->inner_network_header would be
set on vxlan_xmit() and xfrm4_tunnel_encap_add(), and the later one can
overwrite the former one. It causes skb_udp_tunnel_segment() to use a
wrong skb->inner_network_header, then the issue occurs.
This patch is to fix it by calling xfrm_output_gso() instead when the
inner_protocol is set, in which gso_segment of inner_protocol will be
done first.
While at it, also improve some code around.
Fixes: 7862b4058b ("esp: Add gso handlers for esp4 and esp6")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
For beet mode, when it's ipv6 inner address with nexthdrs set,
the packet format might be:
----------------------------------------------------
| outer | | dest | | | ESP | ESP |
| IP hdr | ESP | opts.| TCP | Data | Trailer | ICV |
----------------------------------------------------
Before doing gso segment in xfrm4_beet_gso_segment(), the same
thing is needed as it does in xfrm6_beet_gso_segment() in last
patch 'esp6: support ipv6 nexthdrs process for beet gso segment'.
v1->v2:
- remove skb_transport_offset(), as it will always return 0
in xfrm6_beet_gso_segment(), thank Sabrina's check.
Fixes: 384a46ea7b ("esp4: add gso_segment for esp4 beet mode")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
For beet mode, when it's ipv6 inner address with nexthdrs set,
the packet format might be:
----------------------------------------------------
| outer | | dest | | | ESP | ESP |
| IP6 hdr| ESP | opts.| TCP | Data | Trailer | ICV |
----------------------------------------------------
Before doing gso segment in xfrm6_beet_gso_segment(), it should
skip all nexthdrs and get the real transport proto, and set
transport_header properly.
This patch is to fix it by simply calling ipv6_skip_exthdr()
in xfrm6_beet_gso_segment().
v1->v2:
- remove skb_transport_offset(), as it will always return 0
in xfrm6_beet_gso_segment(), thank Sabrina's check.
Fixes: 7f9e40eb18 ("esp6: add gso_segment for esp6 beet mode")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
The variable rc is being assigned with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't need them, as we can use the current ingress opt
data instead. Setting them in syn_recv_sock() may causes
inconsistent mptcp socket status, as per previous commit.
Fixes: cc7972ea19 ("mptcp: parse and emit MP_CAPABLE option according to v1 spec")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If multiple CPUs races on the same req_sock in syn_recv_sock(),
flipping such field can cause inconsistent child socket status.
When racing, the CPU losing the req ownership may still change
the mptcp request socket mp_capable flag while the CPU owning
the request is cloning the socket, leaving the child socket with
'is_mptcp' set but no 'mp_capable' flag.
Such socket will stay with 'conn' field cleared, heading to oops
in later mptcp callback.
Address the issue tracking the fallback status in a local variable.
Fixes: 58b0991962 ("mptcp: create msk early")
Co-developed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Following splat can occur during self test:
BUG: KASAN: use-after-free in subflow_data_ready+0x156/0x160
Read of size 8 at addr ffff888100c35c28 by task mptcp_connect/4808
subflow_data_ready+0x156/0x160
tcp_child_process+0x6a3/0xb30
tcp_v4_rcv+0x2231/0x3730
ip_protocol_deliver_rcu+0x5c/0x860
ip_local_deliver_finish+0x220/0x360
ip_local_deliver+0x1c8/0x4e0
ip_rcv_finish+0x1da/0x2f0
ip_rcv+0xd0/0x3c0
__netif_receive_skb_one_core+0xf5/0x160
__netif_receive_skb+0x27/0x1c0
process_backlog+0x21e/0x780
net_rx_action+0x35f/0xe90
do_softirq+0x4c/0x50
[..]
This occurs when accessing subflow_ctx->conn.
Problem is that tcp_child_process() calls listen sockets'
sk_data_ready() notification, but it doesn't hold the listener
lock. Another cpu calling close() on the listener will then cause
transition of refcount to 0.
Fixes: 58b0991962 ("mptcp: create msk early")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When an interface is executing a self test, put the interface into
operative status testing.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to speed, duplex and dorment, report the testing status
in sysfs.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RFC 2863 defines the operational state testing. Add support for this
state, both as a IF_LINK_MODE_ and __LINK_STATE_.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit b6f6118901 ("ipv6: restrict IPV6_ADDRFORM operation") fixed a
problem found by syzbot an unfortunate logic error meant that it
also broke IPV6_ADDRFORM.
Rearrange the checks so that the earlier test is just one of the series
of checks made before moving the socket from IPv6 to IPv4.
Fixes: b6f6118901 ("ipv6: restrict IPV6_ADDRFORM operation")
Signed-off-by: John Haxby <john.haxby@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
These new helpers do not return 0 on success, they return the
encoded size. Thus they are not a drop-in replacement for the
old helpers.
Fixes: 5c266df527 ("SUNRPC: Add encoders for list item discriminators")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
It's not safe to use resources pointed to by the @send_wr of
ib_post_send() _after_ that function returns. Those resources are
typically freed by the Send completion handler, which can run before
ib_post_send() returns.
Thus the trace points currently around ib_post_send() in the
client's RPC/RDMA transport are a hazard, even when they are
disabled. Rearrange them so that they touch the Work Request only
_before_ ib_post_send() is invoked.
Fixes: ab03eff58e ("xprtrdma: Add trace points in RPC Call transmit paths")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Commit e28ce90083 ("xprtrdma: kmalloc rpcrdma_ep separate from
rpcrdma_xprt") erroneously removed a xprt_force_disconnect()
call from the "transport disconnect" path. The result was that the
client no longer responded to server-side disconnect requests.
Restore that call.
Fixes: e28ce90083 ("xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
When ESP encapsulation is enabled on a TCP socket, I'm replacing the
existing ->sk_destruct callback with espintcp_destruct. We still need to
call the old callback to perform the other cleanups when the socket is
destroyed. Save the old callback, and call it from espintcp_destruct.
Fixes: e27cca96cd ("xfrm: add espintcp (RFC 8229)")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
This xfrm_state_put call in esp4/6_gro_receive() will cause
double put for state, as in out_reset path secpath_reset()
will put all states set in skb sec_path.
So fix it by simply remove the xfrm_state_put call.
Fixes: 6ed69184ed ("xfrm: Reset secpath in xfrm failure")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
We need to set sk_state to CLOSED, else we will get following:
IPv4: Attempt to release TCP socket in state 3 00000000b95f109e
IPv4: Attempt to release TCP socket in state 10 00000000b95f109e
First one is from inet_sock_destruct(), second one from
mptcp_sk_clone failure handling. Setting sk_state to CLOSED isn't
enough, we also need to orphan sk so it has DEAD flag set.
Otherwise, a very similar warning is printed from inet_sock_destruct().
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Following snippet (replicated from syzkaller reproducer) generates
warning: "IPv4: Attempt to release TCP socket in state 1".
int main(void) {
struct sockaddr_in sin1 = { .sin_family = 2, .sin_port = 0x4e20,
.sin_addr.s_addr = 0x010000e0, };
struct sockaddr_in sin2 = { .sin_family = 2,
.sin_addr.s_addr = 0x0100007f, };
struct sockaddr_in sin3 = { .sin_family = 2, .sin_port = 0x4e20,
.sin_addr.s_addr = 0x0100007f, };
int r0 = socket(0x2, 0x1, 0x106);
int r1 = socket(0x2, 0x1, 0x106);
bind(r1, (void *)&sin1, sizeof(sin1));
connect(r1, (void *)&sin2, sizeof(sin2));
listen(r1, 3);
return connect(r0, (void *)&sin3, 0x4d);
}
Reason is that the newly generated mptcp socket is closed via the ulp
release of the tcp listener socket when its accept backlog gets purged.
To fix this, delay setting the ESTABLISHED state until after userspace
calls accept and via mptcp specific destructor.
Fixes: 58b0991962 ("mptcp: create msk early")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/9
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes it impossible that cmpri or cmpre values are set to the
value 16 which is not possible, because these are 4 bit values. We
currently run in an overflow when assigning the value 16 to it.
According to the standard a value of 16 can be interpreted as a full
elided address which isn't possible to set as compression value. A reason
why this cannot be set is that the current ipv6 header destination address
should never show up inside the segments of the rpl header. In this case we
run in a overflow and the address will have no compression at all. Means
cmpri or compre is set to 0.
As we handle cmpri and cmpre sometimes as unsigned char or 4 bit value
inside the rpl header the current behaviour ends in an invalid header
format. This patch simple use the best compression method if we ever run
into the case that the destination address is showed up inside the rpl
segments. We avoid the overflow handling and the rpl header is still valid,
even when we have the destination address inside the rpl segments.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tipc_rcv() invokes tipc_node_find() twice, which returns a reference of
the specified tipc_node object to "n" with increased refcnt.
When tipc_rcv() returns or a new object is assigned to "n", the original
local reference of "n" becomes invalid, so the refcount should be
decreased to keep refcount balanced.
The issue happens in some paths of tipc_rcv(), which forget to decrease
the refcnt increased by tipc_node_find() and will cause a refcnt leak.
Fix this issue by calling tipc_node_put() before the original object
pointed by "n" becomes invalid.
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tipc_crypto_rcv() invokes tipc_aead_get(), which returns a reference of
the tipc_aead object to "aead" with increased refcnt.
When tipc_crypto_rcv() returns, the original local reference of "aead"
becomes invalid, so the refcount should be decreased to keep refcount
balanced.
The issue happens in one error path of tipc_crypto_rcv(). When TIPC
message decryption status is EINPROGRESS or EBUSY, the function forgets
to decrease the refcnt increased by tipc_aead_get() and causes a refcnt
leak.
Fix this issue by calling tipc_aead_put() on the error path when TIPC
message decryption status is EINPROGRESS or EBUSY.
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
nr_add_node() invokes nr_neigh_get_dev(), which returns a local
reference of the nr_neigh object to "nr_neigh" with increased refcnt.
When nr_add_node() returns, "nr_neigh" becomes invalid, so the refcount
should be decreased to keep refcount balanced.
The issue happens in one normal path of nr_add_node(), which forgets to
decrease the refcnt increased by nr_neigh_get_dev() and causes a refcnt
leak. It should decrease the refcnt before the function returns like
other normal paths do.
Fix this issue by calling nr_neigh_put() before the nr_add_node()
returns.
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Hedberg says:
====================
pull request: bluetooth-next 2020-04-17
Here's the first bluetooth-next pull request for the 5.8 kernel:
- Added debugfs option to control MITM flag usage during pairing
- Added new BT_MODE socket option
- Added support for Qualcom QCA6390 device
- Added support for Realtek RTL8761B device
- Added support for mSBC audio codec over USB endpoints
- Added framework for Microsoft HCI vendor extensions
- Added new Read Security Information management command
- Fixes/cleanup to link layer privacy related code
- Various other smaller cleanups & fixes
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Utilize the xpo_release_rqst transport method to ensure that each
rqstp's svc_rdma_recv_ctxt object is released even when the server
cannot return a Reply for that rqstp.
Without this fix, each RPC whose Reply cannot be sent leaks one
svc_rdma_recv_ctxt. This is a 2.5KB structure, a 4KB DMA-mapped
Receive buffer, and any pages that might be part of the Reply
message.
The leak is infrequent unless the network fabric is unreliable or
Kerberos is in use, as GSS sequence window overruns, which result
in connection loss, are more common on fast transports.
Fixes: 3a88092ee3 ("svcrdma: Preserve Receive buffer until svc_rdma_sendto")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Currently, after the forward channel connection goes away,
backchannel operations are causing soft lockups on the server
because call_transmit_status's SOFTCONN logic ignores ENOTCONN.
Such backchannel Calls are aggressively retried until the client
reconnects.
Backchannel Calls should use RPC_TASK_NOCONNECT rather than
RPC_TASK_SOFTCONN. If there is no forward connection, the server is
not capable of establishing a connection back to the client, thus
that backchannel request should fail before the server attempts to
send it. Commit 58255a4e3c ("NFSD: NFSv4 callback client should
use RPC_TASK_SOFTCONN") was merged several years before
RPC_TASK_NOCONNECT was available.
Because setup_callback_client() explicitly sets NOPING, the NFSv4.0
callback connection depends on the first callback RPC to initiate
a connection to the client. Thus NFSv4.0 needs to continue to use
RPC_TASK_SOFTCONN.
Suggested-by: Trond Myklebust <trondmy@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: <stable@vger.kernel.org> # v4.20+
Pull networking fixes from David Miller:
1) Disable RISCV BPF JIT builds when !MMU, from Björn Töpel.
2) nf_tables leaves dangling pointer after free, fix from Eric Dumazet.
3) Out of boundary write in __xsk_rcv_memcpy(), fix from Li RongQing.
4) Adjust icmp6 message source address selection when routes have a
preferred source address set, from Tim Stallard.
5) Be sure to validate HSR protocol version when creating new links,
from Taehee Yoo.
6) CAP_NET_ADMIN should be sufficient to manage l2tp tunnels even in
non-initial namespaces, from Michael Weiß.
7) Missing release firmware call in mlx5, from Eran Ben Elisha.
8) Fix variable type in macsec_changelink(), caught by KASAN. Fix from
Taehee Yoo.
9) Fix pause frame negotiation in marvell phy driver, from Clemens
Gruber.
10) Record RX queue early enough in tun packet paths such that XDP
programs will see the correct RX queue index, from Gilberto Bertin.
11) Fix double unlock in mptcp, from Florian Westphal.
12) Fix offset overflow in ARM bpf JIT, from Luke Nelson.
13) marvell10g needs to soft reset PHY when coming out of low power
mode, from Russell King.
14) Fix MTU setting regression in stmmac for some chip types, from
Florian Fainelli.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (101 commits)
amd-xgbe: Use __napi_schedule() in BH context
mISDN: make dmril and dmrim static
net: stmmac: dwmac-sunxi: Provide TX and RX fifo sizes
net: dsa: mt7530: fix tagged frames pass-through in VLAN-unaware mode
tipc: fix incorrect increasing of link window
Documentation: Fix tcp_challenge_ack_limit default value
net: tulip: make early_486_chipsets static
dt-bindings: net: ethernet-phy: add desciption for ethernet-phy-id1234.d400
ipv6: remove redundant assignment to variable err
net/rds: Use ERR_PTR for rds_message_alloc_sgs()
net: mscc: ocelot: fix untagged packet drops when enslaving to vlan aware bridge
selftests/bpf: Check for correct program attach/detach in xdp_attach test
libbpf: Fix type of old_fd in bpf_xdp_set_link_opts
libbpf: Always specify expected_attach_type on program load if supported
xsk: Add missing check on user supplied headroom size
mac80211: fix channel switch trigger from unknown mesh peer
mac80211: fix race in ieee80211_register_hw()
net: marvell10g: soft-reset the PHY when coming out of low power
net: marvell10g: report firmware version
net/cxgb4: Check the return from t4_query_params properly
...
The Enhanced Connection Complete event is use in conjunction with LL
Privacy and not Extended Advertising.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
In commit 16ad3f4022 ("tipc: introduce variable window congestion
control"), we allow link window to change with the congestion avoidance
algorithm. However, there is a bug that during the slow-start if packet
retransmission occurs, the link will enter the fast-recovery phase, set
its window to the 'ssthresh' which is never less than 300, so the link
window suddenly increases to that limit instead of decreasing.
Consequently, two issues have been observed:
- For broadcast-link: it can leave a gap between the link queues that a
new packet will be inserted and sent before the previous ones, i.e. not
in-order.
- For unicast: the algorithm does not work as expected, the link window
jumps to the slow-start threshold whereas packet retransmission occurs.
This commit fixes the issues by avoiding such the link window increase,
but still decreasing if the 'ssthresh' is lowered.
Fixes: 16ad3f4022 ("tipc: introduce variable window congestion control")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The variable err is being initialized with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tls_build_proto() uses WRITE_ONCE() to assign a 'const' pointer to a
'non-const' pointer. Cleanups to the implementation of WRITE_ONCE() mean
that this will give rise to a compiler warning, just like a plain old
assignment would do:
| net/tls/tls_main.c: In function ‘tls_build_proto’:
| ./include/linux/compiler.h:229:30: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
| net/tls/tls_main.c:640:4: note: in expansion of macro ‘smp_store_release’
| 640 | smp_store_release(&saved_tcpv6_prot, prot);
| | ^~~~~~~~~~~~~~~~~
Drop the const qualifier from the local 'prot' variable, as it isn't
needed.
Cc: Boris Pismenny <borisp@mellanox.com>
Cc: Aviad Yehezkel <aviadye@mellanox.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Will Deacon <will@kernel.org>
nf_remove_net_hook() uses WRITE_ONCE() to assign a 'const' pointer to a
'non-const' pointer. Cleanups to the implementation of WRITE_ONCE() mean
that this will give rise to a compiler warning, just like a plain old
assignment would do:
| In file included from ./include/linux/export.h:43,
| from ./include/linux/linkage.h:7,
| from ./include/linux/kernel.h:8,
| from net/netfilter/core.c:9:
| net/netfilter/core.c: In function ‘nf_remove_net_hook’:
| ./include/linux/compiler.h:216:30: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
| *(volatile typeof(x) *)&(x) = (val); \
| ^
| net/netfilter/core.c:379:3: note: in expansion of macro ‘WRITE_ONCE’
| WRITE_ONCE(orig_ops[i], &dummy_ops);
| ^~~~~~~~~~
Follow the pattern used elsewhere in this file and add a cast to 'void *'
to squash the warning.
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Cc: "David S. Miller" <davem@davemloft.net>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Returning the error code via a 'int *ret' when the function returns a
pointer is very un-kernely and causes gcc 10's static analysis to choke:
net/rds/message.c: In function ‘rds_message_map_pages’:
net/rds/message.c:358:10: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized]
358 | return ERR_PTR(ret);
Use a typical ERR_PTR return instead.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf 2020-04-15
The following pull-request contains BPF updates for your *net* tree.
We've added 10 non-merge commits during the last 3 day(s) which contain
a total of 11 files changed, 238 insertions(+), 95 deletions(-).
The main changes are:
1) Fix offset overflow for BPF_MEM BPF_DW insn mapping on arm32 JIT,
from Luke Nelson and Xi Wang.
2) Prevent mprotect() to make frozen & mmap()'ed BPF map writeable
again, from Andrii Nakryiko and Jann Horn.
3) Fix type of old_fd in bpf_xdp_set_link_opts to int in libbpf and add
selftests, from Toke Høiland-Jørgensen.
4) Fix AF_XDP to check that headroom cannot be larger than the available
space in the chunk, from Magnus Karlsson.
5) Fix reset of XDP prog when expected_fd is set, from David Ahern.
6) Fix a segfault in bpftool's struct_ops command when BTF is not
available, from Daniel T. Lee.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg says:
====================
A couple of fixes:
* FTM responder policy netlink validation fix
(but the only user validates again later)
* kernel-doc fixes
* a fix for a race in mac80211 radio registration vs. userspace
* a mesh channel switch fix
* a fix for a syzbot reported kasprintf() issue
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In case LL Privacy is supported by the controller, it is also a good
idea to use the LE Enhanced Connection Complete event for getting all
information about the new connection and its addresses.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>