linux/net
Magnus Karlsson a9744f7ca2 xsk: fix potential race in SKB TX completion code
There is a potential race in the TX completion code for the SKB
case. One process enters the sendmsg code of an AF_XDP socket in order
to send a frame. The execution eventually trickles down to the driver
that is told to send the packet. However, it decides to drop the
packet due to some error condition (e.g., rings full) and frees the
SKB. This will trigger the SKB destructor and a completion will be
sent to the AF_XDP user space through its
single-producer/single-consumer queues.

At the same time a TX interrupt has fired on another core and it
dispatches the TX completion code in the driver. It does its HW
specific things and ends up freeing the SKB associated with the
transmitted packet. This will trigger the SKB destructor and a
completion will be sent to the AF_XDP user space through its
single-producer/single-consumer queues. With a pseudo call stack, it
would look like this:

Core 1:
sendmsg() being called in the application
  netdev_start_xmit()
    Driver entered through ndo_start_xmit
      Driver decides to free the SKB for some reason (e.g., rings full)
        Destructor of SKB called
          xskq_produce_addr() is called to signal completion to user space

Core 2:
TX completion irq
  NAPI loop
    Driver irq handler for TX completions
      Frees the SKB
        Destructor of SKB called
          xskq_produce_addr() is called to signal completion to user space

We now have a violation of the single-producer/single-consumer
principle for our queues as there are two threads trying to produce at
the same time on the same queue.

Fixed by introducing a spin_lock in the destructor. In regards to the
performance, I get around 1.74 Mpps for txonly before and after the
introduction of the spinlock. There is of course some impact due to
the spin lock but it is in the less significant digits that are too
noisy for me to measure. But let us say that the version without the
spin lock got 1.745 Mpps in the best case and the version with 1.735
Mpps in the worst case, then that would mean a maximum drop in
performance of 0.5%.

Fixes: 35fcde7f8d ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-07-02 18:37:12 -07:00
..
6lowpan
9p treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
802
8021q net: fix use-after-free in GRO with ESP 2018-07-02 20:34:04 +09:00
appletalk Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
atm Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
ax25 Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
batman-adv Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-05-26 19:46:15 -04:00
bluetooth Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
bpf bpf: making bpf_prog_test run aware of possible data_end ptr change 2018-04-18 23:34:16 +02:00
bpfilter bpfilter: include bpfilter_umh in assembly instead of using objcopy 2018-06-28 21:39:16 +09:00
bridge Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-06-16 07:39:34 +09:00
caif Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
can Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
ceph The main piece is a set of libceph changes that revamps how OSD 2018-06-15 07:24:58 +09:00
core Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-07-02 11:18:28 -07:00
dcb treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
dccp Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
decnet Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
dns_resolver KEYS: DNS: limit the length of option strings 2018-04-17 15:17:41 -04:00
dsa net: dsa: add error handling for pskb_trim_rcsum 2018-06-11 14:19:38 -07:00
ethernet net: core: rework basic flow dissection helper 2018-05-08 00:02:36 -04:00
hsr
ieee802154 Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
ife net: sched: ife: check on metadata length 2018-04-22 21:12:00 -04:00
ipv4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-07-02 11:18:28 -07:00
ipv6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-07-02 11:18:28 -07:00
iucv Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
kcm Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
key Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
l2tp Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
l3mdev
lapb
llc Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
mac80211 mac80211: disable BHs/preemption in ieee80211_tx_control_port() 2018-06-29 09:39:08 +02:00
mac802154 net/mac802154: disambiguate mac80215 vs mac802154 trace events 2018-03-28 22:55:18 +02:00
mpls net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
ncsi net/ncsi: Use netdev_dbg for debug messages 2018-06-20 07:26:58 +09:00
netfilter netfilter: nf_conncount: fix garbage collection confirm race 2018-06-26 18:28:57 +02:00
netlabel audit: use inline function to get audit context 2018-05-14 17:24:18 -04:00
netlink Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
netrom Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
nfc Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
nsh nsh: fix infinite loop 2018-05-04 12:54:38 -04:00
openvswitch treewide: kzalloc() -> kcalloc() 2018-06-12 16:19:22 -07:00
packet Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
phonet Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
psample
qrtr Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
rds rds: clean up loopback rds_connections on netns deletion 2018-06-27 10:11:03 +09:00
rfkill rfkill: Create rfkill-none LED trigger 2018-05-23 11:26:45 +02:00
rose Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
rxrpc Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
sched net_sched: remove a bogus warning in hfsc 2018-06-23 10:58:46 +09:00
sctp Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
smc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-07-02 11:18:28 -07:00
strparser strparser: Remove early eaten to fix full tcp receive buffer stall 2018-06-28 21:37:26 +09:00
sunrpc NFS client bugfixes for Linux 4.18 2018-06-22 06:21:34 +09:00
switchdev
tipc Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
tls Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
unix Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
vmw_vsock Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
wimax
wireless nl80211: check nla_parse_nested() return values 2018-06-29 09:44:51 +02:00
x25 Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
xdp xsk: fix potential race in SKB TX completion code 2018-07-02 18:37:12 -07:00
xfrm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-06-06 18:39:49 -07:00
compat.c net: support compat 64-bit time in {s,g}etsockopt 2018-04-27 19:46:06 -04:00
Kconfig net: Introduce generic failover module 2018-05-28 22:59:54 -04:00
Makefile bpfilter: check compiler capability in Kconfig 2018-06-28 13:36:39 +09:00
socket.c net: handle NULL ->poll gracefully 2018-06-29 06:51:51 -07:00
sysctl_net.c net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00