linux/drivers/net
Jesper Dangaard Brouer 7324f5399b virtio_net: disable XDP_REDIRECT in receive_mergeable() case
The virtio_net code has three different RX code paths in receive_buf().
Two of these code paths can handle XDP, but one of them is broken for
at least XDP_REDIRECT.

Function(1): receive_big() does not support XDP.
Function(2): receive_small() supports XDP fully and uses build_skb().
Function(3): receive_mergeable() has broken XDP_REDIRECT support and
             uses napi_alloc_skb().

The simple explanation is that receive_mergeable() is broken because
it uses napi_alloc_skb(), which violates XDP's assumptions: XDP expects
the packet header and data in a single page, with enough tail room for
skb_shared_info.

The longer explanation is that receive_mergeable() tries to work
around and satisfy these XDP requirements, e.g. by having a function
xdp_linearize_page() that allocates and memcpy's RX buffers around (in
case the packet is scattered across multiple RX buffers).  This does
currently satisfy XDP_PASS, XDP_DROP and XDP_TX (but only because we
have not implemented bpf_xdp_adjust_tail yet).

The XDP_REDIRECT action combined with cpumap is broken, and causes
hard-to-debug crashes.  The main issue is that the RX packet does not
have the needed tail room (SKB_DATA_ALIGN(skb_shared_info)), causing
skb_shared_info to overlap the next packet's headroom (in which cpumap
stores info).
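
The tail-room requirement above can be sketched as a simple size
check.  This is a userspace illustration only, not driver code: the
sketch_* names are hypothetical, and the 64-byte cache line and the
320-byte shared-info size are assumed stand-ins for values that depend
on the kernel configuration and architecture.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for illustration; real ones are config/arch dependent. */
#define SKETCH_CACHE_LINE  64u
#define SKETCH_SHINFO_SIZE 320u /* stand-in for sizeof(struct skb_shared_info) */

/* Round x up to a cache-line multiple, in the spirit of SKB_DATA_ALIGN(). */
static size_t sketch_data_align(size_t x)
{
	return (x + SKETCH_CACHE_LINE - 1) & ~((size_t)SKETCH_CACHE_LINE - 1);
}

/*
 * True if a buffer of buf_len bytes can hold the headroom and the packet
 * data and still leave aligned tail room for the shared-info area.
 * If this is false, shared info placed after the packet would run past
 * the end of the buffer and into whatever follows it.
 */
static bool sketch_has_tailroom(size_t buf_len, size_t headroom, size_t pkt_len)
{
	size_t needed = headroom + pkt_len +
			sketch_data_align(SKETCH_SHINFO_SIZE);

	return needed <= buf_len;
}
```

With these assumed numbers, a 1000-byte packet with 256 bytes of
headroom fits in a 2048-byte buffer but not in a 1536-byte one.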

Reproducing the crash depends on the packet payload length and on
whether the RX buffer size happened to leave tail room for
skb_shared_info.  To make this even harder to troubleshoot, the RX
buffer size changes dynamically at runtime, based on an Exponentially
Weighted Moving Average (EWMA) over the packet length, when refilling
RX rings.
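
A minimal userspace sketch of how an EWMA over packet lengths can
drive the refill buffer size.  The sketch_* names, the 1/64 weight and
the clamping bounds are assumptions made for illustration; this does
not reproduce the driver's actual averaging code or parameters.

```c
#include <stddef.h>

/* Assumed smoothing factor: each new sample contributes 1/64. */
#define SKETCH_EWMA_WEIGHT 64u

struct sketch_ewma {
	size_t avg; /* current average packet length, 0 means "no samples" */
};

/* Fold one observed packet length into the moving average. */
static void sketch_ewma_add(struct sketch_ewma *e, size_t sample)
{
	if (e->avg == 0)
		e->avg = sample; /* first sample seeds the average */
	else
		e->avg = (e->avg * (SKETCH_EWMA_WEIGHT - 1) + sample) /
			 SKETCH_EWMA_WEIGHT;
}

/* Pick the next RX buffer length: the average, clamped to [min, max]. */
static size_t sketch_refill_len(const struct sketch_ewma *e,
				size_t min_len, size_t max_len)
{
	size_t len = e->avg;

	if (len < min_len)
		len = min_len;
	if (len > max_len)
		len = max_len;
	return len;
}
```

Because the chosen length follows recent traffic, two runs with
different packet mixes can end up with different buffer sizes, and so
with or without spare tail room.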

This patch only disables XDP_REDIRECT support in the
receive_mergeable() case, because it can cause a real crash.
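
The resulting support matrix, as described in this message, can be
modelled roughly as follows.  This is an illustrative model of the
text, not the patch itself: the sketch_* enums and the function are
hypothetical and do not correspond to identifiers in virtio_net.c.

```c
#include <stdbool.h>

/* Hypothetical labels for the three RX paths named in this message. */
enum sketch_rx_path {
	SKETCH_RX_BIG,       /* receive_big() */
	SKETCH_RX_SMALL,     /* receive_small() */
	SKETCH_RX_MERGEABLE  /* receive_mergeable() */
};

/* Hypothetical labels for the XDP verdicts discussed in this message. */
enum sketch_xdp_act {
	SKETCH_XDP_DROP,
	SKETCH_XDP_PASS,
	SKETCH_XDP_TX,
	SKETCH_XDP_REDIRECT
};

/* Which verdicts each path is described as handling after this patch. */
static bool sketch_xdp_supported(enum sketch_rx_path path,
				 enum sketch_xdp_act act)
{
	switch (path) {
	case SKETCH_RX_SMALL:
		return true;                       /* full XDP support */
	case SKETCH_RX_MERGEABLE:
		return act != SKETCH_XDP_REDIRECT; /* redirect disabled */
	case SKETCH_RX_BIG:
	default:
		return false;                      /* no XDP support */
	}
}
```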

IMHO we should consider NOT supporting XDP in receive_mergeable() at
all, because the principles behind XDP are to gain speed by (1) code
simplicity, (2) sacrificing memory and (3) where possible moving
runtime checks to setup time.  These principles are clearly being
violated in receive_mergeable(), which e.g. tracks the average buffer
size at runtime to reduce memory consumption.

In the longer run, we should consider introducing a separate receive
function when attaching an XDP program, and also change the memory
model to be compatible with XDP when attaching an XDP prog.

Fixes: 186b3c998c ("virtio-net: support XDP_REDIRECT")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21 15:09:29 -05:00
..
appletalk
arcnet
bonding net: bonding: Replace mac address parsing 2017-12-20 12:47:29 -05:00
caif net: caif: remove redundant re-assignment of pointer pfrm 2018-01-22 10:51:56 -05:00
can can: migrate documentation to restructured text 2018-01-26 10:46:44 +01:00
cris
dsa net: dsa: mv88e6xxx: Free ATU/VTU irq only when there is chip irq 2018-01-19 15:57:02 -05:00
ethernet mlx5-fixes-2018-02-20 2018-02-21 14:57:35 -05:00
fddi
fjes
hamradio
hippi
hyperv hv_netvsc: Use the num_online_cpus() for channel limit 2018-01-22 16:24:08 -05:00
ieee802154 vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
ipvlan ipvlan: remove excessive packet scrubbing 2017-12-15 11:36:53 -05:00
netdevsim netdevsim: fix overflow on the error path 2018-02-01 11:22:51 +01:00
phy net: phy: fix wrong mask to phy_modify() 2018-02-12 11:42:48 -05:00
plip
ppp vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
slip
team
usb r8152: set rx mode early when linking on 2018-02-02 19:19:00 -05:00
vmxnet3 vmxnet3: remove redundant initialization of pointer 'rq' 2018-02-01 14:54:28 -05:00
wan Merge branch 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-01-30 17:58:07 -08:00
wimax treewide: Use DEVICE_ATTR_WO 2018-01-09 16:34:35 +01:00
wireless vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
xen-netback
dummy.c
eql.c
geneve.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-01-29 10:15:51 -05:00
gtp.c
ifb.c
Kconfig
LICENSE.SRC
loopback.c
macsec.c macsec: restore uAPI after addition of GCM-AES-256 2018-01-22 15:40:16 -05:00
macvlan.c macvlan: Fix one possible double free 2018-01-02 13:30:14 -05:00
macvtap.c
Makefile
mdio.c
mii.c
netconsole.c
nlmon.c
ntb_netdev.c
rionet.c
sb1000.c
Space.c
sungem_phy.c
tap.c vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
thunderbolt.c net: thunderbolt: Run disconnect flow asynchronously when logout is received 2018-02-12 12:03:04 -05:00
tun.c tun: fix tun_napi_alloc_frags() frag allocator 2018-02-16 16:20:46 -05:00
veth.c
virtio_net.c virtio_net: disable XDP_REDIRECT in receive_mergeable() case 2018-02-21 15:09:29 -05:00
vrf.c net: vrf: Add support for sends to local broadcast address 2018-01-25 21:51:03 -05:00
vsockmon.c
vxlan.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-01-29 10:15:51 -05:00
xen-netfront.c xen-netfront: Fix race between device setup and open 2018-02-06 09:55:40 +01:00