linux/drivers/net
Vijay Pandurangan ce8c839b74 veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
Packets that arrive from real hardware devices have ip_summed ==
CHECKSUM_UNNECESSARY if the hardware verified the checksums, or
CHECKSUM_NONE if the packet is bad or the hardware could not verify it. The
current version of veth will replace CHECKSUM_NONE with
CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to
a veth device to be delivered to the application. This caused applications
at Twitter to receive corrupt data when network hardware was corrupting
packets.

We believe this rewrite was added as an optimization to skip computing and
verifying checksums for communication between containers. However, locally
generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as
written does nothing for them. As far as we can tell, after removing this
code, these packets are transmitted from one stack to another unmodified
(tcpdump shows invalid checksums on both sides, as expected), and they are
delivered correctly to applications. We didn’t test every possible network
configuration, but we tried a few common ones such as bridging containers,
using NAT between the host and a container, and routing from hardware
devices to containers. We have effectively deployed this change in
production at Twitter (by disabling RX checksum offloading on veth devices).
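
The behaviour described above can be sketched as a small user-space model.
This is not the kernel source: the enum, the helper names (veth_forward_old,
veth_forward_new, skips_verification) and the peer_rx_csum flag are
hypothetical, and the sketch assumes the old rewrite was gated on the
receiving device's RX checksum feature, as the workaround mentioned in the
message suggests.

```c
/* Simplified user-space model of the ip_summed handling; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Named after the kernel's ip_summed states; values are local to this model. */
enum model_summed {
    MODEL_CHECKSUM_NONE,        /* checksum not verified (or packet is bad) */
    MODEL_CHECKSUM_UNNECESSARY, /* checksum already verified, stack may skip */
    MODEL_CHECKSUM_COMPLETE,
    MODEL_CHECKSUM_PARTIAL      /* locally generated, checksum not yet filled in */
};

/* Assumed behaviour before the fix: NONE is promoted to UNNECESSARY
 * when the receiving peer advertises RX checksum offload. */
enum model_summed veth_forward_old(enum model_summed in, bool peer_rx_csum)
{
    if (in == MODEL_CHECKSUM_NONE && peer_rx_csum)
        return MODEL_CHECKSUM_UNNECESSARY;
    return in;
}

/* Behaviour after the fix: ip_summed is passed through unmodified. */
enum model_summed veth_forward_new(enum model_summed in, bool peer_rx_csum)
{
    (void)peer_rx_csum; /* no longer consulted */
    return in;
}

/* True when the receiving stack would trust the packet without verifying. */
bool skips_verification(enum model_summed s)
{
    return s == MODEL_CHECKSUM_UNNECESSARY;
}

/* Invariant of the fixed behaviour: no state is ever rewritten. */
void check_new_behaviour_is_identity(void)
{
    for (int s = MODEL_CHECKSUM_NONE; s <= MODEL_CHECKSUM_PARTIAL; s++) {
        assert(veth_forward_new((enum model_summed)s, true) == (enum model_summed)s);
        assert(veth_forward_new((enum model_summed)s, false) == (enum model_summed)s);
    }
}
```

In this model, a packet that arrives as MODEL_CHECKSUM_NONE leaves
veth_forward_old(..., true) marked as verified, which is the bug; with
peer_rx_csum set to false (the workaround) or with veth_forward_new it stays
unverified, so the receiving stack still checks it. MODEL_CHECKSUM_PARTIAL is
untouched in both versions, matching the observation about locally generated
packets.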

This code dates back to the first version of the driver, commit
<e314dbdc1c0dc6a548ecf> ("[NET]: Virtual ethernet device driver"), so I
suspect this bug occurred mostly because the driver API has evolved
significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix
packet checksumming") (in December 2010) fixed this for packets that get
created locally and sent to hardware devices, by not changing
CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming
in from hardware devices.

Co-authored-by: Evan Jones <ej@evanjones.ca>
Signed-off-by: Evan Jones <ej@evanjones.ca>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Phil Sutter <phil@nwl.cc>
Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vijay Pandurangan <vijayp@vijayp.ca>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-22 15:15:34 -05:00
appletalk
arcnet arcnet/com20020: add LEDS_CLASS dependency 2015-11-03 11:29:56 -05:00
bonding bonding: fix panic on non-ARPHRD_ETHER enslave failure 2015-11-07 13:17:32 -05:00
caif net: caif: check return value of alloc_netdev 2015-11-09 11:31:13 -05:00
can can: remove obsolete assignment for CAN protocol error type 2015-11-23 09:37:38 +01:00
cris
dsa net: dsa: mv88e6060: replace magic values with register defines 2015-11-15 20:16:16 -05:00
ethernet natsemi: add checks for dma mapping errors 2015-12-19 12:58:46 -05:00
fddi
fjes fjes: fix inconsistent indenting 2015-11-15 17:09:23 -05:00
hamradio mkiss: Fix use after free in mkiss_close(). 2015-12-18 16:03:03 -05:00
hippi
hyperv
ieee802154 spi: Updates for v4.4 2015-11-05 13:15:12 -08:00
ipvlan ipvlan: fix use after free of skb 2015-11-17 14:39:29 -05:00
irda
phy net: phy: mdio-mux: Check return value of mdiobus_alloc() 2015-12-14 14:27:40 -05:00
plip
ppp pptp: verify sockaddr_len in pptp_bind() and pptp_connect() 2015-12-15 00:29:34 -05:00
slip ppp, slip: Validate VJ compression slot parameters completely 2015-11-02 16:25:00 -05:00
team
usb net: usb: cdc_ncm: Adding Dell DW5813 LTE AT&T Mobile Broadband Card 2015-12-21 14:51:14 -05:00
vmxnet3 vmxnet3: fix checks for dma mapping errors 2015-12-01 15:19:16 -05:00
wan wan/x25: Fix use-after-free in x25_asy_open_tty() 2015-12-01 15:17:42 -05:00
wimax
wireless rtlwifi: rtl8821ae: Fix lockups on boot 2015-11-17 15:58:53 +02:00
xen-netback xen: features for 4.4-rc0 2015-11-04 17:32:42 -08:00
dummy.c net: dummy: add more features 2015-10-21 19:36:10 -07:00
eql.c
geneve.c geneve: Fix IPv6 xmit stats update. 2015-12-08 22:39:03 -05:00
ifb.c
Kconfig
LICENSE.SRC
loopback.c
macvlan.c macvlan: fix leak in macvlan_handle_frame 2015-11-17 14:39:29 -05:00
macvtap.c net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
Makefile
mdio.c
mii.c
netconsole.c
nlmon.c
ntb_netdev.c
rionet.c
sb1000.c
Space.c
sungem_phy.c
tun.c net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
veth.c veth: don’t modify ip_summed; doing so treats packets with bad checksums as good. 2015-12-22 15:15:34 -05:00
virtio_net.c virtio-net: Stop doing DMA from the stack 2015-12-07 16:10:53 +02:00
vrf.c vrf: fix double free and memory corruption on register_netdevice failure 2015-11-23 17:52:46 -05:00
vxlan.c vxlan: interpret IP headers for ECN correctly 2015-12-07 17:06:33 -05:00
xen-netfront.c xen: features for 4.4-rc0 2015-11-04 17:32:42 -08:00