linux/net
Sheng Lan 5845f70638 net: netem: fix skb length BUG_ON in __skb_to_sgvec
It can be reproduced by following steps:
1. virtio_net NIC is configured with gso/tso on
2. configure nginx as http server with an index file bigger than 1M bytes
3. use tc netem to produce duplicate packets and delay:
   tc qdisc add dev eth0 root netem delay 100ms 10ms 30% duplicate 90%
4. continually curl the nginx http server to get index file on client
5. BUG_ON is seen quickly

[10258690.371129] kernel BUG at net/core/skbuff.c:4028!
[10258690.371748] invalid opcode: 0000 [#1] SMP PTI
[10258690.372094] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G        W         5.0.0-rc6 #2
[10258690.372094] RSP: 0018:ffffa05797b43da0 EFLAGS: 00010202
[10258690.372094] RBP: 00000000000005ea R08: 0000000000000000 R09: 00000000000005ea
[10258690.372094] R10: ffffa0579334d800 R11: 00000000000002c0 R12: 0000000000000002
[10258690.372094] R13: 0000000000000000 R14: ffffa05793122900 R15: ffffa0578f7cb028
[10258690.372094] FS:  0000000000000000(0000) GS:ffffa05797b40000(0000) knlGS:0000000000000000
[10258690.372094] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[10258690.372094] CR2: 00007f1a6dc00868 CR3: 000000001000e000 CR4: 00000000000006e0
[10258690.372094] Call Trace:
[10258690.372094]  <IRQ>
[10258690.372094]  skb_to_sgvec+0x11/0x40
[10258690.372094]  start_xmit+0x38c/0x520 [virtio_net]
[10258690.372094]  dev_hard_start_xmit+0x9b/0x200
[10258690.372094]  sch_direct_xmit+0xff/0x260
[10258690.372094]  __qdisc_run+0x15e/0x4e0
[10258690.372094]  net_tx_action+0x137/0x210
[10258690.372094]  __do_softirq+0xd6/0x2a9
[10258690.372094]  irq_exit+0xde/0xf0
[10258690.372094]  smp_apic_timer_interrupt+0x74/0x140
[10258690.372094]  apic_timer_interrupt+0xf/0x20
[10258690.372094]  </IRQ>

In __skb_to_sgvec(), the skb->len is not equal to the sum of the skb's
linear data size and nonlinear data size, thus BUG_ON triggered.
Because the skb is cloned and a part of nonlinear data is split off.

Duplicate packet is cloned in netem_enqueue() and may be delayed
some time in qdisc. When qdisc len reached the limit and returns
NET_XMIT_DROP, the skb will be retransmit later in write queue.
the skb will be fragmented by tso_fragment(), the limit size
that depends on cwnd and mss decrease, the skb's nonlinear
data will be split off. The length of the skb cloned by netem
will not be updated. When we use virtio_net NIC and invoke skb_to_sgvec(),
the BUG_ON trigger.

To fix it, netem returns NET_XMIT_SUCCESS to upper stack
when it clones a duplicate packet.

Fixes: 35d889d1 ("sch_netem: fix skb leak in netem_enqueue()")
Signed-off-by: Sheng Lan <lansheng@huawei.com>
Reported-by: Qin Ji <jiqin.ji@huawei.com>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-28 10:31:31 -08:00
..
6lowpan 6lowpan: convert to DEFINE_SHOW_ATTRIBUTE 2018-12-19 00:28:05 +01:00
9p 9p/net: put a lower bound on msize 2018-12-25 17:07:49 +09:00
802
8021q net: core: dev: Add extack argument to dev_change_flags() 2018-12-06 13:26:07 -08:00
appletalk
atm Revert "net: simplify sock_poll_wait" 2018-10-23 10:57:06 -07:00
ax25 ax25: fix possible use-after-free 2019-01-23 11:18:00 -08:00
batman-adv batman-adv: fix uninit-value in batadv_interface_tx() 2019-02-12 13:30:43 -05:00
bluetooth Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2018-12-27 13:53:32 -08:00
bpf bpf/test_run: fix unkillable BPF_PROG_TEST_RUN 2019-02-19 00:17:03 +01:00
bpfilter net: bpfilter: change section name of bpfilter UMH blob. 2019-01-16 15:46:46 -08:00
bridge Revert "bridge: do not add port to router list when receives query with source 0.0.0.0" 2019-02-23 18:36:06 -08:00
caif Revert "net: simplify sock_poll_wait" 2018-10-23 10:57:06 -07:00
can can: bcm: check timer values before ktime conversion 2019-01-22 11:33:46 +01:00
ceph libceph: handle an empty authorize reply 2019-02-18 18:05:33 +01:00
core net: Do not allocate page fragments that are not skb aligned 2019-02-17 15:48:43 -08:00
dcb
dccp dccp: fool proof ccid_hc_[rt]x_parse_options() 2019-02-01 14:49:10 -08:00
decnet decnet: fix DN_IFREQ_SIZE 2019-01-27 23:11:55 -08:00
dns_resolver
dsa net: dsa: fix a leaked reference by adding missing of_node_put 2019-02-25 09:34:52 -08:00
ethernet net: ethernet: provide nvmem_get_mac_address() 2018-12-03 15:40:30 -08:00
hsr
ieee802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-12-24 16:19:56 -08:00
ife
ipv4 netlabel: fix out-of-bounds memory accesses 2019-02-27 21:45:24 -08:00
ipv6 ipv6: Return error for RTA_VIA attribute 2019-02-26 13:23:17 -08:00
iucv iucv: Remove SKB list assumptions. 2018-11-10 16:55:11 -08:00
kcm
key af_key: unconditionally clone on broadcast 2019-02-12 10:36:42 +01:00
l2tp l2tp: copy 4 more bytes to linear part if necessary 2019-01-31 08:58:46 -08:00
l3mdev l3mdev: add function to retreive upper master 2018-12-03 14:15:26 -08:00
lapb
llc llc: do not use sk_eat_skb() 2018-10-22 19:59:20 -07:00
mac80211 mac80211: allocate tailroom for forwarded mesh packets 2019-02-22 14:00:40 +01:00
mac802154
mpls mpls: Return error for RTA_GATEWAY attribute 2019-02-26 13:23:17 -08:00
ncsi net/ncsi: Add NCSI Mellanox OEM command 2018-11-27 16:37:20 -08:00
netfilter ipvs: fix warning on unused variable 2019-02-16 10:41:42 +01:00
netlabel netlabel: fix out-of-bounds memory accesses 2019-02-27 21:45:24 -08:00
netlink net: netlink: rename NETLINK_DUMP_STRICT_CHK -> NETLINK_GET_STRICT_CHK 2018-12-14 11:44:31 -08:00
netrom netrom: switch to sock timer API 2019-01-27 10:38:04 -08:00
nfc net: nfc: Fix NULL dereference on nfc_llcp_build_tlv fails 2019-02-27 12:47:08 -08:00
nsh
openvswitch openvswitch: Avoid OOB read when parsing flow nlattrs 2019-01-16 13:35:21 -08:00
packet net/packet: fix 4gb buffer limit due to overflow check 2019-02-12 13:37:23 -05:00
phonet phonet: fix building with clang 2019-02-21 16:23:56 -08:00
psample
qrtr
rds rds: fix refcount bug in rds_sock_addref 2019-01-31 09:43:27 -08:00
rfkill rfkill: gpio: Remove unused include 2018-12-18 13:13:56 +01:00
rose net/rose: fix NULL ax25_cb kernel panic 2019-01-27 10:40:01 -08:00
rxrpc rxrpc: bad unlock balance in rxrpc_recvmsg 2019-02-06 10:54:07 -08:00
sched net: netem: fix skb length BUG_ON in __skb_to_sgvec 2019-02-28 10:31:31 -08:00
sctp sctp: don't compare hb_timer expire date before starting it 2019-02-22 11:11:54 -08:00
smc net/smc: fix smc_poll in SMC_INIT state 2019-02-21 10:19:20 -08:00
strparser
sunrpc Two small fixes, one for crashes using nfs/krb5 with older enctypes, one 2019-02-16 17:38:01 -08:00
switchdev net: switchdev: Add extack to switchdev_handle_port_obj_add() callback 2018-12-12 16:34:22 -08:00
tipc tipc: fix race condition causing hung sendto 2019-02-26 14:50:50 -08:00
tls net: tls: Fix deadlock in free_resources tx 2019-01-28 23:07:08 -08:00
unix missing barriers in some of unix_sock ->addr and ->path accesses 2019-02-20 20:06:28 -08:00
vmw_vsock vsock: cope with memory allocation failure at socket creation time 2019-02-08 22:32:05 -08:00
wimax
wireless cfg80211: prevent speculation on cfg80211_classify8021d() return 2019-02-11 15:50:56 +01:00
x25 net/x25: fix a race in x25_bind() 2019-02-23 18:41:06 -08:00
xdp Revert "xsk: simplify AF_XDP socket teardown" 2019-02-21 16:32:25 +01:00
xfrm xfrm: Fix inbound traffic via XFRM interfaces across network namespaces 2019-02-18 10:58:54 +01:00
compat.c net: socket: add check for negative optlen in compat setsockopt 2019-02-22 11:49:28 -08:00
Kconfig net: convert bridge_nf to use skb extension infrastructure 2018-12-19 11:21:37 -08:00
Makefile
socket.c net: socket: set sock->sk to NULL after calling proto_ops::release() 2019-02-25 10:40:57 -08:00
sysctl_net.c