linux/net
Neal Cardwell d88270eef4 tcp: fix tcp_mark_head_lost to check skb len before fragmenting
This commit fixes a corner case in tcp_mark_head_lost() which was
causing the WARN_ON(len > skb->len) in tcp_fragment() to fire.

tcp_mark_head_lost() was assuming that if a packet has
tcp_skb_pcount(skb) of N, then it's safe to fragment off a prefix of
M*mss bytes, for any M < N. But with the tricky way TCP pcounts are
maintained, this is not always true.

For example, suppose the sender sends 4 1-byte packets and have the
last 3 packet sacked. It will merge the last 3 packets in the write
queue into an skb with pcount = 3 and len = 3 bytes. If another
recovery happens after a sack reneging event, tcp_mark_head_lost()
may attempt to split the skb assuming it has more than 2*MSS bytes.

This sounds very counterintuitive, but as the commit description for
the related commit c0638c247f ("tcp: don't fragment SACKed skbs in
tcp_mark_head_lost()") notes, this is because tcp_shifted_skb()
coalesces adjacent regions of SACKed skbs, and when doing this it
preserves the sum of their packet counts in order to reflect the
real-world dynamics on the wire. The c0638c247f commit tried to
avoid problems by not fragmenting SACKed skbs, since SACKed skbs are
where the non-proportionality between pcount and skb->len/mss is known
to be possible. However, that commit did not handle the case where
during a reneging event one of these weird SACKed skbs becomes an
un-SACKed skb, which tcp_mark_head_lost() can then try to fragment.

The fix is to simply mark the entire skb lost when this happens.
This makes the recovery slightly more aggressive in such corner
cases before we detect reordering. But once we detect reordering
this code path is by-passed because FACK is disabled.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 16:02:48 -08:00
..
6lowpan 6lowpan: fix debugfs interface entry name 2015-12-20 08:21:00 +01:00
9p ... and a couple in net/9p 2016-01-04 10:29:17 -05:00
802
8021q net: Rename NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK 2015-12-15 16:50:08 -05:00
appletalk
atm net: Generalise wq_has_sleeper helper 2015-11-30 14:47:33 -05:00
ax25 net: add validation for the socket syscall protocol argument 2015-12-14 16:09:30 -05:00
batman-adv batman-adv: Drop immediate orig_node free function 2016-01-16 22:50:00 +08:00
bluetooth Bluetooth: avoid rebuilding hci_sock all the time 2016-01-06 16:36:44 +01:00
bridge bridge: fix lockdep addr_list_lock false positive splat 2016-01-15 15:40:45 -05:00
caif net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
can
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2015-11-13 09:24:40 -08:00
core gro: Make GRO aware of lightweight tunnels. 2016-01-20 18:48:38 -08:00
dcb
dccp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-12-03 21:09:12 -05:00
decnet net: add validation for the socket syscall protocol argument 2015-12-14 16:09:30 -05:00
dns_resolver net: dns_resolver: convert time_t to time64_t 2015-11-18 16:27:46 -05:00
dsa dsa: Register netdev before phy 2016-01-07 14:31:26 -05:00
ethernet net: Add eth_platform_get_mac_address() helper. 2016-01-06 16:31:56 -05:00
hsr net/hsr: fix a warning message 2015-11-23 14:56:15 -05:00
ieee802154 inet: kill unused skb_free op 2016-01-05 22:25:57 -05:00
ipv4 tcp: fix tcp_mark_head_lost to check skb len before fragmenting 2016-01-28 16:02:48 -08:00
ipv6 sit: set rtnl_link_ops before calling register_netdevice 2016-01-25 10:51:53 -08:00
ipx
irda net: add validation for the socket syscall protocol argument 2015-12-14 16:09:30 -05:00
iucv af_iucv: Validate socket address length in iucv_sock_bind() 2016-01-19 14:21:08 -05:00
key
l2tp l2tp: rely on ppp layer for skb scrubbing 2016-01-04 16:45:24 -05:00
l3mdev
lapb
llc
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-12-17 22:08:28 -05:00
mac802154 mac802154: constify ieee802154_llsec_ops structure 2016-01-04 20:40:41 +01:00
mpls Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-12-17 22:08:28 -05:00
netfilter netfilter: nf_conntrack: use safer way to lock all buckets 2016-01-20 14:15:31 +01:00
netlabel
netlink genetlink: Fix off-by-one in genl_allocate_reserve_groups() 2016-01-13 10:28:06 -05:00
netrom
nfc NFC 4.5 pull request 2016-01-04 21:48:15 -05:00
openvswitch ovs: limit ovs recursions in ovs_execute_actions to not corrupt stack 2016-01-18 12:09:45 -05:00
packet packet: Allow packets with only a header (but no payload) 2015-11-29 22:17:17 -05:00
phonet phonet: properly unshare skbs in phonet_rcv() 2016-01-12 12:05:38 -05:00
rds RDS: don't pretend to use cpu notifiers 2015-12-22 15:23:05 -05:00
rfkill Bluetooth: hci_bcm: move all Broadcom ACPI IDs to BCM HCI driver 2016-01-04 19:22:05 +01:00
rose
rxrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2016-01-12 18:57:02 -08:00
sched Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-01-11 23:55:43 -05:00
sctp sctp: remove the dead field of sctp_transport 2016-01-28 15:59:32 -08:00
sunrpc Smaller bugfixes and cleanup, including a fix for a failures of 2016-01-15 12:49:44 -08:00
switchdev switchdev: Adding MDB entry offload 2016-01-10 16:50:20 -05:00
tipc ip_tunnel: Move stats update to iptunnel_xmit() 2015-12-25 23:32:23 -05:00
unix af_unix: fix struct pid memory leak 2016-01-24 22:04:49 -08:00
vmw_vsock Revert "Merge branch 'vsock-virtio'" 2015-12-08 21:55:49 -05:00
wimax
wireless Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-12-17 22:08:28 -05:00
x25
xfrm net: preserve IP control block during GSO segmentation 2016-01-15 14:35:24 -05:00
compat.c
Kconfig net, sched: add clsact qdisc 2016-01-10 22:13:15 -05:00
Makefile
socket.c kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
sysctl_net.c