linux/net/ipv4
Jerry Chu 44f5324b5d TCP: fix a bug that triggers large number of TCP RST by mistake
This patch fixes a bug that causes TCP RST packets to be generated
on otherwise correctly behaved applications, e.g., no unread data
on close,..., etc. To trigger the bug, at least two conditions must
be met:

1. The FIN flag is set on the last data packet, i.e., it's not on a
separate, FIN only packet.
2. The size of the last data chunk on the receive side matches
exactly with the size of buffer posted by the receiver, and the
receiver closes the socket without any further read attempt.

This bug was first noticed on our netperf based testbed for our IW10
proposal to IETF where a large number of RST packets were observed.
netperf's read side code meets the condition 2 above 100%.

Before the fix, tcp_data_queue() will queue the last skb that meets
condition 1 to sk_receive_queue even though it has fully copied out
(skb_copy_datagram_iovec()) the data. Then if condition 2 is also met,
tcp_recvmsg() often returns all the copied out data successfully
without actually consuming the skb, due to a check
"if ((chunk = len - tp->ucopy.len) != 0) {"
and
"len -= chunk;"
after tcp_prequeue_process() that causes "len" to become 0 and an
early exit from the big while loop.

I don't see any reason not to free the skb whose data have been fully
consumed in tcp_data_queue(), regardless of the FIN flag.  We won't
get there if MSG_PEEK is on. Am I missing some arcane cases related
to urgent data?

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-25 13:46:30 -08:00
..
netfilter netfilter: x_tables: dont block BH while reading counters 2011-01-10 20:11:38 +01:00
af_inet.c net: use the macros defined for the members of flowi 2010-11-17 12:27:45 -08:00
ah4.c ah: reload pointers to skb data after calling skb_cow_data() 2011-01-11 14:03:10 -08:00
arp.c net: arp_ioctl() must hold RTNL 2011-01-24 13:16:16 -08:00
cipso_ipv4.c Update broken web addresses in the kernel. 2010-10-18 11:03:14 +02:00
datagram.c net: return operator cleanup 2010-09-23 14:33:39 -07:00
devinet.c ipv4: Don't pre-seed hoplimit metric. 2010-12-12 22:08:17 -08:00
esp4.c xfrm: Traffic Flow Confidentiality for IPv4 ESP 2010-12-10 14:43:59 -08:00
fib_frontend.c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-12-26 22:37:05 -08:00
fib_hash.c fib: Fix fib zone and its hash leak on namespace stop 2010-10-28 10:27:03 -07:00
fib_lookup.h fib: fib_result_assign() should not change fib refcounts 2010-11-04 12:05:32 -07:00
fib_rules.c fib: RCU conversion of fib_lookup() 2010-10-05 20:39:38 -07:00
fib_semantics.c net: use the macros defined for the members of flowi 2010-11-17 12:27:45 -08:00
fib_trie.c net: allow GFP_HIGHMEM in __vmalloc() 2010-11-21 10:04:04 -08:00
gre.c tunnels: add _rcu annotations 2010-10-25 13:09:45 -07:00
icmp.c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-11-19 13:13:47 -08:00
igmp.c igmp: refine skb allocations 2010-11-18 11:02:23 -08:00
inet_connection_sock.c tcp: disallow bind() to reuse addr/port 2011-01-11 14:03:07 -08:00
inet_diag.c Revert "netlink: test for all flags of the NLM_F_DUMP composite" 2011-01-19 13:34:20 -08:00
inet_fragment.c net/ipv4: EXPORT_SYMBOL cleanups 2010-07-12 12:57:54 -07:00
inet_hashtables.c inet: Fix __inet_inherit_port() to correctly increment bsockets and num_owners 2010-11-28 18:18:44 -08:00
inet_lro.c
inet_timewait_sock.c
inetpeer.c inetpeer: Use correct AVL tree base pointer in inet_getpeer(). 2011-01-24 14:38:09 -08:00
ip_forward.c net-next: remove useless union keyword 2010-06-10 23:31:35 -07:00
ip_fragment.c ipv4: IP defragmentation must be ECN aware 2011-01-06 11:21:30 -08:00
ip_gre.c ipv4: Don't pre-seed hoplimit metric. 2010-12-12 22:08:17 -08:00
ip_input.c net: use this_cpu_ptr() 2010-06-28 23:24:29 -07:00
ip_options.c bridge : Sanitize skb before it enters the IP stack 2010-09-19 12:42:34 -07:00
ip_output.c ipv4: Don't pre-seed hoplimit metric. 2010-12-12 22:08:17 -08:00
ip_sockglue.c ipv4: add __rcu annotations to ip_ra_chain 2010-10-25 14:18:28 -07:00
ipcomp.c
ipconfig.c net: add some KERN_CONT markers to continuation lines 2010-11-28 10:47:17 -08:00
ipip.c ipip: add module alias for tunl0 tunnel device 2010-12-01 12:53:23 -08:00
ipmr.c net: use the macros defined for the members of flowi 2010-11-17 12:27:45 -08:00
Kconfig Docs/Kconfig: Update: osdl.org -> linuxfoundation.org 2010-11-15 23:50:13 +01:00
Makefile PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol) 2010-08-21 23:05:39 -07:00
netfilter.c net: use the macros defined for the members of flowi 2010-11-17 12:27:45 -08:00
proc.c tcp: Replace time wait bucket msg by counter 2010-12-08 12:16:33 -08:00
protocol.c net: add __rcu annotations to protocol 2010-10-27 11:37:31 -07:00
raw.c net: use the macros defined for the members of flowi 2010-11-17 12:27:45 -08:00
route.c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2011-01-04 11:57:25 -08:00
syncookies.c net: use the macros defined for the members of flowi 2010-11-17 12:27:45 -08:00
sysctl_net_ipv4.c net: add limits to ip_default_ttl 2010-12-13 12:16:14 -08:00
tcp_bic.c
tcp_cong.c net/ipv4: Eliminate kstrdup memory leak 2010-08-27 19:31:56 -07:00
tcp_cubic.c
tcp_diag.c
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c TCP: tcp_hybla: Fix integer overflow in slow start increment 2010-06-02 07:15:48 -07:00
tcp_illinois.c Update broken web addresses in the kernel. 2010-10-18 11:03:14 +02:00
tcp_input.c TCP: fix a bug that triggers large number of TCP RST by mistake 2011-01-25 13:46:30 -08:00
tcp_ipv4.c tcp: fix bug in listening_get_next() 2011-01-24 14:41:20 -08:00
tcp_lp.c
tcp_minisocks.c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-12-08 13:47:38 -08:00
tcp_output.c Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2011-01-13 10:05:56 -08:00
tcp_probe.c net: ipv4: tcp_probe: cleanup snprintf() use 2010-11-17 12:27:46 -08:00
tcp_scalable.c
tcp_timer.c tcp: use correct counters in CA_CWR state too 2010-10-17 13:46:33 -07:00
tcp_vegas.c
tcp_vegas.h
tcp_veno.c Update broken web addresses in the kernel. 2010-10-18 11:03:14 +02:00
tcp_westwood.c net: return operator cleanup 2010-09-23 14:33:39 -07:00
tcp_yeah.c
tcp.c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-12-08 13:47:38 -08:00
tunnel4.c tunnels: add __rcu annotations 2010-10-27 11:37:32 -07:00
udp_impl.h
udp.c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-12-17 12:27:22 -08:00
udplite.c net: fix nulls list corruptions in sk_prot_alloc 2010-12-16 14:26:56 -08:00
xfrm4_input.c net/ipv4: EXPORT_SYMBOL cleanups 2010-07-12 12:57:54 -07:00
xfrm4_mode_beet.c
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c ipv4: Don't pre-seed hoplimit metric. 2010-12-12 22:08:17 -08:00
xfrm4_output.c
xfrm4_policy.c net: use the macros defined for the members of flowi 2010-11-17 12:27:45 -08:00
xfrm4_state.c xfrm: Allow different selector family in temporary state 2010-09-20 11:11:38 -07:00
xfrm4_tunnel.c net: struct xfrm_tunnel in read_mostly section 2010-08-30 13:50:45 -07:00