linux/include/net
Ilpo Järvinen 68f8353b48 [TCP]: Rewrite SACK block processing & sack_recv_cache use
Key points of this patch are:

  - In case new SACK information is advance only type, no skb
    processing below previously discovered highest point is done
  - Optimize cases below highest point too since there's no need
    to always go up to highest point (which is very likely still
    present in that SACK), this is not entirely true though
    because I'm dropping the fastpath_skb_hint which could
    previously optimize those cases even better. Whether that's
    significant, I'm not too sure.

Currently it will provide skipping by walking. Combined with
RB-tree, all skipping would become fast too regardless of window
size (can be done incrementally later).

Previously a number of cases in TCP SACK processing fails to
take advantage of costly stored information in sack_recv_cache,
most importantly, expected events such as cumulative ACK and new
hole ACKs. Processing on such ACKs result in rather long walks
building up latencies (which easily gets nasty when window is
huge). Those latencies are often completely unnecessary
compared with the amount of _new_ information received, usually
for cumulative ACK there's no new information at all, yet TCP
walks whole queue unnecessary potentially taking a number of
costly cache misses on the way, etc.!

Since the inclusion of highest_sack, there's a lot information
that is very likely redundant (SACK fastpath hint stuff,
fackets_out, highest_sack), though there's no ultimate guarantee
that they'll remain the same whole the time (in all unearthly
scenarios). Take advantage of this knowledge here and drop
fastpath hint and use direct access to highest SACKed skb as
a replacement.

Effectively "special cased" fastpath is dropped. This change
adds some complexity to introduce better coveraged "fastpath",
though the added complexity should make TCP behave more cache
friendly.

The current ACK's SACK blocks are compared against each cached
block individially and only ranges that are new are then scanned
by the high constant walk. For other parts of write queue, even
when in previously known part of the SACK blocks, a faster skip
function is used (if necessary at all). In addition, whenever
possible, TCP fast-forwards to highest_sack skb that was made
available by an earlier patch. In typical case, no other things
but this fast-forward and mandatory markings after that occur
making the access pattern quite similar to the former fastpath
"special case".

DSACKs are special case that must always be walked.

The local to recv_sack_cache copying could be more intelligent
w.r.t DSACKs which are likely to be there only once but that
is left to a separate patch.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:07 -08:00
..
9p Use helpers to obtain task pid in printks 2007-10-19 11:53:43 -07:00
bluetooth [Bluetooth] Add support for handling simple eSCO links 2007-10-22 02:59:47 -07:00
irda [NET] include/net/: Spelling fixes 2007-12-20 13:56:32 -08:00
iucv [AF_IUCV]: postpone receival of iucv-packets 2007-10-10 16:54:51 -07:00
netfilter [NETFILTER]: Introduce NF_INET_ hook values 2008-01-28 14:53:55 -08:00
sctp [SCTP]: Fix the name of the authentication event. 2008-01-08 23:30:02 -08:00
tc_act [PKT_SCHED]: Add stateless NAT 2007-10-10 16:53:11 -07:00
tipc [TIPC]: Optimize stream send routine to avoid fragmentation 2007-07-10 22:06:12 -07:00
act_api.h [NET_SCHED]: Kill CONFIG_NET_CLS_POLICE 2007-07-15 00:03:05 -07:00
addrconf.h [IPV6] ADDRCONF: Support RFC3484 configurable address selection policy table. 2008-01-28 14:53:58 -08:00
af_rxrpc.h [AF_RXRPC]: Add an interface to the AF_RXRPC module for the AFS filesystem to use 2007-04-26 15:50:17 -07:00
af_unix.h [AF_UNIX]: Make unix_tot_inflight counter non-atomic 2007-11-10 22:06:01 -08:00
ah.h [IPSEC]: Get rid of ipv6_{auth,esp,comp}_hdr 2007-10-10 16:55:55 -07:00
arp.h [IPV6]: Assorted trivial endianness annotations. 2006-12-02 21:22:50 -08:00
atmclip.h [ATM]: Annotations. 2006-12-02 21:22:55 -08:00
ax25.h [NET] include/net/: Spelling fixes 2007-12-20 13:56:32 -08:00
ax88796.h ax88796: add 93cx6 eeprom support 2007-10-10 16:53:56 -07:00
cfg80211.h [NL80211]: add netlink interface to cfg80211 2007-10-10 16:52:14 -07:00
checksum.h [NET]: Make mangling a checksum (0 -> 0xffff on the wire) explicit. 2006-12-02 21:23:39 -08:00
cipso_ipv4.h [NetLabel]: consolidate the struct socket/sock handling to just struct sock 2007-06-08 13:33:09 -07:00
compat.h [NET]: Introduce SIOCGSTAMPNS ioctl to get timestamps with nanosec resolution 2007-04-25 22:24:04 -07:00
datalink.h
dn_dev.h
dn_fib.h [DECNet]: Use rtnl registration interface 2007-04-25 22:27:12 -07:00
dn_neigh.h
dn_nsp.h
dn_route.h [NET]: Wrap netdevice hardware header creation. 2007-10-10 16:52:50 -07:00
dn.h [DECNET]: Another unnecessary net/tcp.h inclusion in net/dn.h 2007-07-10 23:02:12 -07:00
dsfield.h [NET]: IP header modifier helpers annotations. 2006-12-02 21:23:40 -08:00
dst.h [IPSEC]: Merge most of the output path 2008-01-28 14:53:48 -08:00
esp.h cleanup asm/scatterlist.h includes 2007-11-02 08:47:06 +01:00
fib_rules.h [INET]: Small possible memory leak in FIB rules 2007-11-10 22:12:03 -08:00
flow.h [IPV6] MIP6: Kill unnecessary ifdefs. 2007-07-10 22:15:41 -07:00
gen_stats.h
genetlink.h [GENETLINK]: Dynamic multicast groups. 2007-07-18 15:47:52 -07:00
icmp.h [IPV4]: Add ICMPMsgStats MIB (RFC 4293) 2007-10-10 16:51:28 -07:00
ieee80211_crypt.h [PATCH] Update my email address from jkmaline@cc.hut.fi to j@w1.fi 2007-04-28 11:01:01 -04:00
ieee80211_radiotap.h [MAC80211]: Add get_unaligned to ieee80211_get_radiotap_len 2007-10-10 16:47:40 -07:00
ieee80211.h ieee80211: Stop net_ratelimit/IEEE80211_DEBUG_DROP log pollution 2007-11-20 16:43:17 -05:00
ieee80211softmac_wx.h [PATCH] softmac: add SIOCSIWMLME 2006-04-24 16:15:58 -04:00
ieee80211softmac.h [IEEE80211]: Fix softmac lockdep reports. 2007-10-10 16:52:22 -07:00
if_inet6.h IPoIB: improve IPv4/IPv6 to IB mcast mapping functions 2008-01-25 14:15:37 -08:00
inet6_connection_sock.h [TCP]: Restore SKB socket owner setting in tcp_transmit_skb(). 2007-01-26 01:04:55 -08:00
inet6_hashtables.h [INET]: Use jhash + random secret for ehash. 2007-04-25 22:28:06 -07:00
inet_common.h [INET]: Remove leftover prototypes from include/net/inet_common.h 2007-11-12 21:02:51 -08:00
inet_connection_sock.h [TCP]: Restore SKB socket owner setting in tcp_transmit_skb(). 2007-01-26 01:04:55 -08:00
inet_ecn.h [INET]: Give outer DSCP directly to ip*_copy_dscp 2008-01-28 14:53:45 -08:00
inet_frag.h [INET]: Remove no longer needed ->equal callback 2007-10-17 19:47:56 -07:00
inet_hashtables.h [IPV4]: Fix memory leak in inet_hashtables.h when NUMA is on 2007-11-26 20:23:31 +08:00
inet_sock.h [UDP]: Make use of inet_iif() when doing socket lookups. 2007-10-25 18:54:46 -07:00
inet_timewait_sock.h [NET]: Add a network namespace parameter to struct sock 2007-10-10 16:49:05 -07:00
inetpeer.h [INET]: Use list_head-s in inetpeer.c 2007-11-12 21:27:28 -08:00
ip6_checksum.h [IPV6]: Dumb typo in generic csum_ipv6_magic() 2006-12-22 11:12:07 -08:00
ip6_fib.h [IPV6]: Move nfheader_len into rt6_info 2008-01-28 14:53:37 -08:00
ip6_route.h [IPv6]: Use rtnl registration interface 2007-04-25 22:27:13 -07:00
ip6_tunnel.h [NET] include/net/: Spelling fixes 2007-12-20 13:56:32 -08:00
ip_fib.h [IPV4]: Compact some ifdefs in the fib code. 2007-11-07 04:11:41 -08:00
ip_vs.h [IPVS]: Move remaining sysctl handlers over to CTL_UNNUMBERED 2007-11-19 21:51:13 -08:00
ip.h [IPV4]: Add ip_local_out 2008-01-28 14:53:47 -08:00
ipcomp.h [IPSEC]: Get rid of ipv6_{auth,esp,comp}_hdr 2007-10-10 16:55:55 -07:00
ipconfig.h [NET]: ipconfig and nfsroot annotations 2006-12-02 21:21:09 -08:00
ipip.h [IPV4]: Add ip_local_out 2008-01-28 14:53:47 -08:00
ipv6.h [IPV6]: Add ip6_local_out 2008-01-28 14:53:47 -08:00
ipx.h [SK_BUFF]: Introduce skb_transport_header(skb) 2007-04-25 22:25:31 -07:00
iw_handler.h [NL80211]: add netlink interface to cfg80211 2007-10-10 16:52:14 -07:00
lapb.h
llc_c_ac.h
llc_c_ev.h
llc_c_st.h
llc_conn.h [NET]: Make socket creation namespace safe. 2007-10-10 16:49:07 -07:00
llc_if.h [LLC]: add multicast support for datagrams 2006-06-17 21:26:08 -07:00
llc_pdu.h [SK_BUFF]: Introduce skb_network_header() 2007-04-25 22:24:59 -07:00
llc_s_ac.h
llc_s_ev.h
llc_s_st.h
llc_sap.h
llc.h
mac80211.h mac80211: remove unused driver ops 2007-11-10 22:01:15 -08:00
mip6.h [IPV6] MIP6: Loadable module support for MIPv6. 2007-07-10 22:15:42 -07:00
ndisc.h [IPv6]: Export userland ND options through netlink (RDNSS support) 2007-10-10 21:22:05 -07:00
neighbour.h [NEIGH]: Use rtnl registration interface 2007-04-25 22:27:06 -07:00
net_namespace.h [NET]: Move unneeded data to initdata section. 2007-11-13 03:23:50 -08:00
netdma.h Remove all inclusions of <linux/config.h> 2006-10-04 03:38:54 -04:00
netevent.h [NET]: Remove unnecessary inclusion of dst.h 2008-01-28 14:53:38 -08:00
netlabel.h SELinux: restore proper NetLabel caching behavior 2007-08-02 11:52:21 -04:00
netlink.h [NET]: make netlink user -> kernel interface synchronious 2007-10-10 21:15:29 -07:00
netrom.h [PATCH] mark struct file_operations const 1 2007-02-12 09:48:44 -08:00
nexthop.h [IPv4]: FIB configuration using struct fib_config 2006-09-22 14:55:04 -07:00
p8022.h
pkt_cls.h [NET]: Make the device list and device lookups per namespace. 2007-10-10 16:49:10 -07:00
pkt_sched.h [NET]: Move hardware header operations out of netdevice. 2007-10-10 16:52:52 -07:00
protocol.h [IPV6]: Replace sk_buff ** with sk_buff * in input handlers 2007-10-15 12:50:28 -07:00
psnap.h
raw.h Merge git://git.infradead.org/hdrcleanup-2.6 2006-06-20 15:10:08 -07:00
rawv6.h [IPV6] MIP6: Loadable module support for MIPv6. 2007-07-10 22:15:42 -07:00
red.h [NET_SCHED]: turn PSCHED_GET_TIME into inline function 2007-04-25 22:27:55 -07:00
request_sock.h [INET]: Fix potential kfree on vmalloc-ed area of request_sock_queue 2007-11-15 02:57:06 -08:00
rose.h [ROSE]: Fix rose.ko oops on unload 2007-10-07 23:44:17 -07:00
route.h [IPV4]: Remove prototype of ip_rt_advice 2007-12-07 01:07:38 -08:00
rtnetlink.h [NET]: Make the device list and device lookups per namespace. 2007-10-10 16:49:10 -07:00
sch_generic.h [NET]: Move Qdisc_class_ops and Qdisc_ops in appropriate sections. 2008-01-28 14:53:58 -08:00
scm.h pid namespaces: changes to show virtual ids to user 2007-10-19 11:53:40 -07:00
slhc_vj.h
snmp.h [IPV4]: Add ICMPMsgStats MIB (RFC 4293) 2007-10-10 16:51:28 -07:00
sock.h [NET]: Move sock_valbool_flag to socket.c 2008-01-28 14:54:00 -08:00
syncppp.h
tcp_states.h
tcp.h [TCP]: Rewrite SACK block processing & sack_recv_cache use 2008-01-28 14:54:07 -08:00
timewait_sock.h [PATCH] slab: remove kmem_cache_t 2006-12-07 08:39:25 -08:00
transp_v6.h [NET]: Supporting UDP-Lite (RFC 3828) in Linux 2006-12-02 21:22:46 -08:00
udp.h [UDP]: Revert 2-pass hashing changes. 2007-06-07 13:40:50 -07:00
udplite.h [UDP]: Revert 2-pass hashing changes. 2007-06-07 13:40:50 -07:00
wext.h [NET]: Make the device list and device lookups per namespace. 2007-10-10 16:49:10 -07:00
wireless.h [WIRELESS] cfg80211: New wireless config infrastructure. 2007-04-25 22:29:41 -07:00
x25.h [X.25]: Adds /proc/sys/net/x25/x25_forward to control forwarding. 2007-02-08 13:34:36 -08:00
x25device.h [SK_BUFF]: Introduce skb_reset_mac_header(skb) 2007-04-25 22:24:32 -07:00
xfrm.h [IPSEC]: Kill afinfo->nf_post_routing 2008-01-28 14:53:55 -08:00