linux/net/ipv6
Florian Westphal fe6cc55f3a net: ip, ipv6: handle gso skbs in forwarding path
Marcelo Ricardo Leitner reported problems when the forwarding link path
has a lower mtu than the incoming one if the inbound interface supports GRO.

Given:
Host <mtu1500> R1 <mtu1200> R2

Host sends tcp stream which is routed via R1 and R2.  R1 performs GRO.

In this case, the kernel will fail to send ICMP fragmentation needed
messages (or pkt too big for ipv6), as GSO packets currently bypass dstmtu
checks in forward path. Instead, Linux tries to send out packets exceeding
the mtu.

When locking route MTU on Host (i.e., no ipv4 DF bit set), R1 does
not fragment the packets when forwarding, and again tries to send out
packets exceeding R1-R2 link mtu.

This alters the forwarding dstmtu checks to take the individual gso
segment lengths into account.

For ipv6, we send out pkt too big error for gso if the individual
segments are too big.

For ipv4, we either send icmp fragmentation needed, or, if the DF bit
is not set, perform software segmentation and let the output path
create fragments when the packet is leaving the machine.
It is not 100% correct as the error message will contain the headers of
the GRO skb instead of the original/segmented one, but it seems to
work fine in my (limited) tests.

Eric Dumazet suggested to simply shrink mss via ->gso_size to avoid
sofware segmentation.

However it turns out that skb_segment() assumes skb nr_frags is related
to mss size so we would BUG there.  I don't want to mess with it considering
Herbert and Eric disagree on what the correct behavior should be.

Hannes Frederic Sowa notes that when we would shrink gso_size
skb_segment would then also need to deal with the case where
SKB_MAX_FRAGS would be exceeded.

This uses sofware segmentation in the forward path when we hit ipv4
non-DF packets and the outgoing link mtu is too small.  Its not perfect,
but given the lack of bug reports wrt. GRO fwd being broken this is a
rare case anyway.  Also its not like this could not be improved later
once the dust settles.

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Reported-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13 17:17:02 -05:00
..
netfilter netfilter: nf_tables: add reject module for NFPROTO_INET 2014-02-06 09:44:18 +01:00
addrconf_core.c ipv6: move in6_dev_finish_destroy() into core kernel 2013-08-31 22:30:00 -04:00
addrconf.c ipv6: reallocate addrconf router for ipv6 address when lo device up 2014-01-24 15:59:38 -08:00
addrlabel.c ipv6: fix null pointer dereference in __ip6addrlbl_add 2013-09-04 14:14:53 -04:00
af_inet6.c ipv6: add flowlabel_consistency sysctl 2014-01-19 17:12:31 -08:00
ah6.c ipv4/ipv6: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
anycast.c ipv6: enable anycast addresses as source addresses for datagrams 2014-01-22 21:57:05 -08:00
datagram.c ipv6: enable anycast addresses as source addresses for datagrams 2014-01-22 21:57:05 -08:00
esp6.c ipv4/ipv6: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
exthdrs_core.c ipv6: Correct comparisons and calculations using skb->tail and skb-transport_header 2013-05-28 23:49:07 -07:00
exthdrs_offload.c ipv6: Pull IPv6 GSO registration out of the module 2012-11-15 17:39:24 -05:00
exthdrs.c ipv6/exthdrs: accept tlv which includes only padding 2013-09-11 15:52:27 -04:00
fib6_rules.c ipv6: move IPV6_TCLASS_SHIFT into ipv6.h and define a helper 2014-01-15 15:53:18 -08:00
icmp.c ipv6: icmp6_send: fix Oops when pinging a not set up IPv6 peer on a sit tunnel 2014-02-09 18:12:40 -08:00
inet6_connection_sock.c net: Remove FLOWI_FLAG_CAN_SLEEP 2013-12-06 07:24:39 +01:00
inet6_hashtables.c inet: convert inet_ehash_secret and ipv6_hash_secret to net_get_random_once 2013-10-19 19:45:35 -04:00
ip6_checksum.c ipv6: move csum_ipv6_magic() and udp6_csum_init() into static library 2013-01-08 17:56:10 -08:00
ip6_fib.c ipv6: remove prune parameter for fib6_clean_all 2014-01-02 03:30:35 -05:00
ip6_flowlabel.c ipv6: add flowlabel_consistency sysctl 2014-01-19 17:12:31 -08:00
ip6_gre.c ipv6: move IPV6_TCLASS_SHIFT into ipv6.h and define a helper 2014-01-15 15:53:18 -08:00
ip6_icmp.c ipv6: Kill ipv6 dependency of icmpv6_send(). 2013-04-29 13:54:36 -04:00
ip6_input.c net: Fix memory leak if TPROXY used with TCP early demux 2014-01-27 16:22:11 -08:00
ip6_offload.c net-gre-gro: Add GRE support to the GRO stack 2014-01-07 16:21:31 -05:00
ip6_offload.h ipv6: Pull IPv6 GSO registration out of the module 2012-11-15 17:39:24 -05:00
ip6_output.c net: ip, ipv6: handle gso skbs in forwarding path 2014-02-13 17:17:02 -05:00
ip6_tunnel.c ipv6: move IPV6_TCLASS_SHIFT into ipv6.h and define a helper 2014-01-15 15:53:18 -08:00
ip6_vti.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-01-14 14:42:42 -08:00
ip6mr.c net: avoid reference counter overflows on fib_rules in multicast forwarding 2014-01-14 17:37:25 -08:00
ipcomp6.c ipv4/ipv6: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
ipv6_sockglue.c ipv6: make IPV6_RECVPKTINFO work for ipv4 datagrams 2014-01-19 19:53:18 -08:00
Kconfig ipv6: Remove privacy config option. 2013-10-28 20:07:50 -04:00
Makefile ipv6: Add support for IPsec virtual tunnel interfaces 2013-10-10 12:00:01 +02:00
mcast.c ipv6: send Change Status Report after DAD is completed 2014-01-17 18:12:29 -08:00
mip6.c ipv4/ipv6: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
ndisc.c neigh: use tbl->family to distinguish ipv4 from ipv6 2013-12-09 20:56:12 -05:00
netfilter.c netfilter: add nf_ipv6_ops hook to fix xt_addrtype with IPv6 2013-05-23 11:58:55 +02:00
output_core.c ipv6: move ip6_local_out into core kernel 2013-08-31 22:30:00 -04:00
ping.c ipv6: protect protocols not handling ipv4 from v4 connection/bind attempts 2014-01-21 16:59:19 -08:00
proc.c net: add SNMP counters tracking incoming ECN bits 2013-08-08 22:24:59 -07:00
protocol.c net: remove outdated comment for ipv4 and ipv6 protocol handler 2013-11-28 18:47:51 -05:00
raw.c ipv6: protect protocols not handling ipv4 from v4 connection/bind attempts 2014-01-21 16:59:19 -08:00
reassembly.c ipv6: split inet6_hash_frag for netfilter and initialize secrets with net_get_random_once 2013-10-23 17:01:40 -04:00
route.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-01-06 17:37:45 -05:00
sit.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-01-06 17:37:45 -05:00
syncookies.c net: Remove FLOWI_FLAG_CAN_SLEEP 2013-12-06 07:24:39 +01:00
sysctl_net_ipv6.c ipv6: add flowlabel_consistency sysctl 2014-01-19 17:12:31 -08:00
tcp_ipv6.c tcp: delete redundant calls of tcp_mtup_init() 2014-01-21 16:52:31 -08:00
tcpv6_offload.c net-gro: Prepare GRO stack for the upcoming tunneling support 2013-12-12 13:47:53 -05:00
tunnel6.c ipv4/ipv6: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
udp_impl.h net: ipv4/ipv6: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
udp_offload.c ipv6: fix headroom calculation in udp6_ufo_fragment 2013-11-05 22:09:53 -05:00
udp.c ipv6: make IPV6_RECVPKTINFO work for ipv4 datagrams 2014-01-19 19:53:18 -08:00
udplite.c ipv6: do not clear pinet6 field 2013-05-11 16:26:38 -07:00
xfrm6_input.c
xfrm6_mode_beet.c ipsec: be careful of non existing mac headers 2012-02-23 16:50:45 -05:00
xfrm6_mode_ro.c ipv4/ipv6: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00
xfrm6_mode_transport.c
xfrm6_mode_tunnel.c ipv6: Add a receive path hook for vti6 in xfrm6_mode_tunnel. 2013-10-09 13:16:36 +02:00
xfrm6_output.c xfrm: revert ipv4 mtu determination to dst_mtu 2013-08-26 12:40:53 +02:00
xfrm6_policy.c xfrm: Fix null pointer dereference when decoding sessions 2013-11-01 07:08:46 +01:00
xfrm6_state.c xfrm: make local error reporting more robust 2013-08-14 13:07:12 +02:00
xfrm6_tunnel.c ipv4/ipv6: Fix FSF address in file headers 2013-12-06 12:37:56 -05:00