linux/net/ipv6
David S. Miller 62fa8a846d net: Implement read-only protection and COW'ing of metrics.
Routing metrics are now copy-on-write.

Initially a route entry points it's metrics at a read-only location.
If a routing table entry exists, it will point there.  Else it will
point at the all zero metric place-holder called 'dst_default_metrics'.

The writeability state of the metrics is stored in the low bits of the
metrics pointer, we have two bits left to spare if we want to store
more states.

For the initial implementation, COW is implemented simply via kmalloc.
However future enhancements will change this to place the writable
metrics somewhere else, in order to increase sharing.  Very likely
this "somewhere else" will be the inetpeer cache.

Note also that this means that metrics updates may transiently fail
if we cannot COW the metrics successfully.

But even by itself, this patch should decrease memory usage and
increase cache locality especially for routing workloads.  In those
cases the read-only metric copies stay in place and never get written
to.

TCP workloads where metrics get updated, and those rare cases where
PMTU triggers occur, will take a very slight performance hit.  But
that hit will be alleviated when the long-term writable metrics
move to a more sharable location.

Since the metrics storage went from a u32 array of RTAX_MAX entries to
what is essentially a pointer, some retooling of the dst_entry layout
was necessary.

Most importantly, we need to preserve the alignment of the reference
count so that it doesn't share cache lines with the read-mostly state,
as per Eric Dumazet's alignment assertion checks.

The only non-trivial bit here is the move of the 'flags' member into
the writeable cacheline.  This is OK since we are always accessing the
flags around the same moment when we made a modification to the
reference count.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-26 20:51:05 -08:00
..
netfilter netfilter: add a missing include in nf_conntrack_reasm.c 2011-01-20 21:00:38 +01:00
addrconf_core.c ipv6: Remove IPV6_ADDR_RESERVED 2010-02-26 03:59:07 -08:00
addrconf.c ipv6: Revert 'administrative down' address handling changes. 2011-01-25 12:49:08 -08:00
addrlabel.c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-09-27 01:03:03 -07:00
af_inet6.c net: change netdev->features to u32 2011-01-24 15:32:47 -08:00
ah6.c ah: reload pointers to skb data after calling skb_cow_data() 2011-01-11 14:03:10 -08:00
anycast.c net-next: remove useless union keyword 2010-06-10 23:31:35 -07:00
datagram.c tproxy: added tproxy sockopt interface in the IPV6 layer 2010-10-21 16:08:28 +02:00
esp6.c xfrm: Traffic Flow Confidentiality for IPv6 ESP 2010-12-10 14:43:59 -08:00
exthdrs_core.c net: return operator cleanup 2010-09-23 14:33:39 -07:00
exthdrs.c ipv6: avoid two atomics in ipv6_rthdr_rcv() 2010-06-14 23:13:06 -07:00
fib6_rules.c fib6: use FIB_LOOKUP_NOREF in fib6_rule_lookup() 2010-10-16 11:13:21 -07:00
icmp.c ipv6: fix ICMP6_MIB_OUTERRORS 2010-06-09 18:39:27 -07:00
inet6_connection_sock.c tcp: disallow bind() to reuse addr/port 2011-01-11 14:03:07 -08:00
inet6_hashtables.c tcp: Fix a connect() race with timewait sockets 2009-12-08 20:17:51 -08:00
ip6_fib.c fib: avoid false sharing on fib_table_hash 2010-10-16 11:13:23 -07:00
ip6_flowlabel.c IPv6: Add dontfrag argument to relevant functions 2010-04-23 23:35:28 -07:00
ip6_input.c Merge branch 'master' of /repos/git/net-next-2.6 2010-04-20 16:02:01 +02:00
ip6_output.c inet6: prevent network storms caused by linux IPv6 routers 2011-01-12 18:51:55 -08:00
ip6_tunnel.c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-12-08 13:47:38 -08:00
ip6mr.c net: use the macros defined for the members of flowi 2010-11-17 12:27:45 -08:00
ipcomp6.c xfrm: SA lookups signature with mark 2010-02-22 16:20:22 -08:00
ipv6_sockglue.c tproxy: Add missing CAP_NET_ADMIN check to ipv6 side 2010-10-24 16:07:50 -07:00
Kconfig ipv6: ip6mr: support multiple tables 2010-05-11 14:40:55 +02:00
Makefile [IPV6] MROUTE: Support multicast forwarding. 2008-04-05 22:33:38 +09:00
mcast.c ipv6: mcast: RCU conversion 2010-11-24 11:16:42 -08:00
mip6.c IPv6: fix CoA check in RH2 input handler (mip6_rthdr_input()) 2010-07-18 15:04:33 -07:00
ndisc.c net: Abstract away all dst_entry metrics accesses. 2010-12-09 10:46:36 -08:00
netfilter.c net: use the macros defined for the members of flowi 2010-11-17 12:27:45 -08:00
proc.c ipv6/udp: report SndbufErrors and RcvbufErrors 2010-10-30 16:17:23 -07:00
protocol.c net: add __rcu annotations to protocol 2010-10-27 11:37:31 -07:00
raw.c ipv6: raw: rcu annotations 2011-01-20 16:59:34 -08:00
reassembly.c ipv6: Prepare the tree for un-inlined jhash. 2010-11-28 11:26:21 -08:00
route.c net: Implement read-only protection and COW'ing of metrics. 2011-01-26 20:51:05 -08:00
sit.c net: ipv6: sit: fix rcu annotations 2011-01-20 16:59:33 -08:00
syncookies.c syncookies: add support for ECN 2010-06-26 22:00:03 -07:00
sysctl_net_ipv6.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
tcp_ipv6.c net: Abstract default ADVMSS behind an accessor. 2010-12-13 12:52:14 -08:00
tunnel6.c tunnels: add _rcu annotations 2010-10-25 13:09:45 -07:00
udp_impl.h net: Make setsockopt() optlen be unsigned. 2009-09-30 16:12:20 -07:00
udp.c net: change netdev->features to u32 2011-01-24 15:32:47 -08:00
udplite.c net: fix nulls list corruptions in sk_prot_alloc 2010-12-16 14:26:56 -08:00
xfrm6_input.c netfilter: ipv6: use NFPROTO values for NF_HOOK invocation 2010-03-25 16:00:49 +01:00
xfrm6_mode_beet.c ipsec: Interfamily IPSec BEET, ipv4-inner ipv6-outer 2008-08-06 02:40:25 -07:00
xfrm6_mode_ro.c
xfrm6_mode_transport.c
xfrm6_mode_tunnel.c ipv6: Use ip6_dst_hoplimit() instead of direct dst_metric() calls. 2010-12-12 21:14:46 -08:00
xfrm6_output.c ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed. 2010-12-19 20:22:23 -08:00
xfrm6_policy.c net: Implement read-only protection and COW'ing of metrics. 2011-01-26 20:51:05 -08:00
xfrm6_state.c xfrm: Allow different selector family in temporary state 2010-09-20 11:11:38 -07:00
xfrm6_tunnel.c xfrm6: make xfrm6_tunnel_free_spi local 2010-10-21 03:09:45 -07:00