linux

Author	SHA1	Message	Date
YOSHIFUJI Hideaki	303065a854	[IPV6] ADDRCONF: Allow address selection policy with ifindex. This patch allows ifindex to be a key for address selection policy table. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:57 -08:00
YOSHIFUJI Hideaki	c1ee656ccb	[IPV6] ADDRCONF: Rename ipv6_saddr_label() to ipv6_addr_label(). This patch renames ipv6_saddr_label() to ipv6_addr_label() because address label is used for both of source address and destination address. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:56 -08:00
David S. Miller	294b4baf29	[IPSEC]: Kill afinfo->nf_post_routing After changeset: [NETFILTER]: Introduce NF_INET_ hook values It always evaluates to NF_INET_POST_ROUTING. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:55 -08:00
Patrick McHardy	6e23ae2a48	[NETFILTER]: Introduce NF_INET_ hook values The IPv4 and IPv6 hook values are identical, yet some code tries to figure out the "correct" value by looking at the address family. Introduce NF_INET_* values for both IPv4 and IPv6. The old values are kept in a #ifndef __KERNEL__ section for userspace compatibility. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:55 -08:00
Herbert Xu	1bf06cd2e3	[IPSEC]: Add async resume support on input This patch adds support for async resumptions on input. To do so, the transform would return -EINPROGRESS and subsequently invoke the function xfrm_input_resume to resume processing. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:54 -08:00
Herbert Xu	60d5fcfb19	[IPSEC]: Remove nhoff from xfrm_input The nhoff field isn't actually necessary in xfrm_input. For tunnel mode transforms we now throw away the output IP header so it makes no sense to fill in the nexthdr field. For transport mode we can now let the function transport_finish do the setting and it knows where the nexthdr field is. The only other thing that needs the nexthdr field to be set is the header extraction code. However, we can simply move the protocol extraction out of the generic header extraction. We want to minimise the amount of info we have to carry around between transforms as this simplifies the resumption process for async crypto. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:53 -08:00
Herbert Xu	d26f398400	[IPSEC]: Make x->lastused an unsigned long Currently x->lastused is u64 which means that it cannot be read/written atomically on all architectures. David Miller observed that the value stored in it is only an unsigned long which is always atomic. So based on his suggestion this patch changes the internal representation from u64 to unsigned long while the user-interface still refers to it as u64. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:52 -08:00
Herbert Xu	0ebea8ef35	[IPSEC]: Move state lock into x->type->input This patch releases the lock on the state before calling x->type->input. It also adds the lock to the spots where they're currently needed. Most of those places (all except mip6) are expected to disappear with async crypto. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:52 -08:00
Herbert Xu	668dc8af31	[IPSEC]: Move integrity stat collection into xfrm_input Similar to the moving out of the replay processing on the output, this patch moves the integrity stat collectin from x->type->input into xfrm_input. This would eventually allow transforms such as AH/ESP to be lockless. The error value EBADMSG (currently unused in the crypto layer) is used to indicate a failed integrity check. In future this error can be directly returned by the crypto layer once we switch to aead algorithms. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:51 -08:00
Herbert Xu	716062fd4c	[IPSEC]: Merge most of the input path As part of the work on asynchronous cryptographic operations, we need to be able to resume from the spot where they occur. As such, it helps if we isolate them to one spot. This patch moves most of the remaining family-specific processing into the common input code. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:50 -08:00
Herbert Xu	862b82c6f9	[IPSEC]: Merge most of the output path As part of the work on asynchrnous cryptographic operations, we need to be able to resume from the spot where they occur. As such, it helps if we isolate them to one spot. This patch moves most of the remaining family-specific processing into the common output code. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:48 -08:00
Herbert Xu	ef76bc23ef	[IPV6]: Add ip6_local_out Most callers of the LOCAL_OUT chain will set the IP packet length before doing so. They also share the same output function dst_output. This patch creates a new function called ip6_local_out which does all of that and converts the appropriate users over to it. Apart from removing duplicate code, it will also help in merging the IPsec output path. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:47 -08:00
Herbert Xu	227620e295	[IPSEC]: Separate inner/outer mode processing on input With inter-family transforms the inner mode differs from the outer mode. Attempting to handle both sides from the same function means that it needs to handle both IPv4 and IPv6 which creates duplication and confusion. This patch separates the two parts on the input path so that each function deals with one family only. In particular, the functions xfrm4_extract_inut/xfrm6_extract_inut moves the pertinent fields from the IPv4/IPv6 IP headers into a neutral format stored in skb->cb. This is then used by the inner mode input functions to modify the inner IP header. In this way the input function no longer has to know about the outer address family. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:46 -08:00
Herbert Xu	36cf9acf93	[IPSEC]: Separate inner/outer mode processing on output With inter-family transforms the inner mode differs from the outer mode. Attempting to handle both sides from the same function means that it needs to handle both IPv4 and IPv6 which creates duplication and confusion. This patch separates the two parts on the output path so that each function deals with one family only. In particular, the functions xfrm4_extract_output/xfrm6_extract_output moves the pertinent fields from the IPv4/IPv6 IP headers into a neutral format stored in skb->cb. This is then used by the outer mode output functions to write the outer IP header. In this way the output function no longer has to know about the inner address family. Since the extract functions are only called by tunnel modes (the only modes that can support inter-family transforms), I've also moved the xfrm*_tunnel_check_size calls into them. This allows the correct ICMP message to be sent as opposed to now where you might call icmp_send with an IPv6 packet and vice versa. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:45 -08:00
Herbert Xu	29bb43b4ec	[INET]: Give outer DSCP directly to ip*_copy_dscp This patch changes the prototype of ipv4_copy_dscp and ipv6_copy_dscp so that they directly take the outer DSCP rather than the outer IP header. This will help us to unify the code for inter-family tunnels. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:45 -08:00
Herbert Xu	a2deb6d26f	[IPSEC]: Move x->outer_mode->output out of locked section RO mode is the only one that requires a locked output function. So it's easier to move the lock into that function rather than requiring everyone else to run under the lock. In particular, this allows us to move the size check into the output function without causing a potential dead-lock should the ICMP error somehow hit the same SA on transmission. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:44 -08:00
Herbert Xu	e40b328615	[IPSEC]: Forbid BEET + ipcomp for now While BEET can theoretically work with IPComp the current code can't do that because it tries to construct a BEET mode tunnel type which doesn't (and cannot) exist. In fact as it is it won't even attach a tunnel object at all for BEET which is bogus. To support this fully we'd also need to change the policy checks on input to recognise a plain tunnel as a legal variant of an optional BEET transform. This patch simply fails such constructions for now. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:43 -08:00
Herbert Xu	25ee3286dc	[IPSEC]: Merge common code into xfrm_bundle_create Half of the code in xfrm4_bundle_create and xfrm6_bundle_create are common. This patch extracts that logic and puts it into xfrm_bundle_create. The rest of it are then accessed through afinfo. As a result this fixes the problem with inter-family transforms where we treat every xfrm dst in the bundle as if it belongs to the top family. This patch also fixes a long-standing error-path bug where we may free the xfrm states twice. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:43 -08:00
Herbert Xu	66cdb3ca27	[IPSEC]: Move flow construction into xfrm_dst_lookup This patch moves the flow construction from the callers of xfrm_dst_lookup into that function. It also changes xfrm_dst_lookup so that it takes an xfrm state as its argument instead of explicit addresses. This removes any address-specific logic from the callers of xfrm_dst_lookup which is needed to correctly support inter-family transforms. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:42 -08:00
Herbert Xu	f04e7e8d7f	[IPSEC]: Replace x->type->{local,remote}_addr with flags The functions local_addr and remote_addr are more than what they're needed for. The same thing can be done easily with flags on the type object. This patch does that and simplifies the wrapper functions in xfrm6_policy accordingly. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:41 -08:00
Herbert Xu	fff6938880	[IPSEC]: Make sure idev is consistent with dev in xfrm_dst Previously we took the device from the bottom route and idev from the top route. This is bad because idev may well point to a different device. This patch changes it so that we get the idev from the device directly. It also makes it an error if either dev or idev is NULL. This is consistent with the rest of the routing code which also treats these cases as errors. I've removed the err initialisation in xfrm6_policy.c because it achieves no purpose and hid a bug when an initial version of this patch neglected to set err to -ENODEV (fortunately the IPv4 version warned about it). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:40 -08:00
Herbert Xu	45ff5a3f9a	[IPSEC]: Set dst->input to dst_discard The input function should never be invoked on IPsec dst objects. This is because we don't apply IPsec on input until after we've made the routing decision. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:40 -08:00
Herbert Xu	8ce68ceb55	[IPSEC]: Only set neighbour on top xfrm dst The neighbour field is only used by dst_confirm which only ever happens on the top-most xfrm dst. So it's a waste to duplicate for every other xfrm dst. This patch moves its setting out of the loop so that only the top one gets set. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:39 -08:00
Herbert Xu	352e512c32	[NET]: Eliminate duplicate copies of dst_discard We have a number of copies of dst_discard scattered around the place which all do the same thing, namely free a packet on the input or output paths. This patch deletes all of them except dst_discard and points all the users to it. The only non-trivial bit is decnet where it returns an error. However, conceptually this is identical to the blackhole functions used in IPv4 and IPv6 which do not return errors. So they should either all return errors or all return zero. For now I've stuck with the majority and picked zero as the return value. It doesn't really matter in practice since few if any driver would react differently depending on a zero return value or NET_RX_DROP. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:37 -08:00
Herbert Xu	b4ce92775c	[IPV6]: Move nfheader_len into rt6_info The dst member nfheader_len is only used by IPv6. It's also currently creating a rather ugly alignment hole in struct dst. Therefore this patch moves it from there into struct rt6_info. It also reorders the fields in rt6_info to minimize holes. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:37 -08:00
Herbert Xu	0148894223	[IPV6]: Only set nfheader_len for top xfrm dst We only need to set nfheader_len in the top xfrm dst. This is because we only ever read the nfheader_len from the top xfrm dst. It is also easier to count nfheader_len as part of header_len which then lets us remove the ugly wrapper functions for incrementing and decrementing header lengths in xfrm6_policy.c. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:35 -08:00
Pavel Emelyanov	b24b8a247f	[NET]: Convert init_timer into setup_timer Many-many code in the kernel initialized the timer->function and timer->data together with calling init_timer(timer). There is already a helper for this. Use it for networking code. The patch is HUGE, but makes the code 130 lines shorter (98 insertions(+), 228 deletions(-)). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:35 -08:00
Wang Chen	a92aa318b4	[IPV6]: Add raw6 drops counter. Add raw drops counter for IPv6 in /proc/net/raw6 . Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:34 -08:00
Jens Axboe	a0974dd3da	[TCP] splice: add tcp_splice_read() to IPV6 Thanks to YOSHIFUJI Hideaki for the hint! Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:32 -08:00
Rolf Manderscheid	a9e527e3f9	IPoIB: improve IPv4/IPv6 to IB mcast mapping functions An IPoIB subnet on an IB fabric that spans multiple IB subnets can't use link-local scope in multicast GIDs. The existing routines that map IP/IPv6 multicast addresses into IB link-level addresses hard-code the scope to link-local, and they also leave the partition key field uninitialised. This patch adds a parameter (the link-level broadcast address) to the mapping routines, allowing them to initialise both the scope and the P_Key appropriately, and fixes up the call sites. The next step will be to add a way to configure the scope for an IPoIB interface. Signed-off-by: Rolf Manderscheid <rvm@obsidianresearch.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-01-25 14:15:37 -08:00
Herbert Xu	f945fa7ad9	[INET]: Fix truesize setting in ip_append_data As it is ip_append_data only counts page fragments to the skb that allocated it. As such it means that the first skb gets hit with a 4K charge even though it might have only used a fraction of it while all subsequent skb's that use the same page gets away with no charge at all. This bug was exposed by the UDP accounting patch. [ The wmem_alloc bumping needs to be moved with the truesize, noticed by Takahiro Yasui. -DaveM ] Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-23 03:11:43 -08:00
Wang Chen	fa95c28322	[IPV6]: RFC 2011 compatibility broken The snmp6 entry name was changed, and it broke compatibility to RFC 2011. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-21 03:05:43 -08:00
Wang Chen	c964ff4ffb	[IPV6]: ICMP6_MIB_OUTMSGS increment duplicated icmpv6_send() calls ip6_push_pending_frames() indirectly. Both ip6_push_pending_frames() and icmpv6_send() increment counter ICMP6_MIB_OUTMSGS. This patch remove the increment from icmpv6_send. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-21 03:05:20 -08:00
YOSHIFUJI Hideaki	398bcbebb6	[IPV6] ROUTE: Make sending algorithm more friendly with RFC 4861. We omit (or delay) sending NSes for known-to-unreachable routers (in NUD_FAILED state) according to RFC 4191 (Default Router Preferences and More-Specific Routes). But this is not fully compatible with RFC 4861 (Neighbor Discovery Protocol for IPv6), which does not remember unreachability of neighbors. So, let's avoid mixing sending algorithm of RFC 4191 and that of RFC 4861, and make the algorithm more friendly with RFC 4861 if RFC 4191 is disabled. Issue was found by IPv6 Ready Logo Core Self_Test 1.5.0b2 (by TAHI Project), and has been tracked down by Mitsuru Chinen <mitch@linux.vnet.ibm.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-20 20:31:40 -08:00
Pavel Emelyanov	b3652b2dc5	[IPV6]: Mischecked tw match in __inet6_check_established. When looking for a conflicting connection the !sk->sk_bound_dev_if check is performed only for live sockets, but not for timewait-ed. This is not the case for ipv4, for __inet6_lookup_established in both ipv4 and ipv6 and for other places that check for tw-s. Was this missed accidentally? If so, then this patch fixes it and besides makes use if the dif variable declared in the function. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-20 20:31:36 -08:00
Yasuyuki Kozakai	8f41f01786	[NETFILTER]: ip6t_eui64: Fixes calculation of Universal/Local bit RFC2464 says that the next to lowerst order bit of the first octet of the Interface Identifier is formed by complementing the Universal/Local bit of the EUI-64. But ip6t_eui64 uses OR not XOR. Thanks Peter Ivancik for reporing this bug and posting a patch for it. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-10 22:40:39 -08:00
Brian Haley	1ac4f00885	[IPV6]: IPV6_MULTICAST_IF setting is ignored on link-local connect() Signed-off-by: Brian Haley <brian.haley@hp.com> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-08 23:52:21 -08:00
Joe Perches	bea8519547	[IPV6]: Spelling fixes Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-20 14:01:35 -08:00
Wei Yongjun	cf6fc4a924	[IPV6]: Fix the return value of ipv6_getsockopt If CONFIG_NETFILTER if not selected when compile the kernel source code, ipv6_getsockopt will returen an EINVAL error if optname is not supported by the kernel. But if CONFIG_NETFILTER is selected, ENOPROTOOPT error will be return. This patch fix to always return ENOPROTOOPT error if optname argument of ipv6_getsockopt is not supported by the kernel. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-16 13:39:57 -08:00
Thomas Graf	95a02cfd4d	[IPv6] ESP: Discard dummy packets introduced in rfc4303 RFC4303 introduces dummy packets with a nexthdr value of 59 to implement traffic confidentiality. Such packets need to be dropped silently and the payload may not be attempted to be parsed as it consists of random chunk. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-11 02:45:27 -08:00
YOSHIFUJI Hideaki	1df2e44560	[IPV6] XFRM: Fix auditing rt6i_flags; use RTF_xxx flags instead of RTCF_xxx. RTCF_xxx flags, defined in include/linux/in_route.h) are available for IPv4 route (rtable) entries only. Use RTF_xxx flags instead, defined in include/linux/ipv6_route.h, for IPv6 route entries (rt6_info). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-11 02:45:24 -08:00
Mitsuru Chinen	ca46f9c834	[IPv6] SNMP: Increment OutNoRoutes when connecting to unreachable network IPv6 stack doesn't increment OutNoRoutes counter when IP datagrams is being discarded because no route could be found to transmit them to their destination. IPv6 stack should increment the counter. Incidentally, IPv4 stack increments that counter in such situation. Signed-off-by: Mitsuru Chinen <mitch@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-07 01:06:30 -08:00
Evgeniy Polyakov	d31c7b8fa3	[IPV6]: Restore IPv6 when MTU is big enough Avaid provided test application, so bug got fixed. IPv6 addrconf removes ipv6 inner device from netdev each time cmu changes and new value is less than IPV6_MIN_MTU (1280 bytes). When mtu is changed and new value is greater than IPV6_MIN_MTU, it does not add ipv6 addresses and inner device bac. This patch fixes that. Tested with Avaid's application, which works ok now. Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2007-11-30 23:36:08 +11:00
YOSHIFUJI Hideaki	77adefdc98	[IPV6] TCPMD5: Fix deleting key operation. Due to the bug, refcnt for md5sig pool was leaked when an user try to delete a key if we have more than one key. In addition to the leakage, we returned incorrect return result value for userspace. This fix should close Bug #9418, reported by <ming-baini@163.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-20 17:31:23 -08:00
YOSHIFUJI Hideaki	aacbe8c880	[IPV6] TCPMD5: Check return value of tcp_alloc_md5sig_pool(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-20 17:30:56 -08:00
Joe Perches	3b6d821c4f	[IPV6]: Add missing "space" Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-19 23:47:25 -08:00
Pierre Ynard	dbb2ed2485	[IPV6]: Add ifindex field to ND user option messages. Userland neighbor discovery options are typically heavily involved with the interface on which thay are received: add a missing ifindex field to the original struct. Thanks to R�mi Denis-Courmont. Signed-off-by: Pierre Ynard <linkfanel@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-12 17:58:35 -08:00
Denis V. Lunev	2994c63863	[INET]: Small possible memory leak in FIB rules This patch fixes a small memory leak. Default fib rules can be deleted by the user if the rule does not carry FIB_RULE_PERMANENT flag, f.e. by ip rule flush Such a rule will not be freed as the ref-counter has 2 on start and becomes clearly unreachable after removal. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-10 22:12:03 -08:00
Pavel Emelyanov	03f49f3457	[NET]: Make helper to get dst entry and "use" it There are many places that get the dst entry, increase the __use counter and set the "lastuse" time stamp. Make a helper for this. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-10 21:28:34 -08:00
Eric Dumazet	230140cffa	[INET]: Remove per bucket rwlock in tcp/dccp ehash table. As done two years ago on IP route cache table (commit `22c047ccbc`) , we can avoid using one lock per hash bucket for the huge TCP/DCCP hash tables. On a typical x86_64 platform, this saves about 2MB or 4MB of ram, for litle performance differences. (we hit a different cache line for the rwlock, but then the bucket cache line have a better sharing factor among cpus, since we dirty it less often). For netstat or ss commands that want a full scan of hash table, we perform fewer memory accesses. Using a 'small' table of hashed rwlocks should be more than enough to provide correct SMP concurrency between different buckets, without using too much memory. Sizing of this table depends on num_possible_cpus() and various CONFIG settings. This patch provides some locking abstraction that may ease a future work using a different model for TCP/DCCP table. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:15:11 -08:00
Herbert Xu	4999f3621f	[IPSEC]: Fix crypto_alloc_comp error checking The function crypto_alloc_comp returns an errno instead of NULL to indicate error. So it needs to be tested with IS_ERR. This is based on a patch by Vicen� Beltran Querol. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:15:03 -08:00
Alexey Dobriyan	33120b30cc	[IPV6]: Convert /proc/net/ipv6_route to seq_file interface This removes last proc_net_create() user. Kudos to Benjamin Thery and Stephen Hemminger for comments on previous version. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:09:18 -08:00
Eric Dumazet	c5a432f1a1	[IPV6]: Use the {DEFINE\|REF}_PROTO_INUSE infrastructure Trivial patch to make "tcpv6,udpv6,udplitev6,rawv6" protocols uses the fast "inuse sockets" infrastructure Each protocol use then a static percpu var, instead of a dynamic one. This saves some ram and some cpu cycles Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:08:59 -08:00
Eric Dumazet	286ab3d460	[NET]: Define infrastructure to keep 'inuse' changes in an efficent SMP/NUMA way. "struct proto" currently uses an array stats[NR_CPUS] to track change on 'inuse' sockets per protocol. If NR_CPUS is big, this means we use a big memory area for this. Moreover, all this memory area is located on a single node on NUMA machines, increasing memory pressure on the boot node. In this patch, I tried to : - Keep a fast !CONFIG_SMP implementation - Keep a fast CONFIG_SMP implementation for often used protocols (tcp,udp,raw,...) - Introduce a NUMA efficient implementation Some helper macros are defined in include/net/sock.h These macros take into account CONFIG_SMP If a "struct proto" is declared without using DEFINE_PROTO_INUSE / REF_PROTO_INUSE macros, it will automatically use a default implementation, using a dynamically allocated percpu zone. This default implementation will be NUMA efficient, but might use 32/64 bytes per possible cpu because of current alloc_percpu() implementation. However it still should be better than previous implementation based on stats[NR_CPUS] field. When a "struct proto" is changed to use the new macros, we use a single static "int" percpu variable, lowering the memory and cpu costs, still preserving NUMA efficiency. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:08:57 -08:00
Mitsuru Chinen	7a0ff716c2	[IPv6] SNMP: Restore Udp6InErrors incrementation As the checksum verification is postponed till user calls recv or poll, the inrementation of Udp6InErrors counter should be also postponed. Currently, it is postponed in non-blocking operation case. However it should be postponed in all case like the IPv4 code. Signed-off-by: Mitsuru Chinen <mitch@linux.vnet.ibm.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:08:54 -08:00
Pavel Emelyanov	bf138862b1	[IPV6]: Consolidate the ip cork destruction in ip6_output.c The ip6_push_pending_frames and ip6_flush_pending_frames do the same things to flush the sock's cork. Move this into a separate function and save ~100 bytes from the .text Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:08:26 -08:00
Jan Engelhardt	0795c65d9f	[NETFILTER]: Clean up Makefile Sort matches and targets in the NF makefiles. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:08:22 -08:00
Alexey Dobriyan	7351a22a3a	[NETFILTER]: ip{,6}_queue: convert to seq_file interface I plan to kill ->get_info which means killing proc_net_create(). Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:08:20 -08:00
Jens Axboe	c46f2334c8	[SG] Get rid of __sg_mark_end() sg_mark_end() overwrites the page_link information, but all users want __sg_mark_end() behaviour where we just set the end bit. That is the most natural way to use the sg list, since you'll fill it in and then mark the end point. So change sg_mark_end() to only set the termination bit. Add a sg_magic debug check as well, and clear a chain pointer if it is set. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-11-02 08:47:06 +01:00
Adrian Bunk	87ae9afdca	cleanup asm/scatterlist.h includes Not architecture specific code should not #include <asm/scatterlist.h>. This patch therefore either replaces them with #include <linux/scatterlist.h> or simply removes them if they were unused. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-11-02 08:47:06 +01:00
Pavel Emelyanov	6257ff2177	[NET]: Forget the zero_it argument of sk_alloc() Finally, the zero_it argument can be completely removed from the callers and from the function prototype. Besides, fix the checkpatch.pl warnings about using the assignments inside if-s. This patch is rather big, and it is a part of the previous one. I splitted it wishing to make the patches more readable. Hope this particular split helped. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-01 00:39:31 -07:00
David S. Miller	51c739d1f4	[NET]: Fix incorrect sg_mark_end() calls. This fixes scatterlist corruptions added by commit `68e3f5dd4d` [CRYPTO] users: Fix up scatterlist conversion errors The issue is that the code calls sg_mark_end() which clobbers the sg_page() pointer of the final scatterlist entry. The first part fo the fix makes skb_to_sgvec() do __sg_mark_end(). After considering all skb_to_sgvec() call sites the most correct solution is to call __sg_mark_end() in skb_to_sgvec() since that is what all of the callers would end up doing anyways. I suspect this might have fixed some problems in virtio_net which is the sole non-crypto user of skb_to_sgvec(). Other similar sg_mark_end() cases were converted over to __sg_mark_end() as well. Arguably sg_mark_end() is a poorly named function because it doesn't just "mark", it clears out the page pointer as a side effect, which is what led to these bugs in the first place. The one remaining plain sg_mark_end() call is in scsi_alloc_sgtable() and arguably it could be converted to __sg_mark_end() if only so that we can delete this confusing interface from linux/scatterlist.h Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-30 21:29:29 -07:00
Daniel Lezcano	1675c7b254	[IPV6]: remove duplicate call to proc_net_remove The file /proc/net/if_inet6 is removed twice. First time in: inet6_exit ->addrconf_cleanup And followed a few lines after by: inet6_exit -> if6_proc_exit Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-30 21:16:24 -07:00
Matthias M. Dellweg	b0a713e9e6	[TCP] MD5: Remove some more unnecessary casting. while reviewing the tcp_md5-related code further i came across with another two of these casts which you probably have missed. I don't actually think that they impose a problem by now, but as you said we should remove them. Signed-off-by: Matthias M. Dellweg <2500@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-29 22:37:27 -07:00
YOSHIFUJI Hideaki	ad02ac145d	[IPV6] NDISC: Fix setting base_reachable_time_ms variable. This bug was introduced by the commit `d12af679bc` (sysctl: fix neighbour table sysctls). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-29 22:37:22 -07:00
Herbert Xu	68e3f5dd4d	[CRYPTO] users: Fix up scatterlist conversion errors This patch fixes the errors made in the users of the crypto layer during the sg_init_table conversion. It also adds a few conversions that were missing altogether. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-27 00:52:07 -07:00
Adrian Bunk	72998d8c84	[INET] ESP: Must #include <linux/scatterlist.h> This patch fixes the following compile errors in some configurations: <-- snip --> ... CC net/ipv4/esp4.o /home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv4/esp4.c: In function 'esp_output': /home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv4/esp4.c:113: error: implicit declaration of function 'sg_init_table' make[3]: * [net/ipv4/esp4.o] Error 1 ... /home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv6/esp6.c: In function 'esp6_output': /home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv6/esp6.c:112: error: implicit declaration of function 'sg_init_table' make[3]: * [net/ipv6/esp6.o] Error 1 <-- snip --> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 22:53:58 -07:00
Jeff Garzik	18134bed02	[TCP] IPV6: fix softnet build breakage net/ipv6/tcp_ipv6.c: In function 'tcp_v6_rcv': net/ipv6/tcp_ipv6.c:1736: error: implicit declaration of function 'get_softnet_dma' net/ipv6/tcp_ipv6.c:1736: warning: assignment makes pointer from integer without a cast Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 22:53:14 -07:00
David S. Miller	b4caea8aa8	[TCP]: Add missing I/O AT code to ipv6 side. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 04:20:13 -07:00
David S. Miller	c7da57a183	[TCP]: Fix scatterlist handling in MD5 signature support. Use sg_init_table() and sg_mark_end() as needed. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 00:41:21 -07:00
David S. Miller	ed0e7e0ca3	[IPSEC]: Add missing sg_init_table() calls to ESP. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 00:38:39 -07:00
Chuck Lever	c2636b4d9e	[NET]: Treat the sign of the result of skb_headroom() consistently In some places, the result of skb_headroom() is compared to an unsigned integer, and in others, the result is compared to a signed integer. Make the comparisons consistent and correct. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-23 21:27:55 -07:00
Masahide NAKAMURA	ea2c47b42f	[IPSEC] IPV6: Fix to add tunnel mode SA correctly. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-22 02:59:58 -07:00
Anton Arapov	a25de534f8	[INET]: Justification for local port range robustness. There is a justifying patch for Stephen's patches. Stephen's patches disallows using a port range of one single port and brakes the meaning of the 'remaining' variable, in some places it has different meaning. My patch gives back the sense of 'remaining' variable. It should mean how many ports are remaining and nothing else. Also my patch allows using a single port. I sure we must be able to use mentioned port range, this does not restricted by documentation and does not brake current behavior. usefull links: Patches posted by Stephen Hemminger http://marc.info/?l=linux-netdev&m=119206106218187&w=2 http://marc.info/?l=linux-netdev&m=119206109918235&w=2 Andrew Morton's comment http://marc.info/?l=linux-kernel&m=119248225007737&w=2 1. Allows using a port range of one single port. 2. Gives back sense of 'remaining' variable. Signed-off-by: Anton Arapov <aarapov@redhat.com> Acked-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 22:00:17 -07:00
Linus Torvalds	a57793651f	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (51 commits) [IPV6]: Fix again the fl6_sock_lookup() fixed locking [NETFILTER]: nf_conntrack_tcp: fix connection reopening fix [IPV6]: Fix race in ipv6_flowlabel_opt() when inserting two labels [IPV6]: Lost locking in fl6_sock_lookup [IPV6]: Lost locking when inserting a flowlabel in ipv6_fl_list [NETFILTER]: xt_sctp: fix mistake to pass a pointer where array is required [NET]: Fix OOPS due to missing check in dev_parse_header(). [TCP]: Remove lost_retrans zero seqno special cases [NET]: fix carrier-on bug? [NET]: Fix uninitialised variable in ip_frag_reasm() [IPSEC]: Rename mode to outer_mode and add inner_mode [IPSEC]: Disallow combinations of RO and AH/ESP/IPCOMP [IPSEC]: Use the top IPv4 route's peer instead of the bottom [IPSEC]: Store afinfo pointer in xfrm_mode [IPSEC]: Add missing BEET checks [IPSEC]: Move type and mode map into xfrm_state.c [IPSEC]: Fix length check in xfrm_parse_spi [IPSEC]: Move ip_summed zapping out of xfrm6_rcv_spi [IPSEC]: Get nexthdr from caller in xfrm6_rcv_spi [IPSEC]: Move tunnel parsing for IPv4 out of xfrm4_input ...	2007-10-18 14:40:30 -07:00
Eric W. Biederman	064b5bba0c	sysctl: remove broken netfilter binary sysctls No one has bothered to set strategy routine for the the netfilter sysctls that return jiffies to be sysctl_jiffies. So it appears the sys_sysctl path is unused and untested, so this patch removes the binary sysctl numbers. Which fixes the netfilter oops in 2.6.23-rc2-mm2 for me. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Patrick McHardy <kaber@trash.net> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:23 -07:00
Eric W. Biederman	428b367bff	sysctl: ipv6 route flushing (kill binary path) We don't preoperly support the sysctl binary path for flushing the ipv6 routes. So remove support for a binary path. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:22 -07:00
Eric W. Biederman	d12af679bc	sysctl: fix neighbour table sysctls. - In ipv6 ndisc_ifinfo_syctl_change so it doesn't depend on binary sysctl names for a function that works with proc. - In neighbour.c reorder the table to put the possibly unused entries at the end so we can remove them by terminating the table early. - In neighbour.c kill the entries with questionable binary sysctl handling behavior. - In neighbour.c if we don't have a strategy routine remove the binary path. So we don't the default sysctl strategy routine on data that is not ready for it. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:22 -07:00
Pavel Emelyanov	52f095ee88	[IPV6]: Fix again the fl6_sock_lookup() fixed locking YOSHIFUJI fairly pointed out, that the users increment should be done under the ip6_sk_fl_lock not to give IPV6_FL_A_PUT a chance to put this count to zero and release the flowlabel. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 05:38:48 -07:00
Pavel Emelyanov	78c2e50253	[IPV6]: Fix race in ipv6_flowlabel_opt() when inserting two labels In the IPV6_FL_A_GET case the hash is checked for flowlabels with the given label. If it is not found, the lock, protecting the hash, is dropped to be re-get for writing. After this a newly allocated entry is inserted, but no checks are performed to catch a classical SMP race, when the conflicting label may be inserted on another cpu. Use the (currently unused) return value from fl_intern() to return the conflicting entry (if found) and re-check, whether we can reuse it (IPV6_FL_F_EXCL) or return -EEXISTS. Also add the comment, about why not re-lookup the current sock for conflicting flowlabel entry. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 05:18:56 -07:00
Pavel Emelyanov	bd0bf57700	[IPV6]: Lost locking in fl6_sock_lookup This routine scans the ipv6_fl_list whose update is protected with the socket lock and the ip6_sk_fl_lock. Since the socket lock is not taken in the lookup, use the other one. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 05:15:57 -07:00
Pavel Emelyanov	04028045a1	[IPV6]: Lost locking when inserting a flowlabel in ipv6_fl_list The new flowlabels should be inserted into the sock list under the ip6_sk_fl_lock. This was lost in one place. This list is naturally protected with the socket lock, but the fl6_sock_lookup() is called without it, so another protection is required. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 05:14:58 -07:00
Herbert Xu	13996378e6	[IPSEC]: Rename mode to outer_mode and add inner_mode This patch adds a new field to xfrm states called inner_mode. The existing mode object is renamed to outer_mode. This is the first part of an attempt to fix inter-family transforms. As it is we always use the outer family when determining which mode to use. As a result we may end up shoving IPv4 packets into netfilter6 and vice versa. What we really want is to use the inner family for the first part of outbound processing and the outer family for the second part. For inbound processing we'd use the opposite pairing. I've also added a check to prevent silly combinations such as transport mode with inter-family transforms. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:35:51 -07:00
Herbert Xu	ca68145f16	[IPSEC]: Disallow combinations of RO and AH/ESP/IPCOMP Combining RO and AH/ESP/IPCOMP does not make sense. So this patch adds a check in the state initialisation function to prevent this. This allows us to safely remove the mode input function of RO since it can never be called anymore. Indeed, if somehow it does get called we'll know about it through an OOPS instead of it slipping past silently. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:35:15 -07:00
Herbert Xu	17c2a42a24	[IPSEC]: Store afinfo pointer in xfrm_mode It is convenient to have a pointer from xfrm_state to address-specific functions such as the output function for a family. Currently the address-specific policy code calls out to the xfrm state code to get those pointers when we could get it in an easier way via the state itself. This patch adds an xfrm_state_afinfo to xfrm_mode (since they're address-specific) and changes the policy code to use it. I've also added an owner field to do reference counting on the module providing the afinfo even though it isn't strictly necessary today since IPv6 can't be unloaded yet. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:33:12 -07:00
Herbert Xu	1bfcb10f67	[IPSEC]: Add missing BEET checks Currently BEET mode does not reinject the packet back into the stack like tunnel mode does. Since BEET should behave just like tunnel mode this is incorrect. This patch fixes this by introducing a flags field to xfrm_mode that tells the IPsec code whether it should terminate and reinject the packet back into the stack. It then sets the flag for BEET and tunnel mode. I've also added a number of missing BEET checks elsewhere where we check whether a given mode is a tunnel or not. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:31:50 -07:00
Herbert Xu	7aa68cb906	[IPSEC]: Move ip_summed zapping out of xfrm6_rcv_spi Not every transform needs to zap ip_summed. For example, a pure tunnel mode encapsulation does not affect the hardware checksum at all. In fact, every algorithm (that needs this) other than AH6 already does its own ip_summed zapping. This patch moves the zapping into AH6 which is in line with what IPv4 does. Possible future optimisation: Checksum the data as we copy them in IPComp. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:30:07 -07:00
Herbert Xu	33b5ecb8f6	[IPSEC]: Get nexthdr from caller in xfrm6_rcv_spi Currently xfrm6_rcv_spi gets the nexthdr value itself from the packet. This means that we need to fix up the value in case we have a 4-on-6 tunnel. Moving this logic into the caller simplifies things and allows us to merge the code with IPv4. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:29:25 -07:00
Herbert Xu	04663d0b8b	[IPSEC]: Fix pure tunnel modes involving IPv6 I noticed that my recent patch broke 6-on-4 pure IPsec tunnels (the ones that are only used for incompressible IPsec packets). Subsequent reviews show that I broke 6-on-6 pure tunnels more than three years ago and nobody ever noticed. I suppose every must be testing 6-on-6 IPComp with large pings which are very compressible :) This patch fixes both cases. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:28:06 -07:00
Pavel Emelyanov	aaf70ec7fd	[IPV6]: Cleanup snmp6_alloc_dev() This functions is never called with NULL or not setup argument, so the checks inside are redundant. Also, the return value is always -ENOMEM, so no need in additional variable for this. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:25:32 -07:00
Pavel Emelyanov	16910b9829	[IPV6]: Fix return type for snmp6_free_dev() This call is essentially void. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:23:43 -07:00
Pavel Emelyanov	c95477090a	[INET]: Consolidate frag queues freeing Since we now allocate the queues in inet_fragment.c, we can safely free it in the same place. The ->destructor callback thus becomes optional for inet_frags. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:48:26 -07:00
Pavel Emelyanov	48d6005638	[INET]: Remove no longer needed ->equal callback Since this callback is used to check for conflicts in hashtable when inserting a newly created frag queue, we can do the same by checking for matching the queue with the argument, used to create one. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:47:56 -07:00
Pavel Emelyanov	abd6523d15	[INET]: Consolidate xxx_find() in fragment management Here we need another callback ->match to check whether the entry found in hash matches the key passed. The key used is the same as the creation argument for inet_frag_create. Yet again, this ->match is the same for netfilter and ipv6. Running a frew steps forward - this callback will later replace the ->equal one. Since the inet_frag_find() uses the already consolidated inet_frag_create() remove the xxx_frag_create from protocol codes. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:47:21 -07:00
Pavel Emelyanov	c6fda28229	[INET]: Consolidate xxx_frag_create() This one uses the xxx_frag_intern() and xxx_frag_alloc() routines, which are already consolidated, so remove them from protocol code (as promised). The ->constructor callback is used to init the rest of the frag queue and it is the same for netfilter and ipv6. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:46:47 -07:00
Pavel Emelyanov	e521db9d79	[INET]: Consolidate xxx_frag_alloc() Just perform the kzalloc() allocation and setup common fields in the inet_frag_queue(). Then return the result to the caller to initialize the rest. The inet_frag_alloc() may return NULL, so check the return value before doing the container_of(). This looks ugly, but the xxx_frag_alloc() will be removed soon. The xxx_expire() timer callbacks are patches, because the argument is now the inet_frag_queue, not the protocol specific queue. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:45:23 -07:00
Pavel Emelyanov	2588fe1d78	[INET]: Consolidate xxx_frag_intern This routine checks for the existence of a given entry in the hash table and inserts the new one if needed. The ->equal callback is used to compare two frag_queue-s together, but this one is temporary and will be removed later. The netfilter code and the ipv6 one use the same routine to compare frags. The inet_frag_intern() always returns non-NULL pointer, so convert the inet_frag_queue into protocol specific one (with the container_of) without any checks. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:44:34 -07:00
Pavel Emelyanov	fd9e63544c	[INET]: Omit double hash calculations in xxx_frag_intern Since the hash value is already calculated in xxx_find, we can simply use it later. This is already done in netfilter code, so make the same in ipv4 and ipv6. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:43:37 -07:00
Pavel Emelyanov	dc8a82ad28	[IPV6]: Fix memory leak in cleanup_ipv6_mibs() The icmpv6msg mib statistics is not freed. This is almost not critical for current kernel, since ipv6 module is unloadable, but this can happen on load error and will happen every time we stop the network namespace (when we have one, of course). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 19:30:40 -07:00
Pavel Emelyanov	4acad72ded	[IPV6]: Consolidate the ip6_pol_route_(input\|output) pair The difference in both functions is in the "id" passed to the rt6_select, so just pass it as an extra argument from two outer helpers. This is minus 60 lines of code and 360 bytes of .text Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 13:02:51 -07:00
Denis V. Lunev	f1673ca52c	[INET]: kmalloc+memset -> kzalloc in frag_alloc_queue kmalloc + memset -> kzalloc in frag_alloc_queue Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:53:13 -07:00
Herbert Xu	e5bbef20e0	[IPV6]: Replace sk_buff ** with sk_buff * in input handlers With all the users of the double pointers removed from the IPv6 input path, this patch converts all occurances of sk_buff ** to sk_buff * in IPv6 input handlers. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:50:28 -07:00
Pavel Emelyanov	762cc40801	[INET]: Consolidate the xxx_put These ones use the generic data types too, so move them in one place. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:43 -07:00
Pavel Emelyanov	4b6cb5d8e3	[INET]: Small cleanup for xxx_put after evictor consolidation After the evictor code is consolidated there is no need in passing the extra pointer to the xxx_put() functions. The only place when it made sense was the evictor code itself. Maybe this change must got with the previous (or with the next) patch, but I try to make them shorter as much as possible to simplify the review (but they are still large anyway), so this change goes in a separate patch. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:43 -07:00
Pavel Emelyanov	8e7999c44e	[INET]: Consolidate the xxx_evictor The evictors collect some statistics for ipv4 and ipv6, so make it return the number of evicted queues and account them all at once in the caller. The XXX_ADD_STATS_BH() macros are just for this case, but maybe there are places in code, that can make use of them as well. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:42 -07:00
Pavel Emelyanov	1e4b82873a	[INET]: Consolidate the xxx_frag_destroy To make in possible we need to know the exact frag queue size for inet_frags->mem management and two callbacks: * to destoy the skb (optional, used in conntracks only) * to free the queue itself (mandatory, but later I plan to move the allocation and the destruction of frag_queues into the common place, so this callback will most likely be optional too). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:42 -07:00
Pavel Emelyanov	321a3a99e4	[INET]: Consolidate xxx_the secret_rebuild This code works with the generic data types as well, so move this into inet_fragment.c This move makes it possible to hide the secret_timer management and the secret_rebuild routine completely in the inet_fragment.c Introduce the ->hashfn() callback in inet_frags() to get the hashfun for a given inet_frag_queue() object. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:41 -07:00
Pavel Emelyanov	277e650ddf	[INET]: Consolidate the xxx_frag_kill Since now all the xxx_frag_kill functions now work with the generic inet_frag_queue data type, this can be moved into a common place. The xxx_unlink() code is moved as well. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:41 -07:00
Pavel Emelyanov	04128f233f	[INET]: Collect common frag sysctl variables together Some sysctl variables are used to tune the frag queues management and it will be useful to work with them in a common way in the future, so move them into one structure, moreover they are the same for all the frag management codes. I don't place them in the existing inet_frags object, introduced in the previous patch for two reasons: 1. to keep them in the __read_mostly section; 2. not to export the whole inet_frags objects outside. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:40 -07:00
Pavel Emelyanov	7eb95156d9	[INET]: Collect frag queues management objects together There are some objects that are common in all the places which are used to keep track of frag queues, they are: * hash table * LRU list * rw lock * rnd number for hash function * the number of queues * the amount of memory occupied by queues * secret timer Move all this stuff into one structure (struct inet_frags) to make it possible use them uniformly in the future. Like with the previous patch this mostly consists of hunks like - write_lock(&ipfrag_lock); + write_lock(&ip4_frags.lock); To address the issue with exporting the number of queues and the amount of memory occupied by queues outside the .c file they are declared in, I introduce a couple of helpers. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:39 -07:00
Pavel Emelyanov	5ab11c98d3	[INET]: Move common fields from frag_queues in one place. Introduce the struct inet_frag_queue in include/net/inet_frag.h file and place there all the common fields from three structs: * struct ipq in ipv4/ip_fragment.c * struct nf_ct_frag6_queue in nf_conntrack_reasm.c * struct frag_queue in ipv6/reassembly.c After this, replace these fields on appropriate structures with this structure instance and fix the users to use correct names i.e. hunks like - atomic_dec(&fq->refcnt); + atomic_dec(&fq->q.refcnt); (these occupy most of the patch) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:38 -07:00
Patrick McHardy	ad643a793b	[IPV6]: Uninline netfilter okfns Uninline netfilter okfns for those cases where gcc can generate tail-calls. Before: text data bss dec hex filename 8994153 1016524 524652 10535329 a0c1a1 vmlinux After: text data bss dec hex filename 8992761 1016524 524652 10533937 a0bc31 vmlinux ------------------------------------------------------- -1392 All cases have been verified to generate tail-calls with and without netfilter. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:36 -07:00
Adrian Bunk	1dff92e09e	[IPV6] __inet6_csk_dst_store(): fix check-after-use The Coverity checker spotted that we have already oops'ed if "dst" was NULL. Since "dst" being NULL doesn't seem to be possible at this point this patch removes the NULL check. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Acked-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:32 -07:00
Herbert Xu	65c8846660	[IPV6]: Avoid skb_copy/pskb_copy/skb_realloc_headroom on input This patch replaces unnecessary uses of skb_copy by pskb_expand_head on the IPv6 input path. This allows us to remove the double pointers later. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:31 -07:00
Herbert Xu	f61944efdf	[IPV6]: Make ipv6_frag_rcv return the same packet This patch implements the same change taht was done to ip_defrag. It makes ipv6_frag_rcv return the last packet received of a train of fragments rather than the head of that sequence. This allows us to get rid of the sk_buff ** argument later. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:30 -07:00
Herbert Xu	3db05fea51	[NETFILTER]: Replace sk_buff ** with sk_buff * With all the users of the double pointers removed, this patch mops up by finally replacing all occurances of sk_buff ** in the netfilter API by sk_buff *. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:29 -07:00
Herbert Xu	2ca7b0ac02	[NETFILTER]: Avoid skb_copy/pskb_copy/skb_realloc_headroom This patch replaces unnecessary uses of skb_copy, pskb_copy and skb_realloc_headroom by functions such as skb_make_writable and pskb_expand_head. This allows us to remove the double pointers later. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:28 -07:00
Herbert Xu	37d4187922	[NETFILTER]: Do not copy skb in skb_make_writable Now that all callers of netfilter can guarantee that the skb is not shared, we no longer have to copy the skb in skb_make_writable. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-15 12:26:27 -07:00
Brian Haley	4953f0fcc0	[IPv6]: Update setsockopt(IPV6_MULTICAST_IF) to support RFC 3493, try2 From RFC 3493, Section 5.2: IPV6_MULTICAST_IF Set the interface to use for outgoing multicast packets. The argument is the index of the interface to use. If the interface index is specified as zero, the system selects the interface (for example, by looking up the address in a routing table and using the resulting interface). This patch adds support for (index == 0) to reset the value to it's original state, allowing the system to choose the best interface. IPv4 already behaves this way. Signed-off-by: Brian Haley <brian.haley@hp.com> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-11 14:39:29 -07:00
Pierre Ynard	31910575a9	[IPv6]: Export userland ND options through netlink (RDNSS support) As discussed before, this patch provides userland with a way to access relevant options in Router Advertisements, after they are processed and validated by the kernel. Extra options are processed in a generic way; this patch only exports RDNSS options described in RFC5006, but support to control which options are exported could be easily added. A new rtnetlink message type is defined, to transport Neighbor Discovery options, along with optional context information. At the moment only the address of the router sending an RDNSS option is included, but additional attributes may be later defined, if needed by new use cases. Signed-off-by: Pierre Ynard <linkfanel@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 21:22:05 -07:00
Denis V. Lunev	cd40b7d398	[NET]: make netlink user -> kernel interface synchronious This patch make processing netlink user -> kernel messages synchronious. This change was inspired by the talk with Alexey Kuznetsov about current netlink messages processing. He says that he was badly wrong when introduced asynchronious user -> kernel communication. The call netlink_unicast is the only path to send message to the kernel netlink socket. But, unfortunately, it is also used to send data to the user. Before this change the user message has been attached to the socket queue and sk->sk_data_ready was called. The process has been blocked until all pending messages were processed. The bad thing is that this processing may occur in the arbitrary process context. This patch changes nlk->data_ready callback to get 1 skb and force packet processing right in the netlink_unicast. Kernel -> user path in netlink_unicast remains untouched. EINTR processing for in netlink_run_queue was changed. It forces rtnl_lock drop, but the process remains in the cycle until the message will be fully processed. So, there is no need to use this kludges now. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 21:15:29 -07:00
Stephen Hemminger	227b60f510	[INET]: local port range robustness Expansion of original idea from Denis V. Lunev <den@openvz.org> Add robustness and locking to the local_port_range sysctl. 1. Enforce that low < high when setting. 2. Use seqlock to ensure atomic update. The locking might seem like overkill, but there are cases where sysadmin might want to change value in the middle of a DoS attack. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 17:30:46 -07:00
Herbert Xu	ceb1eec829	[IPSEC]: Move IP length/checksum setting out of transforms This patch moves the setting of the IP length and checksum fields out of the transforms and into the xfrmX_output functions. This would help future efforts in merging the transforms themselves. It also adds an optimisation to ipcomp due to the fact that the transport offset is guaranteed to be zero. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:56 -07:00
Herbert Xu	87bdc48d30	[IPSEC]: Get rid of ipv6_{auth,esp,comp}_hdr This patch removes the duplicate ipv6_{auth,esp,comp}_hdr structures since they're identical to the IPv4 versions. Duplicating them would only create problems for ourselves later when we need to add things like extended sequence numbers. I've also added transport header type conversion headers for these types which are now used by the transforms. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:55 -07:00
Herbert Xu	37fedd3aab	[IPSEC]: Use IPv6 calling convention as the convention for x->mode->output The IPv6 calling convention for x->mode->output is more general and could help an eventual protocol-generic x->type->output implementation. This patch adopts it for IPv4 as well and modifies the IPv4 type output functions accordingly. It also rewrites the IPv6 mac/transport header calculation to be based off the network header where practical. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:54 -07:00
Herbert Xu	7b277b1a5f	[IPSEC]: Set skb->data to payload in x->mode->output This patch changes the calling convention so that on entry from x->mode->output and before entry into x->type->output skb->data will point to the payload instead of the IP header. This is essentially a redistribution of skb_push/skb_pull calls with the aim of minimising them on the common path of tunnel + ESP. It'll also let us use the same calling convention between IPv4 and IPv6 with the next patch. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:54 -07:00
Herbert Xu	bee0b40c06	[IPSEC] beet: Fix extension header support on output The beet output function completely kills any extension headers by replacing them with the IPv6 header. This is because it essentially ignores the result of ip6_find_1stfragopt by simply acting as if there aren't any extension headers. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:53 -07:00
Mitsuru Chinen	f24e3d658c	[IPV6]: Defer IPv6 device initialization until a valid qdisc is specified To judge the timing for DAD, netif_carrier_ok() is used. However, there is a possibility that dev->qdisc stays noop_qdisc even if netif_carrier_ok() returns true. In that case, DAD NS is not sent out. We need to defer the IPv6 device initialization until a valid qdisc is specified. Signed-off-by: Mitsuru Chinen <mitch@linux.vnet.ibm.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:52 -07:00
Pavel Emelyanov	cf7732e4cc	[NET]: Make core networking code use seq_open_private This concerns the ipv4 and ipv6 code mostly, but also the netlink and unix sockets. The netlink code is an example of how to use the __seq_open_private() call - it saves the net namespace on this private. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:33 -07:00
Herbert Xu	b7c6538cd8	[IPSEC]: Move state lock into x->type->output This patch releases the lock on the state before calling x->type->output. It also adds the lock to the spots where they're currently needed. Most of those places (all except mip6) are expected to disappear with async crypto. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:03 -07:00
Herbert Xu	007f0211a8	[IPSEC]: Store IPv6 nh pointer in mac_header on output Current the x->mode->output functions store the IPv6 nh pointer in the skb network header. This is inconvenient because the network header then has to be fixed up before the packet can leave the IPsec stack. The mac header field is unused on output so we can use that to store this instead. This patch does that and removes the network header fix-up in xfrm_output. It also uses ipv6_hdr where appropriate in the x->type->output functions. There is also a minor clean-up in esp4 to make it use the same code as esp6 to help any subsequent effort to merge the two. Lastly it kills two redundant skb_set_* statements in BEET that were simply copied over from transport mode. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:00 -07:00
Benjamin Thery	0a8891a0a4	[IPv6]: use container_of() macro in fib6_clean_node() In ip6_fib.c, fib6_clean_node() casts a fib6_walker_t pointer to a fib6_cleaner_t pointer assuming a struct fib6_walker_t (field 'w') is the first field in struct fib6_walker_t. To prevent any future problems that may occur if one day a field is inadvertently inserted before the 'w' field in struct fib6_cleaner_t, (and to improve readability), this patch uses the container_of() macro. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:58 -07:00
Herbert Xu	45b17f48ea	[IPSEC]: Move RO-specific output code into xfrm6_mode_ro.c The lastused update check in xfrm_output can be done just as well in the mode output function which is specific to RO. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:56 -07:00
Herbert Xu	436a0a4022	[IPSEC]: Move output replay code into xfrm_output The replay counter is one of only two remaining things in the output code that requires a lock on the xfrm state (the other being the crypto). This patch moves it into the generic xfrm_output so we can remove the lock from the transforms themselves. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:54 -07:00
Herbert Xu	406ef77c89	[IPSEC]: Move common output code to xfrm_output Most of the code in xfrm4_output_one and xfrm6_output_one are identical so this patch moves them into a common xfrm_output function which will live in net/xfrm. In fact this would seem to fix a bug as on IPv4 we never reset the network header after a transform which may upset netfilter later on. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:53 -07:00
Herbert Xu	bc31d3b2c7	[IPSEC] ah: Remove keys from ah_data structure The keys are only used during initialisation so we don't need to carry them in esp_data. Since we don't have to allocate them again, there is no need to place a limit on the authentication key length anymore. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:53 -07:00
Herbert Xu	4b7137ff8f	[IPSEC] esp: Remove keys from esp_data structure The keys are only used during initialisation so we don't need to carry them in esp_data. Since we don't have to allocate them again, there is no need to place a limit on the authentication key length anymore. This patch also kills the unused auth.icv member. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:52 -07:00
Stephen Hemminger	cfcabdcc2d	[NET]: sparse warning fixes Fix a bunch of sparse warnings. Mostly about 0 used as NULL pointer, and shadowed variable declarations. One notable case was that hash size should have been unsigned. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:54:48 -07:00
Patrick McHardy	f73e924cdd	[NETFILTER]: ctnetlink: use netlink policy Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:35 -07:00
Patrick McHardy	fdf708322d	[NETFILTER]: nfnetlink: rename functions containing 'nfattr' There is no struct nfattr anymore, rename functions to 'nlattr'. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:32 -07:00
Patrick McHardy	df6fb868d6	[NETFILTER]: nfnetlink: convert to generic netlink attribute functions Get rid of the duplicated rtnetlink macros and use the generic netlink attribute functions. The old duplicated stuff is moved to a new header file that exists just for userspace. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:31 -07:00
Stephen Hemminger	3b04ddde02	[NET]: Move hardware header operations out of netdevice. Since hardware header operations are part of the protocol class not the device instance, make them into a separate object and save memory. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:52 -07:00
Stephen Hemminger	b95cce3576	[NET]: Wrap hard_header_parse Wrap the hard_header_parse function to simplify next step of header_ops conversion. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:51 -07:00
Stephen Hemminger	0c4e85813d	[NET]: Wrap netdevice hardware header creation. Add inline for common usage of hardware header creation, and fix bug in IPV6 mcast where the assumption about negative return is an errno. Negative return from hard_header means not enough space was available,(ie -N bytes). Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:50 -07:00
Eric W. Biederman	2774c7aba6	[NET]: Make the loopback device per network namespace. This patch makes loopback_dev per network namespace. Adding code to create a different loopback device for each network namespace and adding the code to free a loopback device when a network namespace exits. This patch modifies all users the loopback_dev so they access it as init_net.loopback_dev, keeping all of the code compiling and working. A later pass will be needed to update the users to use something other than the initial network namespace. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:49 -07:00
Daniel Lezcano	de3cb747ff	[NET]: Dynamically allocate the loopback device, part 1. This patch replaces all occurences to the static variable loopback_dev to a pointer loopback_dev. That provides the mindless, trivial, uninteressting change part for the dynamic allocation for the loopback. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Acked-By: Kirill Korotaev <dev@sw.ru> Acked-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:14 -07:00
David L Stevens	14878f75ab	[IPV6]: Add ICMPMsgStats MIB (RFC 4293) [rev 2] Background: RFC 4293 deprecates existing individual, named ICMP type counters to be replaced with the ICMPMsgStatsTable. This table includes entries for both IPv4 and IPv6, and requires counting of all ICMP types, whether or not the machine implements the type. These patches "remove" (but not really) the existing counters, and replace them with the ICMPMsgStats tables for v4 and v6. It includes the named counters in the /proc places they were, but gets the values for them from the new tables. It also counts packets generated from raw socket output (e.g., OutEchoes, MLD queries, RA's from radvd, etc). Changes: 1) create icmpmsg_statistics mib 2) create icmpv6msg_statistics mib 3) modify existing counters to use these 4) modify /proc/net/snmp to add "IcmpMsg" with all ICMP types listed by number for easy SNMP parsing 5) modify /proc/net/snmp printing for "Icmp" to get the named data from new counters. [new to 2nd revision] 6) support per-interface ICMP stats 7) use common macro for per-device stat macros Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:27 -07:00
Denis V. Lunev	76c72d4f44	[IPV4/IPV6/DECNET]: Small cleanup for fib rules. This patch slightly cleanups FIB rules framework. rules_list as a pointer on struct fib_rules_ops is useless. It is always assigned with a static per/subsystem list in IPv4, IPv6 and DecNet. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:22 -07:00
Milan Kocian	0b69d4bd26	[IPV6]: Remove redundant RTM_DELLINK message. Remove useless message. We get the right message from another subsystem. Signed-off-by: Milan Kocian <milon@wq.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:18 -07:00
Ralf Baechle	10d024c1b2	[NET]: Nuke SET_MODULE_OWNER macro. It's been a useless no-op for long enough in 2.6 so I figured it's time to remove it. The number of people that could object because they're maintaining unified 2.4 and 2.6 drivers is probably rather small. [ Handled drivers added by netdev tree and some missed IRDA cases... -DaveM ] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:13 -07:00
Thomas Graf	8f4c1f9b04	[NETLINK]: Introduce nested and byteorder flag to netlink attribute This change allows the generic attribute interface to be used within the netfilter subsystem where this flag was initially introduced. The byte-order flag is yet unused, it's intended use is to allow automatic byte order convertions for all atomic types. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:16 -07:00
Eric W. Biederman	881d966b48	[NET]: Make the device list and device lookups per namespace. This patch makes most of the generic device layer network namespace safe. This patch makes dev_base_head a network namespace variable, and then it picks up a few associated variables. The functions: dev_getbyhwaddr dev_getfirsthwbytype dev_get_by_flags dev_get_by_name __dev_get_by_name dev_get_by_index __dev_get_by_index dev_ioctl dev_ethtool dev_load wireless_process_ioctl were modified to take a network namespace argument, and deal with it. vlan_ioctl_set and brioctl_set were modified so their hooks will receive a network namespace argument. So basically anthing in the core of the network stack that was affected to by the change of dev_base was modified to handle multiple network namespaces. The rest of the network stack was simply modified to explicitly use &init_net the initial network namespace. This can be fixed when those components of the network stack are modified to handle multiple network namespaces. For now the ifindex generator is left global. Fundametally ifindex numbers are per namespace, or else we will have corner case problems with migration when we get that far. At the same time there are assumptions in the network stack that the ifindex of a network device won't change. Making the ifindex number global seems a good compromise until the network stack can cope with ifindex changes when you change namespaces, and the like. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:10 -07:00
Eric W. Biederman	b4b510290b	[NET]: Support multiple network namespaces with netlink Each netlink socket will live in exactly one network namespace, this includes the controlling kernel sockets. This patch updates all of the existing netlink protocols to only support the initial network namespace. Request by clients in other namespaces will get -ECONREFUSED. As they would if the kernel did not have the support for that netlink protocol compiled in. As each netlink protocol is updated to be multiple network namespace safe it can register multiple kernel sockets to acquire a presence in the rest of the network namespaces. The implementation in af_netlink is a simple filter implementation at hash table insertion and hash table look up time. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:09 -07:00
Eric W. Biederman	e9dc865340	[NET]: Make device event notification network namespace safe Every user of the network device notifiers is either a protocol stack or a pseudo device. If a protocol stack that does not have support for multiple network namespaces receives an event for a device that is not in the initial network namespace it quite possibly can get confused and do the wrong thing. To avoid problems until all of the protocol stacks are converted this patch modifies all netdev event handlers to ignore events on devices that are not in the initial network namespace. As the rest of the code is made network namespace aware these checks can be removed. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:09 -07:00
Eric W. Biederman	e730c15519	[NET]: Make packet reception network namespace safe This patch modifies every packet receive function registered with dev_add_pack() to drop packets if they are not from the initial network namespace. This should ensure that the various network stacks do not receive packets in a anything but the initial network namespace until the code has been converted and is ready for them. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:08 -07:00
Eric W. Biederman	1b8d7ae42d	[NET]: Make socket creation namespace safe. This patch passes in the namespace a new socket should be created in and has the socket code do the appropriate reference counting. By virtue of this all socket create methods are touched. In addition the socket create methods are modified so that they will fail if you attempt to create a socket in a non-default network namespace. Failing if we attempt to create a socket outside of the default network namespace ensures that as we incrementally make the network stack network namespace aware we will not export functionality that someone has not audited and made certain is network namespace safe. Allowing us to partially enable network namespaces before all of the exotic protocols are supported. Any protocol layers I have missed will fail to compile because I now pass an extra parameter into the socket creation code. [ Integrated AF_IUCV build fixes from Andrew Morton... -DaveM ] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:07 -07:00
Eric W. Biederman	457c4cbc5a	[NET]: Make /proc/net per network namespace This patch makes /proc/net per network namespace. It modifies the global variables proc_net and proc_net_stat to be per network namespace. The proc_net file helpers are modified to take a network namespace argument, and all of their callers are fixed to pass &init_net for that argument. This ensures that all of the /proc/net files are only visible and usable in the initial network namespace until the code behind them has been updated to be handle multiple network namespaces. Making /proc/net per namespace is necessary as at least some files in /proc/net depend upon the set of network devices which is per network namespace, and even more files in /proc/net have contents that are relevant to a single network namespace. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:06 -07:00
Micah Gruber	1dfcae7765	[IPV6]: Remove unneeded pointer iph from ipcomp6_input() in net/ipv6/ipcomp6.c This trivial patch removes the unneeded pointer iph, which is never used. Signed-off-by: Micah Gruber <micah.gruber@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:58 -07:00
Masahide NAKAMURA	1e5dc14617	[IPV6] IPSEC: Omit redirect for tunnelled packet. IPv6 IPsec tunnel gateway incorrectly sends redirect to router or sender when network device the IPsec tunnelled packet is arrived is the same as the one the decapsulated packet is sent. With this patch, it omits to send the redirect when the forwarding skbuff carries secpath, since such skbuff should be assumed as a decapsulated packet from IPsec tunnel by own. It may be a rare case for an IPsec security gateway, however it is not rare when the gateway is MIPv6 Home Agent since the another tunnel end-point is Mobile Node and it changes the attached network. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:33 -07:00
Noriaki TAKAMIYA	a47ed4cd8c	[IPV6] XFRM: Fix connected socket to use transformation. When XFRM policy and state are ready after TCP connection is started, the traffic should be transformed immediately, however it does not on IPv6 TCP. It depends on a dst cache replacement policy with connected socket. It seems that the replacement is always done for IPv4, however, on IPv6 case it is done only when routing cookie is changed. This patch fix that non-transformation dst can be changed to transformation one. This behavior is required by MIPv6 and improves IPv6 IPsec. Fixes by Masahide NAKAMURA. Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:32 -07:00
Brian Haley	e773e4faa1	[IPV6]: Add v4mapped address inline Add v4mapped address inline to avoid calls to ipv6_addr_type(). Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:32 -07:00
Brian Haley	bf0b48dfc3	[IPv6]: Fix ICMPv6 redirect handling with target multicast address When the ICMPv6 Target address is multicast, Linux processes the redirect instead of dropping it. The problem is in this code in ndisc_redirect_rcv(): if (ipv6_addr_equal(dest, target)) { on_link = 1; } else if (!(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) { ND_PRINTK2(KERN_WARNING "ICMPv6 Redirect: target address is not link-local.\n"); return; } This second check will succeed if the Target address is, for example, FF02::1 because it has link-local scope. Instead, it should be checking if it's a unicast link-local address, as stated in RFC 2461/4861 Section 8.1: - The ICMP Target Address is either a link-local address (when redirected to a router) or the same as the ICMP Destination Address (when redirected to the on-link destination). I know this doesn't explicitly say unicast link-local address, but it's implied. This bug is preventing Linux kernels from achieving IPv6 Logo Phase II certification because of a recent error that was found in the TAHI test suite - Neighbor Disovery suite test 206 (v6LC.2.3.6_G) had the multicast address in the Destination field instead of Target field, so we were passing the test. This won't be the case anymore. The patch below fixes this problem, and also fixes ndisc_send_redirect() to not send an invalid redirect with a multicast address in the Target field. I re-ran the TAHI Neighbor Discovery section to make sure Linux passes all 245 tests now. Signed-off-by: Brian Haley <brian.haley@hp.com> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-08 00:12:05 -07:00
David S. Miller	f8ab18d2d9	[TCP]: Fix MD5 signature handling on big-endian. Based upon a report and initial patch by Peter Lieven. tcp4_md5sig_key and tcp6_md5sig_key need to start with the exact same members as tcp_md5sig_key. Because they are both cast to that type by tcp_v{4,6}_md5_do_lookup(). Unfortunately tcp{4,6}_md5sig_key use a u16 for the key length instead of a u8, which is what tcp_md5sig_key uses. This just so happens to work by accident on little-endian, but on big-endian it doesn't. Instead of casting, just place tcp_md5sig_key as the first member of the address-family specific structures, adjust the access sites, and kill off the ugly casts. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-09-28 15:18:35 -07:00
Jiri Kosina	6ae5f983cf	[IPV6]: Fix source address selection. The commit 95c385 broke proper source address selection for cases in which there is a address which is makred 'deprecated'. The commit mistakenly changed ifa->flags to ifa_result->flags (probably copy/paste error from a few lines above) in the 'Rule 3' address selection code. The patch restores the previous RFC-compliant behavior. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-09-16 14:48:21 -07:00
YOSHIFUJI Hideaki	cd562c9859	[IPV6]: Just increment OutDatagrams once per a datagram. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-09-14 17:15:01 -07:00
YOSHIFUJI Hideaki	3ef9d943d2	[IPV6]: Fix unbalanced socket reference with MSG_CONFIRM. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-09-14 16:45:40 -07:00
YOSHIFUJI Hideaki	e1f52208bb	[IPv6]: Fix NULL pointer dereference in ip6_flush_pending_frames Some of skbs in sk->write_queue do not have skb->dst because we do not fill skb->dst when we allocate new skb in append_data(). BTW, I think we may not need to (or we should not) increment some stats when using corking; if 100 sendmsg() (with MSG_MORE) result in 2 packets, how many should we increment? If 100, we should set skb->dst for every queued skbs. If 1 (or 2 ()), we increment the stats for the first queued skb and we should just skip incrementing OutDiscards for the rest of queued skbs, adn we should also impelement this semantics in other places; e.g., we should increment other stats just once, not 100 times. : depends on the place we are discarding the datagram. I guess should just increment by 1 (or 2). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-09-11 11:31:43 +02:00
Neil Horman	16fcec35e7	[NETFILTER]: Fix/improve deadlock condition on module removal netfilter So I've had a deadlock reported to me. I've found that the sequence of events goes like this: 1) process A (modprobe) runs to remove ip_tables.ko 2) process B (iptables-restore) runs and calls setsockopt on a netfilter socket, increasing the ip_tables socket_ops use count 3) process A acquires a file lock on the file ip_tables.ko, calls remove_module in the kernel, which in turn executes the ip_tables module cleanup routine, which calls nf_unregister_sockopt 4) nf_unregister_sockopt, seeing that the use count is non-zero, puts the calling process into uninterruptible sleep, expecting the process using the socket option code to wake it up when it exits the kernel 4) the user of the socket option code (process B) in do_ipt_get_ctl, calls ipt_find_table_lock, which in this case calls request_module to load ip_tables_nat.ko 5) request_module forks a copy of modprobe (process C) to load the module and blocks until modprobe exits. 6) Process C. forked by request_module process the dependencies of ip_tables_nat.ko, of which ip_tables.ko is one. 7) Process C attempts to lock the request module and all its dependencies, it blocks when it attempts to lock ip_tables.ko (which was previously locked in step 3) Theres not really any great permanent solution to this that I can see, but I've developed a two part solution that corrects the problem Part 1) Modifies the nf_sockopt registration code so that, instead of using a use counter internal to the nf_sockopt_ops structure, we instead use a pointer to the registering modules owner to do module reference counting when nf_sockopt calls a modules set/get routine. This prevents the deadlock by preventing set 4 from happening. Part 2) Enhances the modprobe utilty so that by default it preforms non-blocking remove operations (the same way rmmod does), and add an option to explicity request blocking operation. So if you select blocking operation in modprobe you can still cause the above deadlock, but only if you explicity try (and since root can do any old stupid thing it would like.... :) ). Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-09-11 11:28:26 +02:00
Denis V. Lunev	9e3be4b343	[IPV6]: Freeing alive inet6 address From: Denis V. Lunev <den@openvz.org> addrconf_dad_failure calls addrconf_dad_stop which takes referenced address and drops the count. So, in6_ifa_put perrformed at out: is extra. This results in message: "Freeing alive inet6 address" and not released dst entries. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-09-11 11:04:49 +02:00
Flavio Leitner	a96fb49be3	[NET]: Fix IP_ADD/DROP_MEMBERSHIP to handle only connectionless Fix IP[V6]_ADD_MEMBERSHIP and IP[V6]_DROP_MEMBERSHIP to return -EPROTO for connection oriented sockets. Signed-off-by: Flavio Leitner <fleitner@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-08-26 18:35:35 -07:00
Wei Yongjun	8984e41d18	[IPV6]: Fix kernel panic while send SCTP data with IP fragments If ICMP6 message with "Packet Too Big" is received after send SCTP DATA, kernel panic will occur when SCTP DATA is send again. This is because of a bad dest address when call to skb_copy_bits(). The messages sequence is like this: Endpoint A Endpoint B <------- SCTP DATA (size=1432) ICMP6 message -------> (Packet Too Big pmtu=1280) <------- Resend SCTP DATA (size=1432) ------------kernel panic--------------- printing eip: c05be62a pde = 00000000 Oops: 0002 [#1] SMP Modules linked in: scomm l2cap bluetooth ipv6 dm_mirror dm_mod video output sbs battery lp floppy sg i2c_piix4 i2c_core pcnet32 mii button ac parport_pc parport ide_cd cdrom serio_raw mptspi mptscsih mptbase scsi_transport_spi sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd CPU: 0 EIP: 0060:[<c05be62a>] Not tainted VLI EFLAGS: 00010282 (2.6.23-rc2 #1) EIP is at skb_copy_bits+0x4f/0x1ef eax: 000004d0 ebx: ce12a980 ecx: 00000134 edx: cfd5a880 esi: c8246858 edi: 00000000 ebp: c0759b14 esp: c0759adc ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068 Process swapper (pid: 0, ti=c0759000 task=c06d0340 task.ti=c0713000) Stack: c0759b88 c0405867 ce12a980 c8bff838 c789c084 00000000 00000028 cfd5a880 d09f1890 000005dc 0000007b ce12a980 cfd5a880 c8bff838 c0759b88 d09bc521 000004d0 fffff96c 00000200 00000100 c0759b50 cfd5a880 00000246 c0759bd4 Call Trace: [<c0405e1d>] show_trace_log_lvl+0x1a/0x2f [<c0405ecd>] show_stack_log_lvl+0x9b/0xa3 [<c040608d>] show_registers+0x1b8/0x289 [<c0406271>] die+0x113/0x246 [<c0625dbc>] do_page_fault+0x4ad/0x57e [<c0624642>] error_code+0x72/0x78 [<d09bc521>] ip6_output+0x8e5/0xab2 [ipv6] [<d09bcec1>] ip6_xmit+0x2ea/0x3a3 [ipv6] [<d0a3f2ca>] sctp_v6_xmit+0x248/0x253 [sctp] [<d0a3c934>] sctp_packet_transmit+0x53f/0x5ae [sctp] [<d0a34bf8>] sctp_outq_flush+0x555/0x587 [sctp] [<d0a34d3c>] sctp_retransmit+0xf8/0x10f [sctp] [<d0a3d183>] sctp_icmp_frag_needed+0x57/0x5b [sctp] [<d0a3ece2>] sctp_v6_err+0xcd/0x148 [sctp] [<d09cf1ce>] icmpv6_notify+0xe6/0x167 [ipv6] [<d09d009a>] icmpv6_rcv+0x7d7/0x849 [ipv6] [<d09be240>] ip6_input+0x1dc/0x310 [ipv6] [<d09be965>] ipv6_rcv+0x294/0x2df [ipv6] [<c05c3789>] netif_receive_skb+0x2d2/0x335 [<c05c5733>] process_backlog+0x7f/0xd0 [<c05c58f6>] net_rx_action+0x96/0x17e [<c042e722>] __do_softirq+0x64/0xcd [<c0406f37>] do_softirq+0x5c/0xac ======================= Code: 00 00 29 ca 89 d0 2b 45 e0 89 55 ec 85 c0 7e 35 39 45 08 8b 55 e4 0f 4e 45 08 8b 75 e0 8b 7d dc 89 c1 c1 e9 02 03 b2 a0 00 00 00 <f3> a5 89 c1 83 e1 03 74 02 f3 a4 29 45 08 0f 84 7b 01 00 00 01 EIP: [<c05be62a>] skb_copy_bits+0x4f/0x1ef SS:ESP 0068:c0759adc Kernel panic - not syncing: Fatal exception in interrupt Arnaldo says: ==================== Thanks! I'm to blame for this one, problem was introduced in: `b0e380b1d8` @@ -761,7 +762,7 @@ slow_path: / * Copy a block of the IP datagram. */ - if (skb_copy_bits(skb, ptr, frag->h.raw, len)) + if (skb_copy_bits(skb, ptr, skb_transport_header(skb), len)) BUG(); left -= len; ==================== Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-08-21 20:59:08 -07:00
Ilpo Järvinen	660adc6e60	[IPv6]: Invalid semicolon after if statement A similar fix to netfilter from Eric Dumazet inspired me to look around a bit by using some grep/sed stuff as looking for this kind of bugs seemed easy to automate. This is one of them I found where it looks like this semicolon is not valid. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-08-15 15:07:30 -07:00
Jesper Juhl	703310e645	[IPV6]: Clean up duplicate includes in net/ipv6/ This patch cleans up duplicate includes in net/ipv6/ Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-08-13 22:52:03 -07:00
David S. Miller	3516ffb0fe	[TCP]: Invoke tcp_sendmsg() directly, do not use inet_sendmsg(). As discovered by Evegniy Polyakov, if we try to sendmsg after a connection reset, we can do incredibly stupid things. The core issue is that inet_sendmsg() tries to autobind the socket, but we should never do that for TCP. Instead we should just go straight into TCP's sendmsg() code which will do all of the necessary state and pending socket error checks. TCP's sendpage already directly vectors to tcp_sendpage(), so this merely brings sendmsg() in line with that. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-08-02 19:42:28 -07:00
Adrian Bunk	1a3a206f7f	[NETFILTER]: Make nf_ct_ipv6_skip_exthdr() static. nf_ct_ipv6_skip_exthdr() can now become static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-31 02:28:26 -07:00
Dave Johnson	c61a7d10ef	[IPV6]: ipv6_addr_type() doesn't know about RFC4193 addresses. ipv6_addr_type() doesn't check for 'Unique Local IPv6 Unicast Addresses' (RFC4193) and returns IPV6_ADDR_RESERVED for that range. SCTP uses this function and will fail bind() and connect() calls that use RFC4193 addresses, SCTP will also ignore inbound connections from RFC4193 addresses if listening on IPV6_ADDR_ANY. There may be other users of ipv6_addr_type() that could also have problems. Signed-off-by: Dave Johnson <djohnson@sw.starentnetworks.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-31 02:28:21 -07:00
Herbert Xu	b217d616a1	[IPV4/IPV6]: Fail registration if inet device construction fails Now that netdev notifications can fail, we can use this to signal errors during registration for IPv4/IPv6. In particular, if we fail to allocate memory for the inet device, we can fail the netdev registration. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-31 02:28:16 -07:00
Simon Arlott	566cfd8f0e	[IPV6]: Don't update ADVMSS on routes where the MTU is not also updated The ADVMSS value was incorrectly updated for ALL routes when the MTU is updated because it's outside the effect of the if statement's condition. Signed-off-by: Simon Arlott <simon@fire.lp0.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-31 02:28:04 -07:00
Al Viro	704eae1f32	ip6_tunnel - endianness annotations Convert rel_info to host-endian before calling ip6_tnl_err(). The things become much more straightforward that way. The key observation (and the reason why that code actually worked) is that after ip6_tnl_err() we either immediately bailed out or had rel_info set to 0 or had it set to host-endian and guaranteed to hit (rel_type == ICMP_DEST_UNREACH && rel_code == ICMP_FRAG_NEEDED) case. So inconsistent endianness didn't really lead to bugs, but it had been subtle and prone to breakage. New variant is saner and obviously safe. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-26 11:11:56 -07:00
Patrick McHardy	7e2acc7e27	[NETFILTER]: Fix logging regression Loading one of the LOG target fails if a different target has already registered itself as backend for the same family. This can affect the ipt_LOG and ipt_ULOG modules when both are loaded. Reported and tested by: <t.artem@mailcity.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-24 15:29:55 -07:00
YOSHIFUJI Hideaki	ca983cefd9	[TCPv6] MD5SIG: Ensure to reset allocation count to avoid panic. After clearing all passwords for IPv6 peers, we need to set allocation count to zero as well as we free the storage. Otherwise, we panic when a user trys to (re)add a password. Discovered and fixed by MIYAJIMA Mitsuharu <miyajima.mitsuharu@anchor.jp>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-24 15:27:30 -07:00
Al Viro	b77f2fa629	[IPV6]: endianness bug in ip6_tunnel Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-21 19:09:41 -07:00
Paul Mundt	20c2df83d2	mm: Remove slab destructors from kmem_cache_create(). Slab destructors were no longer supported after Christoph's `c59def9f22` change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2007-07-20 10:11:58 +09:00
Vlad Yasevich	063ed369c9	[IPV6]: Call inet6addr_chain notifiers on link down Currently if the link is brought down via ip link or ifconfig down, the inet6addr_chain notifiers are not called even though all the addresses are removed from the interface. This caused SCTP to add duplicate addresses to it's list. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-15 00:16:35 -07:00
Dmitry Butskoy	f13ec93fba	[IPV6]: MSG_ERRQUEUE messages do not pass to connected raw sockets From: Dmitry Butskoy <dmitry@butskoy.name> Taken from http://bugzilla.kernel.org/show_bug.cgi?id=8747 Problem Description: It is related to the possibility to obtain MSG_ERRQUEUE messages from the udp and raw sockets, both connected and unconnected. There is a little typo in net/ipv6/icmp.c code, which prevents such messages to be delivered to the errqueue of the correspond raw socket, when the socket is CONNECTED. The typo is due to swap of local/remote addresses. Consider __raw_v6_lookup() function from net/ipv6/raw.c. When a raw socket is looked up usual way, it is something like: sk = __raw_v6_lookup(sk, nexthdr, daddr, saddr, IP6CB(skb)->iif); where "daddr" is a destination address of the incoming packet (IOW our local address), "saddr" is a source address of the incoming packet (the remote end). But when the raw socket is looked up for some icmp error report, in net/ipv6/icmp.c:icmpv6_notify() , daddr/saddr are obtained from the echoed fragment of the "bad" packet, i.e. "daddr" is the original destination address of that packet, "saddr" is our local address. Hence, for icmpv6_notify() must use "saddr, daddr" in its arguments, not "daddr, saddr" ... Steps to reproduce: Create some raw socket, connect it to an address, and cause some error situation: f.e. set ttl=1 where the remote address is more than 1 hop to reach. Set IPV6_RECVERR . Then send something and wait for the error (f.e. poll() with POLLERR\|POLLIN). You should receive "time exceeded" icmp message (because of "ttl=1"), but the socket do not receive it. If you do not connect your raw socket, you will receive MSG_ERRQUEUE successfully. (The reason is that for unconnected socket there are no actual checks for local/remote addresses). Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-14 23:53:08 -07:00
Patrick McHardy	61075af51f	[NETFILTER]: nf_conntrack: mark protocols __read_mostly Also remove two unnecessary EXPORT_SYMBOLs and move the nf_conntrack_l3proto_ipv4 declaration to the correct file. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-14 20:48:19 -07:00
Patrick McHardy	a887c1c148	[NETFILTER]: Lower *tables printk severity Lower ip6tables, arptables and ebtables printk severity similar to Dan Aloni's patch for iptables. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-14 20:46:15 -07:00
Yasuyuki Kozakai	e2a3123fbe	[NETFILTER]: nf_conntrack: Introduces nf_ct_get_tuplepr and uses it nf_ct_get_tuple() requires the offset to transport header and that bothers callers such as icmp[v6] l4proto modules. This introduces new function to simplify them. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-14 20:45:14 -07:00
Yasuyuki Kozakai	ffc3069048	[NETFILTER]: nf_conntrack: make l3proto->prepare() generic and renames it The icmp[v6] l4proto modules parse headers in ICMP[v6] error to get tuple. But they have to find the offset to transport protocol header before that. Their processings are almost same as prepare() of l3proto modules. This makes prepare() more generic to simplify icmp[v6] l4proto module later. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-14 20:44:50 -07:00
Yasuyuki Kozakai	d87d8469e2	[NETFILTER]: nf_conntrack: Increment error count on parsing IPv4 header Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-14 20:44:23 -07:00
Philippe De Muyter	56b3d975bb	[NET]: Make all initialized struct seq_operations const. Make all initialized struct seq_operations in net/ const Signed-off-by: Philippe De Muyter <phdm@macqel.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 23:07:31 -07:00
Micah Gruber	dffe4f048b	[IPV6]: Remove unneeded pointer idev from addrconf_cleanup(). This trivial patch removes the unneeded pointer idev returned from __in6_dev_get(), which is never used. The check for NULL can be simply done by if (__in6_dev_get(dev) == NULL). Signed-off-by: Micah Gruber <micah.gruber@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 23:04:19 -07:00
YOSHIFUJI Hideaki	4c752098f5	[IPV6]: Make IPV6_{RECV,2292}RTHDR boolean options. Because reversing RH0 is no longer supported by deprecation of RH0, let's make IPV6_{RECV,2292}RTHDR boolean options. Boolean are more appropriate from standard POV. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:56:31 -07:00
YOSHIFUJI Hideaki	bb4dbf9e61	[IPV6]: Do not send RH0 anymore. Based on <draft-ietf-ipv6-deprecate-rh0-00.txt>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:55:49 -07:00
YOSHIFUJI Hideaki	c382bb9d32	[IPV6]: Restore semantics of Routing Header processing. The "fix" for emerging security threat was overkill and it broke basic semantic of IPv6 routing header processing. We should assume RT0 (or even RT2, depends on configuration) as "unknown" RH type so that we - silently ignore the routing header if segleft == 0 - send ICMPv6 Parameter Problem message back to the sender, otherwise. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:47:58 -07:00
Patrick McHardy	cfbba49d80	[NET]: Avoid copying writable clones in tunnel drivers Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:19:05 -07:00
Patrick McHardy	0d53778e81	[NETFILTER]: Convert DEBUGP to pr_debug Convert DEBUGP to pr_debug and fix lots of non-compiling debug statements. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:18:20 -07:00
Patrick McHardy	330f7db5e5	[NETFILTER]: nf_conntrack: remove 'ignore_conntrack' argument from nf_conntrack_find_get All callers pass NULL, this also doesn't seem very useful for modules. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:17:41 -07:00
Yasuyuki Kozakai	dacd2a1a5c	[NETFILTER]: nf_conntrack: remove old memory allocator of conntrack Now memory space for help and NAT are allocated by extension infrastructure. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:17:35 -07:00
Patrick McHardy	9f15c5302d	[NETFILTER]: x_tables: mark matches and targets __read_mostly Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:17:15 -07:00
Jozsef Kadlecsik	ba9dda3ab5	[NETFILTER]: x_tables: add TRACE target The TRACE target can be used to follow IP and IPv6 packets through the ruleset. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick NcHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:17:14 -07:00
Jan Engelhardt	7c4e36bc17	[NETFILTER]: Remove redundant parentheses/braces Removes redundant parentheses and braces (And add one pair in a xt_tcpudp.c macro). Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:17:11 -07:00
Jan Engelhardt	a47362a226	[NETFILTER]: add some consts, remove some casts Make a number of variables const and/or remove unneeded casts. Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:17:01 -07:00
Jan Engelhardt	e1931b784a	[NETFILTER]: x_tables: switch xt_target->checkentry to bool Switch the return type of target checkentry functions to boolean. Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:16:59 -07:00
Jan Engelhardt	ccb79bdce7	[NETFILTER]: x_tables: switch xt_match->checkentry to bool Switch the return type of match functions to boolean Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:16:58 -07:00
Jan Engelhardt	1d93a9cbad	[NETFILTER]: x_tables: switch xt_match->match to bool Switch the return type of match functions to boolean Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:16:57 -07:00
Jan Engelhardt	cff533ac12	[NETFILTER]: x_tables: switch hotdrop to bool Switch the "hotdrop" variables to boolean Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:16:56 -07:00
Stephen Hemminger	d212f87b06	[NET]: IPV6 checksum offloading in network devices The existing model for checksum offload does not correctly handle devices that can offload IPV4 and IPV6 only. The NETIF_F_HW_CSUM flag implies device can do any arbitrary protocol. This patch: * adds NETIF_F_IPV6_CSUM for those devices * fixes bnx2 and tg3 devices that need it * add NETIF_F_IPV6_CSUM to ipv6 output (incl GSO) * fixes assumptions about NETIF_F_ALL_CSUM in nat * adjusts bridge union of checksumming computation Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:52 -07:00
Masahide NAKAMURA	d3d6dd3ada	[XFRM]: Add module alias for transformation type. It is clean-up for XFRM type modules and adds aliases with its protocol: ESP, AH, IPCOMP, IPIP and IPv6 for IPsec ROUTING and DSTOPTS for MIPv6 It is almost the same thing as XFRM mode alias, but it is added new defines XFRM_PROTO_XXX for preprocessing since some protocols are defined as enum. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Acked-by: Ingo Oeser <netdev@axxeo.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:43 -07:00
Masahide NAKAMURA	59fbb3a61e	[IPV6] MIP6: Loadable module support for MIPv6. This patch makes MIPv6 loadable module named "mip6". Here is a modprobe.conf(5) example to load it automatically when user application uses XFRM state for MIPv6: alias xfrm-type-10-43 mip6 alias xfrm-type-10-60 mip6 Some MIPv6 feature is not included by this modular, however, it should not be affected to other features like either IPsec or IPv6 with and without the patch. We may discuss XFRM, MH (RAW socket) and ancillary data/sockopt separately for future work. Loadable features: * MH receiving check (to send ICMP error back) * RO header parsing and building (i.e. RH2 and HAO in DSTOPTS) * XFRM policy/state database handling for RO These are NOT covered as loadable: * Home Address flags and its rule on source address selection * XFRM sub policy (depends on its own kernel option) * XFRM functions to receive RO as IPv6 extension header * MH sending/receiving through raw socket if user application opens it (since raw socket allows to do so) * RH2 sending as ancillary data * RH2 operation with setsockopt(2) Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:42 -07:00
Masahide NAKAMURA	136ebf08b4	[IPV6] MIP6: Kill unnecessary ifdefs. Kill unnecessary CONFIG_IPV6_MIP6. o It is redundant for RAW socket to keep MH out with the config then it can handle any protocol. o Clean-up at AH. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:41 -07:00
Jay Vosburgh	c2edacf80e	bonding / ipv6: no addrconf for slaves separately from master At present, when a device is enslaved to bonding, if ipv6 is active then addrconf will be initated on the slave (because it is closed then opened during the enslavement processing). This causes DAD and RS packets to be sent from the slave. These packets in turn can confuse switches that perform ipv6 snooping, causing them to incorrectly update their forwarding tables (if, e.g., the slave being added is an inactve backup that won't be used right away) and direct traffic away from the active slave to a backup slave (where the incoming packets will be dropped). This patch alters the behavior so that addrconf will only run on the master device itself. I believe this is logically correct, as it prevents slaves from having an IPv6 identity independent from the master. This is consistent with the IPv4 behavior for bonding. This is accomplished by (a) having bonding set IFF_SLAVE sooner in the enslavement processing than currently occurs (before open, not after), and (b) having ipv6 addrconf ignore UP and CHANGE events on slave devices. The eql driver also uses the IFF_SLAVE flag. I inspected eql, and I believe this change is reasonable for its usage of IFF_SLAVE, but I did not test it. Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-07-10 12:41:19 -04:00
YOSHIFUJI Hideaki	6d5b78cdd5	[IPV6] NDISC: Fix thinko to control Router Preference support. Bug reported by Haruhito Watanabe <haruhito@sfc.keio.ac.jp>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-22 16:07:04 -07:00
Herbert Xu	74235a25c6	[IPV6] addrconf: Fix IPv6 on tuntap tunnels The recent patch that added ipv6_hwtype is broken on tuntap tunnels. Indeed, it's broken on any device that does not pass the ipv6_hwtype test. The reason is that the original test only applies to autoconfiguration, not IPv6 support. IPv6 support is allowed on any device. In fact, even with the ipv6_hwtype patch applied you can still add IPv6 addresses to any interface that doesn't pass thw ipv6_hwtype test provided that they have a sufficiently large MTU. This is a serious problem because come deregistration time these devices won't be cleaned up properly. I've gone back and looked at the rationale for the patch. It appears that the real problem is that we were creating IPv6 devices even if the MTU was too small. So here's a patch which fixes that and reverts the ipv6_hwtype stuff. Thanks to Kanru Chen for reporting this issue. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-14 13:02:55 -07:00
David S. Miller	3d7dbeac58	[TCP]: Disable TSO if MD5SIG is enabled. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-12 14:36:42 -07:00
David S. Miller	df2bc459a3	[UDP]: Revert 2-pass hashing changes. This reverts changesets: `6aaf47fa48` `b7b5f487ab` `de34ed91c4` `fc038410b4` There are still some correctness issues recently discovered which do not have a known fix that doesn't involve doing a full hash table scan on port bind. So revert for now. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:40:50 -07:00
Patrick McHarrdy	3c158f7f57	[NETFILTER]: nf_conntrack: fix helper module unload races When a helper module is unloaded all conntracks refering to it have their helper pointer NULLed out, leading to lots of races. In most places this can be fixed by proper use of RCU (they do already check for != NULL, but in a racy way), additionally nf_conntrack_expect_related needs to bail out when no helper is present. Also remove two paranoid BUG_ONs in nf_conntrack_proto_gre that are racy and not worth fixing. Signed-off-by: Patrick McHarrdy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:40:26 -07:00
Patrick McHardy	ef7c79ed64	[NETLINK]: Mark netlink policies const Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:40:10 -07:00
Bill Nottingham	75202e7689	[NET]: Fix comparisons of unsigned < 0. Recent gcc versions emit warnings when unsigned variables are compared < 0 or >= 0. Signed-off-by: Bill Nottingham <notting@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-03 18:08:47 -07:00
David S. Miller	8c7fc03e27	[IPV6]: Fix build warning. net/ipv6/ip6_fib.c: In function ‘fib6_add_rt2node’: net/ipv6/ip6_fib.c:661: warning: label ‘out’ defined but not used Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-31 01:23:31 -07:00
Kazunori MIYAZAWA	f282d45cb4	[IPSEC]: Fix panic when using inter address familiy IPsec on loopback. Signed-off-by: Kazunori MIYAZAWA <kazunori@miyazawa.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-31 01:23:28 -07:00
YOSHIFUJI Hideaki	7ebba6d14f	[IPV6] ROUTE: No longer handle ::/0 specially. We do not need to handle ::/0 routes specially any longer. This should fix BUG #8349. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Yuji Sekiya <sekiya@wide.ad.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-31 01:23:26 -07:00
Kazunori MIYAZAWA	144466bdf8	[IPSEC]: Fix IPv6 AH calculation in outbound Signed-off-by: Kazunori MIYAZAWA <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-31 01:23:25 -07:00
David S. Miller	14e50e57ae	[XFRM]: Allow packet drops during larval state resolution. The current IPSEC rule resolution behavior we have does not work for a lot of people, even though technically it's an improvement from the -EAGAIN buisness we had before. Right now we'll block until the key manager resolves the route. That works for simple cases, but many folks would rather packets get silently dropped until the key manager resolves the IPSEC rules. We can't tell these folks to "set the socket non-blocking" because they don't have control over the non-block setting of things like the sockets used to resolve DNS deep inside of the resolver libraries in libc. With that in mind I coded up the patch below with some help from Herbert Xu which provides packet-drop behavior during larval state resolution, controllable via sysctl and off by default. This lays the framework to either: 1) Make this default at some point or... 2) Move this logic into xfrm{4,6}_policy.c and implement the ARP-like resolution queue we've all been dreaming of. The idea would be to queue packets to the policy, then once the larval state is resolved by the key manager we re-resolve the route and push the packets out. The packets would timeout if the rule didn't get resolved in a certain amount of time. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-24 18:17:54 -07:00
Oliver Hartkopp	bbb711e633	[IPV6]: Ignore ipv6 events on non-IPV6 capable devices. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Urs Thuermann <urs@isnogud.escape.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-24 16:36:44 -07:00
Corey Mutter	ae7bf20a63	[IPV6]: Reverse sense of promisc tests in ip6_mc_input Reverse the sense of the promiscuous-mode tests in ip6_mc_input(). Signed-off-by: Corey Mutter <crm-netdev@mutternet.com> Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-14 03:00:27 -07:00
Patrick McHardy	3c2ad469c3	[NETFILTER]: Clean up table initialization - move arp_tables initial table structure definitions to arp_tables.h similar to ip_tables and ip6_tables - use C99 initializers - use initializer macros where possible Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-10 23:47:43 -07:00
David S. Miller	fc038410b4	[UDP]: Fix AF-specific references in AF-agnostic code. __udp_lib_port_inuse() cannot make direct references to inet_sk(sk)->rcv_saddr as that is ipv4 specific state and this code is used by ipv6 too. Use an operations vector to solve this, and this also paves the way for ipv6 support for non-wild saddr hashing in UDP. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-10 23:47:22 -07:00
YOSHIFUJI Hideaki	9a6bf6fe71	[IPV6] ROUTE: Assign rt6i_idev for ip6_{prohibit,blk_hole}_entry. I think this is less critical, but is also suitable for -stable release. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-10 23:46:12 -07:00
YOSHIFUJI Hideaki	e76b2b2567	[IPV6]: Do no rely on skb->dst before it is assigned. Because skb->dst is assigned in ip6_route_input(), it is really bad to use it in hop-by-hop option handler(s). Closes: Bug #8450 (Eric Sesterhenn <snakebyte@gmx.de>) Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-10 23:45:58 -07:00
David L Stevens	5bb1ab09e4	[IPV6]: Send ICMPv6 error on scope violations. When an IPv6 router is forwarding a packet with a link-local scope source address off-link, RFC 4007 requires it to send an ICMPv6 destination unreachable with code 2 ("not neighbor"), but Linux doesn't. Fix below. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-10 23:45:32 -07:00
David Sterba	3dde6ad8fc	Fix trivial typos in Kconfig* files Fix several typos in help text in Kconfig* files. Signed-off-by: David Sterba <dave@jikos.cz> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2007-05-09 07:12:20 +02:00
Randy Dunlap	e63340ae6b	header cleaning: don't include smp_lock.h when not used Remove includes of <linux/smp_lock.h> where it is not used/needed. Suggested by Al Viro. Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc, sparc64, and arm (all 59 defconfigs). Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:07 -07:00
Linus Torvalds	15700770ef	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild * git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild: (38 commits) kconfig: fix mconf segmentation fault kbuild: enable use of code from a different dir kconfig: error out if recursive dependencies are found kbuild: scripts/basic/fixdep segfault on pathological string-o-death kconfig: correct minor typo in Kconfig warning message. kconfig: fix path to modules.txt in Kconfig help usr/Kconfig: fix typo kernel-doc: alphabetically-sorted entries in index.html of 'htmldocs' kbuild: be more explicit on missing .config file kbuild: clarify the creation of the LOCALVERSION_AUTO string. kbuild: propagate errors from find in scripts/gen_initramfs_list.sh kconfig: refer to qt3 if we cannot find qt libraries kbuild: handle compressed cpio initramfs-es kbuild: ignore section mismatch warning for references from .paravirtprobe to .init.text kbuild: remove stale comment in modpost.c kbuild/mkuboot.sh: allow spaces in CROSS_COMPILE kbuild: fix make mrproper for Documentation/DocBook/man kbuild: remove kconfig binaries during make mrproper kconfig/menuconfig: do not hardcode '.config' kbuild: override build timestamp & version ...	2007-05-06 13:21:57 -07:00
Pavel Emelianov	7562f876cd	[NET]: Rework dev_base via list_head (v3) Cleanup of dev_base list use, with the aim to simplify making device list per-namespace. In almost every occasion, use of dev_base variable and dev->next pointer could be easily replaced by for_each_netdev loop. A few most complicated places were converted to using first_netdev()/next_netdev(). Signed-off-by: Pavel Emelianov <xemul@openvz.org> Acked-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-03 15:13:45 -07:00
Alexander E. Patrakov	39f5fb3035	kconfig: fix path to modules.txt in Kconfig help Documentation/modules.txt doesn't exist, but Documentation/kbuild/modules.txt does. Signed-off-by: Alexander E. Patrakov Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2007-05-02 20:58:11 +02:00
Eric Sesterhenn	d0772b70fa	[IPV6]: Fix slab corruption running ip6sic From: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-28 21:26:23 -07:00
Stephen Hemminger	5632c5152a	[IPV6]: Track device renames in snmp6. When network device's are renamed, the IPV6 snmp6 code gets confused. It doesn't track name changes so it will OOPS when network device's are removed. The fix is trivial, just unregister/re-register in notify handler. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-28 21:16:39 -07:00
YOSHIFUJI Hideaki	ebbd90a730	[IPV6]: Fix thinko in ipv6_rthdr_rcv() changes. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-27 02:13:39 -07:00
Milind Arun Choudhary	4ef8d0aeaf	[NET]: SPIN_LOCK_UNLOCKED cleanup in drivers/atm, net SPIN_LOCK_UNLOCKED cleanup,use __SPIN_LOCK_UNLOCKED instead Signed-off-by: Milind Arun Choudhary <milindchoudhary@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-26 01:37:44 -07:00
YOSHIFUJI Hideaki	e1ec7842df	[IPV6] NDISC: Unify main process of sending ND messages. Because ndisc_send_na(), ndisc_send_ns() and ndisc_send_rs() are almost identical, so let's unify their common part. With gcc (GCC) 3.3.5 (Debian 1:3.3.5-13) on i386, Before: text data bss dec hex filename 14689 364 24 15077 3ae5 net/ipv6/ndisc.o After: text data bss dec hex filename 12317 364 24 12705 31a1 net/ipv6/ndisc.o Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2007-04-25 22:29:59 -07:00
YOSHIFUJI Hideaki	c53b3590bb	[IPV6] XFRM: Use ip6addr_any where applicable. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2007-04-25 22:29:58 -07:00
YOSHIFUJI Hideaki	df8981dc19	[IPV6]: Export in6addr_any for future use. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2007-04-25 22:29:57 -07:00
YOSHIFUJI Hideaki	420fe234ad	[IPV6] SIT: Unify code path to get hash array index. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2007-04-25 22:29:54 -07:00
David S. Miller	30041e4af4	[IPV6]: Fix Makefile thinko. obj-$(CONFIG_PROC_FS) --> ipv6-$(CONFIG_PROC_FS) Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:29:53 -07:00
Herbert Xu	7f7d9a6b96	[IPV6]: Consolidate common SNMP code This patch moves the non-proc SNMP code into addrconf.c and reuses IPv4 SNMP code where applicable. As a result we can skip proc.o if /proc is disabled. Note that I've made a number of functions static since they're only used by addrconf.c for now. If they ever get used elsewhere we can always remove the static. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:29:52 -07:00
YOSHIFUJI Hideaki	97fc8d0bc5	[IPV6] SNMP: Use put_unaligned() instead of memcpy(). Hint from David Miller <davem@davemloft.net>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:29:37 -07:00
YOSHIFUJI Hideaki	952a10be32	[IPV6] SNMP: Fix several warnings without procfs. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2007-04-25 22:29:36 -07:00
YOSHIFUJI Hideaki	2334e97355	[IPV6] SNMP: Avoid unaligned accesses. Because stats pointer may not be aligned for u64, use memcpy to fill u64 values. Issue reported by David Miller <davem@davemloft.net>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2007-04-25 22:29:35 -07:00
Stephen Hemminger	3ff50b7997	[NET]: cleanup extra semicolons Spring cleaning time... There seems to be a lot of places in the network code that have extra bogus semicolons after conditionals. Most commonly is a bogus semicolon after: switch() { } Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:29:24 -07:00
YOSHIFUJI Hideaki	1370b5a59b	[IPV6] SNMP: Export statistics via netlink without CONFIG_PROC_FS. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:29:13 -07:00
YOSHIFUJI Hideaki	49ed67a9ee	[IPV6] SNMP: Move some statistic bits to net/ipv6/proc.c. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:29:11 -07:00
YOSHIFUJI Hideaki	bf99f1bde3	[IPV6] SNMP: Netlink interface. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:29:10 -07:00
John Heffner	628a5c5618	[INET]: Add IP(V6)_PMTUDISC_RPOBE Add IP(V6)_PMTUDISC_PROBE value for IP(V6)_MTU_DISCOVER. This option forces us not to fragment, but does not make use of the kernel path MTU discovery. That is, it allows for user-mode MTU probing (or, packetization-layer path MTU discovery). This is particularly useful for diagnostic utilities, like traceroute/tracepath. Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:29:10 -07:00
John Heffner	b881ef7603	[IPV6]: MTU discovery check in ip6_fragment() Adds a check in ip6_fragment() mirroring ip_fragment() for packets that we can't fragment, and sends an ICMP Packet Too Big message in response. Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:29:09 -07:00
Patrick McHardy	6313c1e099	[RTNETLINK]: Remove unnecessary locking in dump callbacks Since we're now holding the rtnl during the entire dump operation, we can remove additional locking for rtnl protected data. This patch does that for all simple cases (dev_base_lock for dev_base walking, RCU protection for FIB rule dumping). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:29:05 -07:00
Patrick McHardy	af65bdfce9	[NETLINK]: Switch cb_lock spinlock to mutex and allow to override it Switch cb_lock to mutex and allow netlink kernel users to override it with a subsystem specific mutex for consistent locking in dump callbacks. All netlink_dump_start users have been audited not to rely on any side-effects of the previously used spinlock. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:29:03 -07:00
Patrick McHardy	3b5018d676	[NETFILTER]: {eb,ip6,ip}t_LOG: remove remains of LOG target overloading All LOG targets always use their internal logging function nowadays, so remove the incorrect error message and handle real errors (!= -EEXIST) by failing to load. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:29:00 -07:00
Herbert Xu	604763722c	[NET]: Treat CHECKSUM_PARTIAL as CHECKSUM_UNNECESSARY When a transmitted packet is looped back directly, CHECKSUM_PARTIAL maps to the semantics of CHECKSUM_UNNECESSARY. Therefore we should treat it as such in the stack. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:28:43 -07:00
Herbert Xu	663ead3bb8	[NET]: Use csum_start offset instead of skb_transport_header The skb transport pointer is currently used to specify the start of the checksum region for transmit checksum offload. Unfortunately, the same pointer is also used during receive side processing. This creates a problem when we want to retransmit a received packet with partial checksums since the skb transport pointer would be overwritten. This patch solves this problem by creating a new 16-bit csum_start offset value to replace the skb transport header for the purpose of checksums. This offset is calculated from skb->head so that it does not have to change when skb->data changes. No extra space is required since csum_offset itself fits within a 16-bit word so we can use the other 16 bits for csum_start. For backwards compatibility, just before we push a packet with partial checksums off into the device driver, we set the skb transport header to what it would have been under the old scheme. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:28:40 -07:00
Patrick McHardy	c5c2523893	[XFRM]: Optimize MTU calculation Replace the probing based MTU estimation, which usually takes 2-3 iterations to find a fitting value and may underestimate the MTU, by an exact calculation. Also fix underestimation of the XFRM trailer_len, which causes unnecessary reallocations. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:28:38 -07:00
Patrick McHardy	557922584d	[XFRM]: esp: fix skb_tail_pointer conversion bug Fix incorrect switch of "trailer" skb by "skb" during skb_tail_pointer conversion: - (u8)(trailer->tail - 1) = top_iph->protocol; + (skb_tail_pointer(skb) - 1) = top_iph->protocol; - (u8 )(trailer->tail - 1) = skb_network_header(skb); + (skb_tail_pointer(skb) - 1) = skb_network_header(skb); Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:28:37 -07:00
YOSHIFUJI Hideaki	29f6af7712	[IPV6] FIB6RULE: Find source address during looking up route. When looking up route for destination with rules with source address restrictions, we may need to find a source address for the traffic if not given. Based on patch from Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:28:35 -07:00
Arnaldo Carvalho de Melo	27d7ff46a3	[SK_BUFF]: Introduce skb_copy_to_linear_data{_offset} To clearly state the intent of copying to linear sk_buffs, _offset being a overly long variant but interesting for the sake of saving some bytes. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-04-25 22:28:29 -07:00
Arnaldo Carvalho de Melo	d626f62b11	[SK_BUFF]: Introduce skb_copy_from_linear_data{_offset} To clearly state the intent of copying from linear sk_buffs, _offset being a overly long variant but interesting for the sake of saving some bytes. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-04-25 22:28:23 -07:00
Herbert Xu	35fc92a9de	[NET]: Allow forwarding of ip_summed except CHECKSUM_COMPLETE Right now Xen has a horrible hack that lets it forward packets with partial checksums. One of the reasons that CHECKSUM_PARTIAL and CHECKSUM_COMPLETE were added is so that we can get rid of this hack (where it creates two extra bits in the skbuff to essentially mirror ip_summed without being destroyed by the forwarding code). I had forgotten that I've already gone through all the deivce drivers last time around to make sure that they're looking at ip_summed == CHECKSUM_PARTIAL rather than ip_summed != 0 on transmit. In any case, I've now done that again so it should definitely be safe. Unfortunately nobody has yet added any code to update CHECKSUM_COMPLETE values on forward so we I'm setting that to CHECKSUM_NONE. This should be safe to remove for bridging but I'd like to check that code path first. So here is the patch that lets us get rid of the hack by preserving ip_summed (mostly) on forwarded packets. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:28:16 -07:00
David S. Miller	b3da2cf37c	[INET]: Use jhash + random secret for ehash. The days are gone when this was not an issue, there are folks out there with huge bot networks that can be used to attack the established hash tables on remote systems. So just like the routing cache and connection tracking hash, use Jenkins hash with random secret input. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:28:06 -07:00
Patrick McHardy	e6f689db51	[NETFILTER]: Use setup_timer Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:43 -07:00
Patrick McHardy	1b53d9042c	[NETFILTER]: Remove changelogs and CVS IDs Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:35 -07:00
Thomas Graf	c454673da7	[NET] rules: Unified rules dumping Implements a unified, protocol independant rules dumping function which is capable of both, dumping a specific protocol family or all of them. This speeds up dumping as less lookups are required. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:17 -07:00
Thomas Graf	c127ea2c45	[IPv6]: Use rtnl registration interface Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:27:13 -07:00
Arnaldo Carvalho de Melo	6b88dd966b	[SK_BUFF] ipv6: Use skb_network_offset in some more places So that we reduce the number of direct accesses to skb->data. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-04-25 22:26:38 -07:00
Arnaldo Carvalho de Melo	b529ccf279	[NETLINK]: Introduce nlmsg_hdr() helper For the common "(struct nlmsghdr *)skb->data" sequence, so that we reduce the number of direct accesses to skb->data and for consistency with all the other cast skb member helpers. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:26:34 -07:00
Arnaldo Carvalho de Melo	27a884dc3c	[SK_BUFF]: Convert skb->tail to sk_buff_data_t So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes on 64bit architectures, allowing us to combine the 4 bytes hole left by the layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4 64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN... :-) Many calculations that previously required that skb->{transport,network, mac}_header be first converted to a pointer now can be done directly, being meaningful as offsets or pointers. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:26:28 -07:00
Arnaldo Carvalho de Melo	b0e380b1d8	[SK_BUFF]: unions of just one member don't get anything done, kill them Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and skb->mac to skb->mac_header, to match the names of the associated helpers (skb[_[re]set]_{transport,network,mac}_header). Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:26:20 -07:00
Arnaldo Carvalho de Melo	cfe1fc7759	[SK_BUFF]: Introduce skb_network_header_len For the common sequence "skb->h.raw - skb->nh.raw", similar to skb->mac_len, that is precalculated tho, don't think we need to bloat skb with one more member, so just use this new helper, reducing the number of non-skbuff.h references to the layer headers even more. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:26:19 -07:00
Arnaldo Carvalho de Melo	bff9b61ce3	[SK_BUFF]: Use the helpers to get the layer header pointer Some more cases... Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:26:18 -07:00
Arnaldo Carvalho de Melo	ddc7b8e32b	[SK_BUFF]: Some more layer header conversions Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:26:03 -07:00
Arnaldo Carvalho de Melo	d10ba34b00	[SK_BUFF]: More skb_put related skb_reset_transport_header This time we have to set it to skb->tail that is not anymore equal to skb->data, so we either add a new helper or just add the skb->tail - skb->data offset, for now do the later. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:26:01 -07:00
Arnaldo Carvalho de Melo	55f79cc0c0	[IPV6]: Reset the network header in ip6_nd_hdr ip6_nd_hdr is always called immediately after a alloc_skb + skb_reserve sequence, i.e. when skb->tail is equal to skb->data, making it correct to use skb_reset_network_header(). Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:26:00 -07:00
Yasuyuki Kozakai	e7ac05f340	[NETFILTER]: nf_conntrack: add nf_copy() to safely copy members in skb This unifies the codes to copy netfilter related datas. Before copying, nf_copy() puts original members in destination skb. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:55 -07:00
Arnaldo Carvalho de Melo	9c70220b73	[SK_BUFF]: Introduce skb_transport_header(skb) For the places where we need a pointer to the transport header, it is still legal to touch skb->h.raw directly if just adding to, subtracting from or setting it to another layer header. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:31 -07:00
Arnaldo Carvalho de Melo	bd82393ca2	[SK_BUFF]: More skb_reset_transport_header conversions These are a bit more subtle, they are of this type: - skb->h.raw = payload; __skb_pull(skb, payload - skb->data); + skb_reset_transport_header(skb); __skb_pull results in: skb->data = skb->data + payload - skb->data; skb->data = payload; So after __skb_pull we have skb->data pointing to payload and we can just call skb_reset_transport_header(skb), that will do: skb->h.raw = payload; The others are similar, allowing us to get rid of some more cases where a pointer was being attributed to the layer headers. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:29 -07:00
Arnaldo Carvalho de Melo	39b89160df	[SK_BUFF]: Introduce ipipv6_hdr(), remove skb->h.ipv6h Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:28 -07:00
Arnaldo Carvalho de Melo	b0061ce49c	[SK_BUFF]: Introduce ipip_hdr(), remove skb->h.ipiph Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:27 -07:00
Arnaldo Carvalho de Melo	aa8223c7bb	[SK_BUFF]: Introduce tcp_hdr(), remove skb->h.th Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:26 -07:00
Arnaldo Carvalho de Melo	ab6a5bb6b2	[TCP]: Introduce tcp_hdrlen() and tcp_optlen() The ip_hdrlen() buddy, created to reduce the number of skb->h.th-> uses and to avoid the longer, open coded equivalent. Ditched a no-op in bnx2 in the process. I wonder if we should have a BUG_ON(skb->h.th->doff < 5) in tcp_optlen()... Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:24 -07:00
Arnaldo Carvalho de Melo	88c7664f13	[SK_BUFF]: Introduce icmp_hdr(), remove skb->h.icmph Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:23 -07:00
Arnaldo Carvalho de Melo	4bedb45203	[SK_BUFF]: Introduce udp_hdr(), remove skb->h.uh Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:22 -07:00
Arnaldo Carvalho de Melo	cc70ab261c	[ICMP6]: Introduce icmp6_hdr() For consistency with all the other skb->h.raw accessors. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:20 -07:00
Arnaldo Carvalho de Melo	967b05f64e	[SK_BUFF]: Introduce skb_set_transport_header For the cases where the transport header is being set to a offset from skb->data. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:17 -07:00
Arnaldo Carvalho de Melo	ea2ae17d64	[SK_BUFF]: Introduce skb_transport_offset() For the quite common 'skb->h.raw - skb->data' sequence. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:16 -07:00
Arnaldo Carvalho de Melo	badff6d01a	[SK_BUFF]: Introduce skb_reset_transport_header(skb) For the common, open coded 'skb->h.raw = skb->data' operation, so that we can later turn skb->h.raw into a offset, reducing the size of struct sk_buff in 64bit land while possibly keeping it as a pointer on 32bit. This one touches just the most simple cases: skb->h.raw = skb->data; skb->h.raw = {skb_push\|[__]skb_pull}() The next ones will handle the slightly more "complex" cases. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:15 -07:00
Arnaldo Carvalho de Melo	0660e03f6b	[SK_BUFF]: Introduce ipv6_hdr(), remove skb->nh.ipv6h Now the skb->nh union has just one member, .raw, i.e. it is just like the skb->mac union, strange, no? I'm just leaving it like that till the transport layer is done with, when we'll rename skb->mac.raw to skb->mac_header (or ->mac_header_offset?), ditto for ->{h,nh}. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:14 -07:00
Arnaldo Carvalho de Melo	eddc9ec53b	[SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iph Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:10 -07:00
Arnaldo Carvalho de Melo	c9bdd4b525	[IP]: Introduce ip_hdrlen() For the common sequence "skb->nh.iph->ihl * 4", removing a good number of open coded skb->nh.iph uses, now to go after the rest... Just out of curiosity, here are the idioms found to get the same result: skb->nh.iph->ihl << 2 skb->nh.iph->ihl<<2 skb->nh.iph->ihl * 4 skb->nh.iph->ihl4 (skb->nh.iph)->ihl sizeof(u32) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:07 -07:00
Arnaldo Carvalho de Melo	c14d2450cb	[SK_BUFF]: Introduce skb_set_network_header For the cases where the network header is being set to a offset from skb->data. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:25:01 -07:00
Arnaldo Carvalho de Melo	d56f90a7c9	[SK_BUFF]: Introduce skb_network_header() For the places where we need a pointer to the network header, it is still legal to touch skb->nh.raw directly if just adding to, subtracting from or setting it to another layer header. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:24:59 -07:00
Arnaldo Carvalho de Melo	bbe735e424	[SK_BUFF]: Introduce skb_network_offset() For the quite common 'skb->nh.raw - skb->data' sequence. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:24:58 -07:00
Arnaldo Carvalho de Melo	1ced98e81d	[SK_BUFF] ipv6: More skb_reset_network_header conversions related to skb_pull Now related to this form: skb->nh.ipv6h = (struct ipv6hdr )skb_put(skb, length); That, as the others, is done when skb->tail is still equal to skb->data, making the conversion to skb_reset_network_header possible. Also one more case equivalent to skb->nh.raw = skb->data, of this form: iph = (struct ipv6hdr )skb->data; <SNIP> skb->nh.ipv6h = iph; Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:24:54 -07:00
Arnaldo Carvalho de Melo	e2d1bca7e6	[SK_BUFF]: Use skb_reset_network_header in skb_push cases skb_push updates and returns skb->data, so we can just call skb_reset_network_header after the call to skb_push. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:24:47 -07:00
Arnaldo Carvalho de Melo	c1d2bbe1cd	[SK_BUFF]: Introduce skb_reset_network_header(skb) For the common, open coded 'skb->nh.raw = skb->data' operation, so that we can later turn skb->nh.raw into a offset, reducing the size of struct sk_buff in 64bit land while possibly keeping it as a pointer on 32bit. This one touches just the most simple case, next will handle the slightly more "complex" cases. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:24:46 -07:00
Arnaldo Carvalho de Melo	57effc70a5	[IPV6]: Use skb->nh.ipv6h instead of casting skb->nh.raw nh.ipv6h is there exactly for this reason! Use it while it exists ;-) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:24:45 -07:00
Arnaldo Carvalho de Melo	98e399f82a	[SK_BUFF]: Introduce skb_mac_header() For the places where we need a pointer to the mac header, it is still legal to touch skb->mac.raw directly if just adding to, subtracting from or setting it to another layer header. This one also converts some more cases to skb_reset_mac_header() that my regex missed as it had no spaces before nor after '=', ugh. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:24:41 -07:00
Arnaldo Carvalho de Melo	39f69c6f92	[SK_BUFF] xfrm: Use skb_set_mac_header in the memmove cases Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:24:37 -07:00
Arnaldo Carvalho de Melo	459a98ed88	[SK_BUFF]: Introduce skb_reset_mac_header(skb) For the common, open coded 'skb->mac.raw = skb->data' operation, so that we can later turn skb->mac.raw into a offset, reducing the size of struct sk_buff in 64bit land while possibly keeping it as a pointer on 32bit. This one touches just the most simple case, next will handle the slightly more "complex" cases. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:24:32 -07:00
YOSHIFUJI Hideaki	e5268f12f2	[IPV6]: Ensure to truncate result and return full length for sticky options. Bug noticed by Chris Wright <chrisw@sous-sol.org>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:24:17 -07:00
YOSHIFUJI Hideaki	4c6510a738	[IPV6]: Return correct result for sticky options. We returned incorrect result with IPV6_RTHDRDSTOPTS, IPV6_RTHDR and IPV6_DSTOPTS. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:24:16 -07:00
Stephen Hemminger	add459aa1a	[UDP]: ipv6 style cleanup Fix whitespace around keywords. Eliminate unnecessary ()'s on return statements. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:24:08 -07:00
Eric Dumazet	ae40eb1ef3	[NET]: Introduce SIOCGSTAMPNS ioctl to get timestamps with nanosec resolution Now network timestamps use ktime_t infrastructure, we can add a new ioctl() SIOCGSTAMPNS command to get timestamps in 'struct timespec'. User programs can thus access to nanosecond resolution. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> CC: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:24:04 -07:00
Herbert Xu	759e5d0064	[UDP]: Clean up UDP-Lite receive checksum This patch eliminates some duplicate code for the verification of receive checksums between UDP-Lite and UDP. It does this by introducing __skb_checksum_complete_head which is identical to __skb_checksum_complete_head apart from the fact that it takes a length parameter rather than computing the first skb->len bytes. As a result UDP-Lite will be able to use hardware checksum offload for packets which do not use partial coverage checksums. It also means that UDP-Lite loopback no longer does unnecessary checksum verification. If any NICs start support UDP-Lite this would also start working automatically. This patch removes the assumption that msg_flags has MSG_TRUNC clear upon entry in recvmsg. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:51 -07:00
Herbert Xu	1ab6eb62b0	[UDP6]: Restore sk_filter optimisation This reverts the changeset [IPV6]: UDPv6 checksum. We always need to check UDPv6 checksum because it is mandatory. The sk_filter optimisation has nothing to do whether we verify the checksum. It simply postpones it to the point when the user calls recv or poll. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:50 -07:00
YOSHIFUJI Hideaki	ca04356939	[IPV6] ADDRCONF: Fix possible inet6_ifaddr leakage with CONFIG_OPTIMISTIC_DAD. The inet6_ifaddr for source address of RS is leaked if the address is not an optimistic address. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:44 -07:00
Neil Horman	95c385b4d5	[IPV6] ADDRCONF: Optimistic Duplicate Address Detection (RFC 4429) Support. Nominally an autoconfigured IPv6 address is added to an interface in the Tentative state (as per RFC 2462). Addresses in this state remain in this state while the Duplicate Address Detection process operates on them to determine their uniqueness on the network. During this period, these tentative addresses may not be used for communication, increasing the time before a node may be able to communicate on a network. Using Optimistic Duplicate Address Detection, autoconfigured addresses may be used immediately for communication on the network, as long as certain rules are followed to avoid conflicts with other nodes during the Duplicate Address Detection process. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:43 -07:00
Yasuyuki Kozakai	502b093569	[IPV6] IP6TUNNEL: Enable to control the handled inner protocol. ip6_tunnel before supporting IPv4/IPv6 tunnel allows only IPPROTO_IPV6 in configurations from userland. This allows userland to set IPPROTO_IPIP and 0(wildcard). ip6_tunnel only handles allowed inner protocols. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:42 -07:00
Yasuyuki Kozakai	3144581cb0	[IPV6] IP6TUNNEL: Rename functions ip6ip6_* to ip6_tnl_*. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:41 -07:00
Yasuyuki Kozakai	c4d3efafcc	[IPV6] IP6TUNNEL: Add support to IPv4 over IPv6 tunnel. Some notes - Protocol number IPPROTO_IPIP is used for IPv4 over IPv6 packets. - If IP6_TNL_F_USE_ORIG_TCLASS is set, TOS in IPv4 header is copied to Traffic Class in outer IPv6 header on xmit. - IP6_TNL_F_USE_ORIG_FLOWLABEL is ignored on xmit of IPv4 packets, because IPv4 header does not have flow label. - Kernel sends ICMP error if IPv4 packet is too big on xmit, even if DF flag is not set. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:40 -07:00
Yasuyuki Kozakai	61ec2aec28	[IPV6] IP6TUNNEL: Split out generic routine in ip6ip6_xmit(). This enables to add IPv4/IPv6 specific handling later, Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:39 -07:00
Yasuyuki Kozakai	8359925be8	[IPV6] IP6TUNNEL: Split out generic routine in ip6ip6_rcv(). This enables to add IPv4/IPv6 specific handling later, Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:38 -07:00
Yasuyuki Kozakai	e490d1d85c	[IPV6] IP6TUNNEL: Split out generic routine in ip6ip6_err(). This enables to add IPv4/IPv6 specific error handling later, Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:37 -07:00
YOSHIFUJI Hideaki	7159039a12	[IPV6]: Decentralize EXPORT_SYMBOLs. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2007-04-25 22:23:36 -07:00
Eric Dumazet	b7aa0bf70c	[NET]: convert network timestamps to ktime_t We currently use a special structure (struct skb_timeval) and plain 'struct timeval' to store packet timestamps in sk_buffs and struct sock. This has some drawbacks : - Fixed resolution of micro second. - Waste of space on 64bit platforms where sizeof(struct timeval)=16 I suggest using ktime_t that is a nice abstraction of high resolution time services, currently capable of nanosecond resolution. As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock' permits a 8 byte shrink of this structure on 64bit architectures. Some other structures also benefit from this size reduction (struct ipq in ipv4/ip_fragment.c, struct frag_queue in ipv6/reassembly.c, ...) Once this ktime infrastructure adopted, we can more easily provide nanosecond resolution on top of it. (ioctl SIOCGSTAMPNS and/or SO_TIMESTAMPNS/SCM_TIMESTAMPNS) Note : this patch includes a bug correction in compat_sock_get_timestamp() where a "err = 0;" was missing (so this syscall returned -ENOENT instead of 0) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> CC: Stephen Hemminger <shemminger@linux-foundation.org> CC: John find <linux.kernel@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:34 -07:00
James Morris	9d729f72dc	[NET]: Convert xtime.tv_sec to get_seconds() Where appropriate, convert references to xtime.tv_sec to the get_seconds() helper function. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-25 22:23:32 -07:00
YOSHIFUJI Hideaki	a23cf14b16	IPv6: fix Routing Header Type 0 handling thinko Oops, thinko. The test for accempting a RH0 was exatly the wrong way around. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-04-24 19:26:06 -07:00
YOSHIFUJI Hideaki	0bcbc92629	[IPV6]: Disallow RH0 by default. A security issue is emerging. Disallow Routing Header Type 0 by default as we have been doing for IPv4. Note: We allow RH2 by default because it is harmless. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-24 14:58:30 -07:00
YOSHIFUJI Hideaki	612f09e849	[IPV6] SNMP: Fix {In,Out}NoRoutes statistics. A packet which is being discarded because of no routes in the forwarding path should not be counted as OutNoRoutes but as InNoRoutes. Additionally, on this occasion, a packet whose destinaion is not valid should be counted as InAddrErrors separately. Based on patch from Mitsuru Chinen <mitch@linux.vnet.ibm.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-13 16:18:02 -07:00
David S. Miller	161980f4c6	[IPV6]: Revert recent change to rt6_check_dev(). This reverts `a0d78ebf3a` It causes pings to link-local addresses to fail. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-06 11:42:27 -07:00
Mitsuru Chinen	60e5c16641	[IPv6]: Exclude truncated packets from InHdrErrors statistics Incoming trancated packets are counted as not only InTruncatedPkts but also InHdrErrors. They should be counted as InTruncatedPkts only. Signed-off-by: Mitsuru Chinen <mitch@linux.vnet.ibm.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-04 23:54:59 -07:00
YOSHIFUJI Hideaki	b59e139bbd	[IPv6]: Fix incorrect length check in rawv6_sendmsg() In article <20070329.142644.70222545.davem@davemloft.net> (at Thu, 29 Mar 2007 14:26:44 -0700 (PDT)), David Miller <davem@davemloft.net> says: > From: Sridhar Samudrala <sri@us.ibm.com> > Date: Thu, 29 Mar 2007 14:17:28 -0700 > > > The check for length in rawv6_sendmsg() is incorrect. > > As len is an unsigned int, (len < 0) will never be TRUE. > > I think checking for IPV6_MAXPLEN(65535) is better. > > > > Is it possible to send ipv6 jumbo packets using raw > > sockets? If so, we can remove this check. > > I don't see why such a limitation against jumbo would exist, > does anyone else? > > Thanks for catching this Sridhar. A good compiler should simply > fail to compile "if (x < 0)" when 'x' is an unsigned type, don't > you think :-) Dave, we use "int" for returning value, so we should fix this anyway, IMHO; we should not allow len > INT_MAX. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-02 13:30:54 -07:00
Herbert Xu	53aadcc909	[IPV6]: Set IF_READY if the device is up and has carrier We still need to set the IF_READY flag in ipv6_add_dev for the case where all addresses (including the link-local) are deleted and then recreated. In that case the IPv6 device too will be destroyed and then recreated. In order to prevent the original problem, we simply ensure that the device is up before setting IF_READY. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-27 14:31:52 -07:00
David S. Miller	f11e6659ce	[IPV6]: Fix routing round-robin locking. As per RFC2461, section 6.3.6, item #2, when no routers on the matching list are known to be reachable or probably reachable we do round robin on those available routes so that we make sure to probe as many of them as possible to detect when one becomes reachable faster. Each routing table has a rwlock protecting the tree and the linked list of routes at each leaf. The round robin code executes during lookup and thus with the rwlock taken as a reader. A small local spinlock tries to provide protection but this does not work at all for two reasons: 1) The round-robin list manipulation, as coded, goes like this (with read lock held): walk routes finding head and tail spin_lock(); rotate list using head and tail spin_unlock(); While one thread is rotating the list, another thread can end up with stale values of head and tail and then proceed to corrupt the list when it gets the lock. This ends up causing the OOPS in fib6_add() later onthat many people have been hitting. 2) All the other code paths that run with the rwlock held as a reader do not expect the list to change on them, they expect it to remain completely fixed while they hold the lock in that way. So, simply stated, it is impossible to implement this correctly using a manipulation of the list without violating the rwlock locking semantics. Reimplement using a per-fib6_node round-robin pointer. This way we don't need to manipulate the list at all, and since the round-robin pointer can only ever point to real existing entries we don't need to perform any locking on the changing of the round-robin pointer itself. We only need to reset the round-robin pointer to NULL when the entry it is pointing to is removed. The idea is from Thomas Graf and it is very similar to how this was implemented before the advanced router selection code when in. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-25 18:48:05 -07:00
Thomas Graf	e1701c68c1	[NET]: Fix fib_rules compatibility breakage Based upon a patch from Patrick McHardy. The fib_rules netlink attribute policy introduced in 2.6.19 broke userspace compatibilty. When specifying a rule with "from all" or "to all", iproute adds a zero byte long netlink attribute, but the policy requires all addresses to have a size equal to sizeof(struct in_addr)/sizeof(struct in6_addr), resulting in a validation error. Check attribute length of FRA_SRC/FRA_DST in the generic framework by letting the family specific rules implementation provide the length of an address. Report an error if address length is non zero but no address attribute is provided. Fix actual bug by checking address length for non-zero instead of relying on availability of attribute. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-25 18:48:00 -07:00
Dave Jones	b6f99a2119	[NET]: fix up misplaced inlines. Turning up the warnings on gcc makes it emit warnings about the placement of 'inline' in function declarations. Here's everything that was under net/ Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-22 12:27:49 -07:00
Masayuki Nakagawa	d35690beda	[IPV6]: ipv6_fl_socklist is inadvertently shared. The ipv6_fl_socklist from listening socket is inadvertently shared with new socket created for connection. This leads to a variety of interesting, but fatal, bugs. For example, removing one of the sockets may lead to the other socket's encountering a page fault when the now freed list is referenced. The fix is to not share the flow label list with the new socket. Signed-off-by: Masayuki Nakagawa <nakagawa.msy@ncos.nec.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-16 16:14:03 -07:00
Chris Wright	d2b02ed948	[IPV6] fix ipv6_getsockopt_sticky copy_to_user leak User supplied len < 0 can cause leak of kernel memory. Use unsigned compare instead. Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-09 16:19:17 -08:00
Olaf Kirch	dfee0a725b	[IPV6]: Fix for ipv6_setsockopt NULL dereference I came across this bug in http://bugzilla.kernel.org/show_bug.cgi?id=8155 Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-09 13:55:38 -08:00
Herbert Xu	c7ababbdc6	[IPV6]: Do not set IF_READY if device is down Now that we add the IPv6 device at registration time we don't need to set IF_READY in ipv6_add_dev anymore because we will always get a NETDEV_UP event later on should the device ever become ready. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-07 16:08:12 -08:00
David S. Miller	286930797d	[IPV6]: Handle np->opt being NULL in ipv6_getsockopt_sticky(). Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-07 16:08:05 -08:00
Patrick McHardy	dd63006b8f	[NETFILTER]: nf_conntrack_ipv6: fix incorrect classification of IPv6 fragments as ESTABLISHED The individual fragments of a packet reassembled by conntrack have the conntrack reference from the reassembled packet attached, but nfctinfo is not copied. This leaves it initialized to 0, which unfortunately is the value of IP_CT_ESTABLISHED. The result is that all IPv6 fragments are tracked as ESTABLISHED, allowing them to bypass a usual ruleset which accepts ESTABLISHED packets early. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-07 16:08:01 -08:00
Yasuyuki Kozakai	bc5f774347	[NETFILTER]: ip6_route_me_harder should take into account mark Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-05 13:25:27 -08:00
Patrick McHardy	e281db5cdf	[NETFILTER]: nf_conntrack/nf_nat: fix incorrect config ifdefs The nf_conntrack_netlink config option is named CONFIG_NF_CT_NETLINK, but multiple files use CONFIG_IP_NF_CONNTRACK_NETLINK or CONFIG_NF_CONNTRACK_NETLINK for ifdefs. Fix this and reformat all CONFIG_NF_CT_NETLINK ifdefs to only use a line. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-03-05 13:25:19 -08:00
David Stevens	aa6e4a96e7	[IPV6]: /proc/net/anycast6 unbalanced inet6_dev refcnt Reading /proc/net/anycast6 when there is no anycast address on an interface results in an ever-increasing inet6_dev reference count, as well as a reference to the netdevice you can't get rid of. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-28 09:42:10 -08:00
Michal Wrobel	2c12a74cc4	[IPV6]: anycast refcnt fix This patch fixes a bug in Linux IPv6 stack which caused anycast address to be added to a device prior DAD has been completed. This led to incorrect reference count which resulted in infinite wait for unregister_netdevice completion on interface removal. Signed-off-by: Michal Wrobel <xmxwx@asn.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-28 09:41:58 -08:00
David S. Miller	7401055b58	[IPV6]: Fix __ipv6_addr_type() export in correct place. It needs to be in net/ipv6/addrconf_core.c Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-26 11:42:57 -08:00
YOSHIFUJI Hideaki	45ba9dd200	[IPV6] ADDRCONF: Register inet6_dev earlier. Allocate inet6_dev earlier to allow users to set up per-interface variables. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2007-02-26 11:42:55 -08:00
YOSHIFUJI Hideaki	46d480468f	[IPV6] ADDRCONF: Manage prefix route corresponding to address manually added. It is more natural to manage prefix routes corresponding to address which is being added manually. With help from Masafumi Aramoto <aramoto@linux-ipv6.org>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2007-02-26 11:42:54 -08:00
Yasuyuki Kozakai	268920584b	[IPV6] IP6TUNNEL: Use update_pmtu() of dst on xmit. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2007-02-26 11:42:53 -08:00
YOSHIFUJI Hideaki	8c14b7ce22	[IPV6] ADDRCONF: Statically link __ipv6_addr_type() for sunrpc subsystem. Link __ipv6_addr_type() statically for sunrpc code even if IPv6 is built as module. Signed-off-by: YOSHIFUJI Hidaki <yoshfuji@linux-ipv6.org>	2007-02-26 11:42:52 -08:00
Joe Jin	ca17c23345	[IPV6]: Adjust inet6_exit() cleanup sequence against inet6_init() This patch for adjust inet6_exit() to inverse sequence to inet6_init(). At ipv6_init, it first create proc_root/net/dev_snmp6 entry by call ipv6_misc_proc_init(), then call addrconf_init() to create the corresponding device entry at this directory, but at inet6_exit, ipv6_misc_proc_exit() called first, then call addrconf_init(). Signed-off-by: Joe Jin <joe.jin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-26 11:42:44 -08:00
Noriaki TAKAMIYA	d3f23dfe8b	[IPSEC]: More fix is needed for __xfrm6_bundle_create(). Fixed to set fl_tunnel.fl6_src correctly in xfrm6_bundle_create(). Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Acked-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-26 11:42:43 -08:00
Eric W. Biederman	3fbfa98112	[PATCH] sysctl: remove the proc_dir_entry member for the sysctl tables It isn't needed anymore, all of the users are gone, and all of the ctl_table initializers have been converted to use explicit names of the fields they are initializing. [akpm@osdl.org: NTFS fix] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:10:00 -08:00
Eric W. Biederman	0b4d414714	[PATCH] sysctl: remove insert_at_head from register_sysctl The semantic effect of insert_at_head is that it would allow new registered sysctl entries to override existing sysctl entries of the same name. Which is pain for caching and the proc interface never implemented. I have done an audit and discovered that none of the current users of register_sysctl care as (excpet for directories) they do not register duplicate sysctl entries. So this patch simply removes the support for overriding existing entries in the sys_sysctl interface since no one uses it or cares and it makes future enhancments harder. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Andi Kleen <ak@muc.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Corey Minyard <minyard@acm.org> Cc: Neil Brown <neilb@suse.de> Cc: "John W. Linville" <linville@tuxdriver.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Jan Kara <jack@ucw.cz> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: David Chinner <dgc@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:59 -08:00
Tim Schmielau	cd354f1ae7	[PATCH] remove many unneeded #includes of sched.h After Al Viro (finally) succeeded in removing the sched.h #include in module.h recently, it makes sense again to remove other superfluous sched.h includes. There are quite a lot of files which include it but don't actually need anything defined in there. Presumably these includes were once needed for macros that used to live in sched.h, but moved to other header files in the course of cleaning it up. To ease the pain, this time I did not fiddle with any header files and only removed #includes from .c-files, which tend to cause less trouble. Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha, arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig, allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all configs in arch/arm/configs on arm. I also checked that no new warnings were introduced by the patch (actually, some warnings are removed that were emitted by unnecessarily included header files). Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:54 -08:00
Kazunori MIYAZAWA	73d605d1ab	[IPSEC]: changing API of xfrm6_tunnel_register This patch changes xfrm6_tunnel register and deregister interface to prepare for solving the conflict of device tunnels with inter address family IPsec tunnel. There is no device which conflicts with IPv4 over IPv6 IPsec tunnel. Signed-off-by: Kazunori MIYAZAWA <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-13 12:55:55 -08:00
Kazunori MIYAZAWA	c73cb5a2d6	[IPSEC]: make sit use the xfrm4_tunnel_register This patch makes sit use xfrm4_tunnel_register instead of inet_add_protocol. It solves conflict of sit device with inter address family IPsec tunnel. Signed-off-by: Kazunori MIYAZAWA <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-13 12:55:25 -08:00
YOSHIFUJI Hideaki	6e1d9d04c4	[IPV6] HASHTABLES: Use appropriate seed for caluculating ehash index. Tetsuo Handa <handat@pm.nttdata.co.jp> told me that connect(2) with TCPv6 socket almost always took a few minutes to return when we did not have any ports available in the range of net.ipv4.ip_local_port_range. The reason was that we used incorrect seed for calculating index of hash when we check established sockets in __inet6_check_established(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 20:26:39 -08:00
Masahide NAKAMURA	138939e066	[NETFILTER]: ip6t_mh: drop piggyback payload packet on MH packets Regarding RFC3775, MH payload proto field should be IPPROTO_NONE. Otherwise it must be discarded (and the receiver should send ICMP error). We assume filter should drop such piggyback everytime to disallow slipping through firewall rules, even the final receiver will discard it. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 11:16:17 -08:00
Patrick McHardy	a3c941b08d	[NETFILTER]: Kconfig: improve dependency handling Instead of depending on internally needed options and letting users figure out what is needed, select them when needed: - IP_NF_IPTABLES, IP_NF_ARPTABLES and IP6_NF_IPTABLES select NETFILTER_XTABLES - NETFILTER_XT_TARGET_CONNMARK, NETFILTER_XT_MATCH_CONNMARK and IP_NF_TARGET_CLUSTERIP select NF_CONNTRACK_MARK - NETFILTER_XT_MATCH_CONNBYTES selects NF_CT_ACCT Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 11:15:02 -08:00
Patrick McHardy	c0e912d7ed	[NETFILTER]: nf_conntrack: fix invalid conntrack statistics RCU assumption NF_CT_STAT_INC assumes rcu_read_lock in nf_hook_slow disables preemption as well, making it legal to use __get_cpu_var without disabling preemption manually. The assumption is not correct anymore with preemptable RCU, additionally we need to protect against softirqs when not holding nf_conntrack_lock. Add NF_CT_STAT_INC_ATOMIC macro, which disables local softirqs, and use where necessary. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 11:13:43 -08:00
Patrick McHardy	923f4902fe	[NETFILTER]: nf_conntrack: properly use RCU API for nf_ct_protos/nf_ct_l3protos arrays Replace preempt_{enable,disable} based RCU by proper use of the RCU API and add missing rcu_read_lock/rcu_read_unlock calls in all paths not obviously only used within packet process context (nfnetlink_conntrack). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 11:12:57 -08:00
Patrick McHardy	e92ad99c78	[NETFILTER]: nf_log: minor cleanups - rename nf_logging to nf_loggers since its an array of registered loggers - rename nf_log_unregister_logger() to nf_log_unregister() to make it symetrical to nf_log_register() and convert all users Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-12 11:11:55 -08:00
Arjan van de Ven	9a32144e9d	[PATCH] mark struct file_operations const 7 Many struct file_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:46 -08:00
Linus Torvalds	cb18eccff4	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (45 commits) [IPV4]: Restore multipath routing after rt_next changes. [XFRM] IPV6: Fix outbound RO transformation which is broken by IPsec tunnel patch. [NET]: Reorder fields of struct dst_entry [DECNET]: Convert decnet route to use the new dst_entry 'next' pointer [IPV6]: Convert ipv6 route to use the new dst_entry 'next' pointer [IPV4]: Convert ipv4 route to use the new dst_entry 'next' pointer [NET]: Introduce union in struct dst_entry to hold 'next' pointer [DECNET]: fix misannotation of linkinfo_dn [DECNET]: FRA_{DST,SRC} are le16 for decnet [UDP]: UDP can use sk_hash to speedup lookups [NET]: Fix whitespace errors. [NET] XFRM: Fix whitespace errors. [NET] X25: Fix whitespace errors. [NET] WANROUTER: Fix whitespace errors. [NET] UNIX: Fix whitespace errors. [NET] TIPC: Fix whitespace errors. [NET] SUNRPC: Fix whitespace errors. [NET] SCTP: Fix whitespace errors. [NET] SCHED: Fix whitespace errors. [NET] RXRPC: Fix whitespace errors. ...	2007-02-11 11:38:13 -08:00
Robert P. J. Day	c376222960	[PATCH] Transform kmem_cache_alloc()+memset(0) -> kmem_cache_zalloc(). Replace appropriate pairs of "kmem_cache_alloc()" + "memset(0)" with the corresponding "kmem_cache_zalloc()" call. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: Roland McGrath <roland@redhat.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Greg KH <greg@kroah.com> Acked-by: Joel Becker <Joel.Becker@oracle.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Jan Kara <jack@ucw.cz> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:27 -08:00
Masahide NAKAMURA	bda390d5c8	[XFRM] IPV6: Fix outbound RO transformation which is broken by IPsec tunnel patch. It seems to miss RO mode path by IPv6 over IPv4 IPsec tunnel patch when it changed semantics to check the mode from "xfrm[i]->props.mode != XFRM_MODE_TRANSPORT" to "xfrm[i]->props.mode == XFRM_MODE_TUNNEL" before changing address. It also makes two incline functions __xfrm6_bundle_addr_{remote,local} are used by nobody. This patch fixes it. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-10 23:20:47 -08:00
Eric Dumazet	7cc482634f	[IPV6]: Convert ipv6 route to use the new dst_entry 'next' pointer This patch removes the next pointer from 'struct rt6_info.u' union, and renames u.next to u.dst.rt6_next. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-10 23:20:40 -08:00
Eric Dumazet	95f30b336b	[UDP]: UDP can use sk_hash to speedup lookups In a prior patch, I introduced a sk_hash field (__sk_common.skc_hash) to let tcp lookups use one cache line per unmatched entry instead of two. We can also use sk_hash to speedup UDP part as well. We store in sk_hash the hnum value, and use sk->sk_hash (same cache line than 'next' pointer), instead of inet->num (different cache line) Note : We still have a false sharing problem for SMP machines, because sock_hold(sock) dirties the cache line containing the 'next' pointer. Not counting the udp_hash_lock rwlock. (did someone mentioned RCU ? :) ) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-10 23:20:29 -08:00
YOSHIFUJI Hideaki	1ab1457c42	[NET] IPV6: Fix whitespace errors. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-10 23:19:42 -08:00
Eric Dumazet	dbca9b2750	[NET]: change layout of ehash table ehash table layout is currently this one : First half of this table is used by sockets not in TIME_WAIT state Second half of it is used by sockets in TIME_WAIT state. This is non optimal because of for a given hash or socket, the two chain heads are located in separate cache lines. Moreover the locks of the second half are never used. If instead of this halving, we use two list heads in inet_ehash_bucket instead of only one, we probably can avoid one cache miss, and reduce ram usage, particularly if sizeof(rwlock_t) is big (various CONFIG_DEBUG_SPINLOCK, CONFIG_DEBUG_LOCK_ALLOC settings). So we still halves the table but we keep together related chains to speedup lookups and socket state change. In this patch I did not try to align struct inet_ehash_bucket, but a future patch could try to make this structure have a convenient size (a power of two or a multiple of L1_CACHE_SIZE). I guess rwlock will just vanish as soon as RCU is plugged into ehash :) , so maybe we dont need to scratch our heads to align the bucket... Note : In case struct inet_ehash_bucket is not a power of two, we could probably change alloc_large_system_hash() (in case it use __get_free_pages()) to free the unused space. It currently allocates a big zone, but the last quarter of it could be freed. Again, this should be a temporary 'problem'. Patch tested on ipv4 tcp only, but should be OK for IPV6 and DCCP. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 14:16:46 -08:00
Patrick McHardy	9934e81c8c	[NETFILTER]: ip6_tables: remove redundant structure definitions Move ip6t_standard/ip6t_error_target/ip6t_error definitions to ip6_tables.h instead of defining them in each table individually. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:23 -08:00
Masahide NAKAMURA	a0ca215a73	[NETFILTER]: ip6_tables: support MH match This introduces match for Mobility Header (MH) described by Mobile IPv6 specification (RFC3775). User can specify the MH type or its range to be matched. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: Yasuyuki Kozakai <kozakai@linux-ipv6.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:21 -08:00
Jan Engelhardt	e60a13e030	[NETFILTER]: {ip,ip6}_tables: use struct xt_table instead of redefined structure names Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:20 -08:00
Jan Engelhardt	6709dbbb19	[NETFILTER]: {ip,ip6}_tables: remove x_tables wrapper functions Use the x_tables functions directly to make it better visible which parts are shared between ip_tables and ip6_tables. Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:19 -08:00
Jan Engelhardt	e1fd0586b0	[NETFILTER]: x_tables: fix return values for LOG/ULOG Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:18 -08:00
Jan Engelhardt	2822b0d926	[NETFILTER]: Remove useless comparisons before assignments Remove unnecessary if() constructs before assignment. Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:11 -08:00
Stephen Hemminger	22f8cde5bc	[NET]: unregister_netdevice as void There was no real useful information from the unregister_netdevice() return code, the only error occurred in a situation that was a driver bug. So change it to a void function. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:06 -08:00
Masahide NAKAMURA	f48d5ff1e4	[IPV6] RAW: Add checksum default defines for MH. Add checksum default defines for mobility header(MH) which goes through raw socket. As the result kernel's behavior is to handle MH checksum as default. This patch also removes verifying inbound MH checksum at mip6_mh_filter() since it did not consider user specified checksum offset and was redundant check with raw socket code. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:05 -08:00
Alexey Dobriyan	cc63f70b8b	[IPV4/IPV6] multicast: Check add_grhead() return value add_grhead() allocates memory with GFP_ATOMIC and in at least two places skb from it passed to skb_put() without checking. Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:04 -08:00
Miika Komu	4337226228	[IPSEC]: IPv4 over IPv6 IPsec tunnel This is the patch to support IPv4 over IPv6 IPsec. Signed-off-by: Miika Komu <miika@iki.fi> Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi> Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:02 -08:00
Miika Komu	c82f963efe	[IPSEC]: IPv6 over IPv4 IPsec tunnel This is the patch to support IPv6 over IPv4 IPsec Signed-off-by: Miika Komu <miika@iki.fi> Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi> Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:01 -08:00
Miika Komu	cdca72652a	[IPSEC]: exporting xfrm_state_afinfo This patch exports xfrm_state_afinfo. Signed-off-by: Miika Komu <miika@iki.fi> Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi> Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:39:00 -08:00
David S. Miller	8eb9086f21	[IPV4/IPV6]: Always wait for IPSEC SA resolution in socket contexts. Do this even for non-blocking sockets. This avoids the silly -EAGAIN that applications can see now, even for non-blocking sockets in some cases (f.e. connect()). With help from Venkat Tekkirala. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:38:45 -08:00
YOSHIFUJI Hideaki	a0d78ebf3a	[IPV6] ROUTE: Do not route packets to link-local address on other device. With help from Wei Dong <weid@np.css.fujitsu.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:38:42 -08:00
Patrick McHardy	26932566a4	[NETLINK]: Don't BUG on undersized allocations Currently netlink users BUG when the allocated skb for an event notification is undersized. While this is certainly a kernel bug, its not critical and crashing the kernel is too drastic, especially when considering that these errors have appeared multiple times in the past and it BUGs even if no listeners are present. This patch replaces BUG by WARN_ON and changes the notification functions to inform potential listeners of undersized allocations using a unique error code (EMSGSIZE). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-02-08 12:38:41 -08:00
Li Yewang	29556526b9	[IPV6]: fix BUG of ndisc_send_redirect() When I tested IPv6 redirect function about kernel 2.6.19.1, and found that the kernel can send redirect packets whose target address is global address, and the target is not the actual endpoint of communication. But the criteria conform to RFC2461, the target address defines as following: Target Address An IP address that is a better first hop to use for he ICMP Destination Address. When the target is the actual endpoint of communication, i.e., the destination is a neighbor, the Target Address field MUST contain the same value as the ICMP Destination Address field. Otherwise the target is a better first-hop router and the Target Address MUST be the router's link-local address so that hosts can uniquely identify routers. According to this definition, when a router redirect to a host, the target address either the better first-hop router's link-local address or the same as the ICMP destination address field. But the function of ndisc_send_redirect() in net/ipv6/ndisc.c, does not check the target address correctly. There is another definition about receive Redirect message in RFC2461: 8.1. Validation of Redirect Messages A host MUST silently discard any received Redirect message that does not satisfy all of the following validity checks: ...... - The ICMP Target Address is either a link-local address (when redirected to a router) or the same as the ICMP Destination Address (when redirected to the on-link destination). ...... And the receive redirect function of ndisc_redirect_rcv() implemented this definition, checks the target address correctly. if (ipv6_addr_equal(dest, target)) { on_link = 1; } else if (!(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) { ND_PRINTK2(KERN_WARNING "ICMPv6 Redirect: target address is not link-local.\n"); return; } So, I think the send redirect function must check the target address also. Signed-off-by: Li Yewang <lyw@nanjing-fnst.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-01-30 14:33:20 -08:00
Neil Horman	fa03ef38e1	[IPV6]: Fix up some CONFIG typos Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-01-30 14:30:10 -08:00
David S. Miller	e89862f4c5	[TCP]: Restore SKB socket owner setting in tcp_transmit_skb(). Revert `931731123a` We can't elide the skb_set_owner_w() here because things like certain netfilter targets (such as owner MATCH) need a socket to be set on the SKB for correct operation. Thanks to Jan Engelhardt and other netfilter list members for pointing this out. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-01-26 01:04:55 -08:00
Noriaki TAKAMIYA	6a2b9ce0a3	[IPV6]: Fixed the size of the netlink message notified by inet6_rt_notify(). I think the return value of rt6_nlmsg_size() should includes the amount of RTA_METRICS. Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-01-23 22:09:41 -08:00
YOSHIFUJI Hideaki	d88ae4cc97	[IPV6] MCAST: Fix joining all-node multicast group on device initialization. Join all-node multicast group after assignment of dev->ip6_ptr because it must be assigned when ipv6_dev_mc_inc() is called. This fixes Bug#7817, reported by <gernoth@informatik.uni-erlangen.de>. Closes: 7817 Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-01-23 20:25:40 -08:00
Paul Moore	469de9b90f	[INET]: style updates for the inet_sock->is_icsk assignment fix A quick patch to change the inet_sock->is_icsk assignment to better fit with existing kernel coding style. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-01-09 14:37:06 -08:00
Patrick McHardy	f9f02cca25	[NETFILTER]: nf_conntrack_ipv6: fix crash when handling fragments When IPv6 connection tracking splits up a defragmented packet into its original fragments, the packets are taken from a list and are passed to the network stack with skb->next still set. This causes dev_hard_start_xmit to treat them as GSO fragments, resulting in a use after free when connection tracking handles the next fragment. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-01-09 14:32:41 -08:00
Paul Moore	cbbd7d4f36	[INET]: Fix incorrect "inet_sock->is_icsk" assignment. The inet_create() and inet6_create() functions incorrectly set the inet_sock->is_icsk field. Both functions assume that the is_icsk field is large enough to hold at least a INET_PROTOSW_ICSK value when it is actually only a single bit. This patch corrects the assignment by doing a boolean comparison whose result will safely fit into a single bit field. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-01-09 00:29:51 -08:00
David L Stevens	30c4cf577f	[IPV4/IPV6]: Fix inet{,6} device initialization order. It is important that we only assign dev->ip{,6}_ptr only after all portions of the inet{,6} are setup. Otherwise we can receive packets before the multicast spinlocks et al. are initialized. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-01-04 12:31:14 -08:00
David S. Miller	9b54d5c610	[NETFILTER] IPV6: Fix dependencies. Although the menu dependencies in net/ipv6/netfilter/Kconfig guard the entries in that file from the Kconfig GUI, this does not prevent them from being selected still via "make oldconfig" when IPV6 etc. is disabled. So add explicit dependencies. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-17 21:59:18 -08:00
Kim Nordlund	8bce65b95a	[IPV6]: Make fib6_node subtree depend on IPV6_SUBTREES Make fib6_node 'subtree' depend on IPV6_SUBTREES. Signed-off-by: Kim Nordlund <kim.nordlund@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-13 16:48:31 -08:00
Brian Haley	befffe9016	[IPV6]: Fix IPV6_UNICAST_HOPS getsockopt(). > Relevant standard (RFC 3493) notes: > > The IPV6_UNICAST_HOPS option may be used with getsockopt() to > determine the hop limit value that the system will use for subsequent > unicast packets sent via that socket. > > I don't reckon -1 could be the hop limit value. -1 means un-initialized. > IMHO, the value from > case 1 (if socket is connected to some destination), otherwise case 2 > (if bound to a scope interface) or ultimately the default hop limit > ought to be returned instead, as it will be most often correct, while > the current behavior is always wrong, unless setsockopt() has been used > first. I don't if some people may think doing a route lookup in > getsockopt might be overly expensive, but at least the two other cases > should be ok, particularly the last one. The following patch seems to work for me, but this code has behaved this way for a while, so don't know if it will break any existing apps. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-13 16:48:25 -08:00
Al Viro	e1b4b9f398	[NETFILTER]: {ip,ip6,arp}_tables: fix exponential worst-case search for loops If we come to node we'd already marked as seen and it's not a part of path (i.e. we don't have a loop right there), we already know that it isn't a part of any loop, so we don't need to revisit it. That speeds the things up if some chain is refered to from several places and kills O(exp(table size)) worst-case behaviour (without sleeping, at that, so if you manage to self-LART that way, you are SOL for a long time)... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-13 16:48:23 -08:00
Alexey Dobriyan	1f29bcd739	[PATCH] sysctl: remove unused "context" param Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andi Kleen <ak@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Howells <dhowells@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:41 -08:00
Stephen Hemminger	3644f0cee7	[NET]: Convert hh_lock to seqlock. The hard header cache is in the main output path, so using seqlock instead of reader/writer lock should reduce overhead. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-08 17:19:20 -08:00
Linus Torvalds	2685b267bc	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (48 commits) [NETFILTER]: Fix non-ANSI func. decl. [TG3]: Identify Serdes devices more clearly. [TG3]: Use msleep. [TG3]: Use netif_msg_*. [TG3]: Allow partial speed advertisement. [TG3]: Add TG3_FLG2_IS_NIC flag. [TG3]: Add 5787F device ID. [TG3]: Fix Phy loopback. [WANROUTER]: Kill kmalloc debugging code. [TCP] inet_twdr_hangman: Delete unnecessary memory barrier(). [NET]: Memory barrier cleanups [IPSEC]: Fix inetpeer leak in ipv4 xfrm dst entries. audit: disable ipsec auditing when CONFIG_AUDITSYSCALL=n audit: Add auditing to ipsec [IRDA] irlan: Fix compile warning when CONFIG_PROC_FS=n [IrDA]: Incorrect TTP header reservation [IrDA]: PXA FIR code device model conversion [GENETLINK]: Fix misplaced command flags. [NETLIK]: Add a pointer to the Generic Netlink wiki page. [IPV6] RAW: Don't release unlocked sock. ...	2006-12-07 09:05:15 -08:00
Christoph Lameter	e18b890bb0	[PATCH] slab: remove kmem_cache_t Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script: #!/bin/sh # # Replace one string by another in all the kernel sources. # set -e for file in `find * -name ".c" -o -name ".h"\|xargs grep -l $1`; do quilt add $file sed -e "1,\$s/$1/$2/g" $file >/tmp/$$ mv /tmp/$$ $file quilt refresh done The script was run like this sh replace kmem_cache_t "struct kmem_cache" Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:25 -08:00
Christoph Lameter	54e6ecb239	[PATCH] slab: remove SLAB_ATOMIC SLAB_ATOMIC is an alias of GFP_ATOMIC Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:24 -08:00
Alan Stern	a120586873	[PATCH] Allow NULL pointers in percpu_free The patch (as824b) makes percpu_free() ignore NULL arguments, as one would expect for a deallocation routine. (Note that free_percpu is #defined as percpu_free in include/linux/percpu.h.) A few callers are updated to remove now-unneeded tests for NULL. A few other callers already seem to assume that passing a NULL pointer to percpu_free() is okay! The patch also removes an unnecessary NULL check in percpu_depopulate(). Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:22 -08:00
Masahide NAKAMURA	4e33fa14fa	[IPV6] RAW: Don't release unlocked sock. When user builds IPv6 header and send it through raw socket, kernel tries to release unlocked sock. (Kernel log shows "BUG: bad unlock balance detected" with enabled debug option.) The lock is held only for non-hdrincl sock in this function then this patch fix to do nothing about lock for hdrincl one. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-06 18:39:09 -08:00
YOSHIFUJI Hideaki	9a217a1c7e	[IPV6]: Repair IPv6 Fragments The commit "[IPV6]: Use kmemdup" (commit-id: `af879cc704`) broke IPv6 fragments. Bug was spotted by Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-06 18:39:08 -08:00
Dmitry Mishin	74c9c0c17d	[NETFILTER]: Fix {ip,ip6,arp}_tables hook validation Commit `590bdf7fd2` introduced a regression in match/target hook validation. mark_source_chains builds a bitmask for each rule representing the hooks it can be reached from, which is then used by the matches and targets to make sure they are only called from valid hooks. The patch moved the match/target specific validation before the mark_source_chains call, at which point the mask is always zero. This patch returns back to the old order and moves the standard checks to mark_source_chains. This allows to get rid of a special case for standard targets as a nice side-effect. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-06 18:39:02 -08:00
Patrick McHardy	a3c479772c	[NETFILTER]: Mark old IPv4-only connection tracking scheduled for removal Also remove the references to "new connection tracking" from Kconfig. After some short stabilization period of the new connection tracking helpers/NAT code the old one will be removed. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 22:11:01 -08:00
Patrick McHardy	bff9a89bca	[NETFILTER]: nf_conntrack: endian annotations Resync with Al Viro's ip_conntrack annotations and fix a missed spot in ip_nat_proto_icmp.c. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 22:05:08 -08:00
Andrew Morton	b6332e6cf9	[TCP]: Fix warnings with TCP_MD5SIG disabled. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:31:52 -08:00
Adrian Bunk	f5b99bcddd	[NET]: Possible cleanups. This patch contains the following possible cleanups: - make the following needlessly global functions statis: - ipv4/tcp.c: __tcp_alloc_md5sig_pool() - ipv4/tcp_ipv4.c: tcp_v4_reqsk_md5_lookup() - ipv4/udplite.c: udplite_rcv() - ipv4/udplite.c: udplite_err() - make the following needlessly global structs static: - ipv4/tcp_ipv4.c: tcp_request_sock_ipv4_ops - ipv4/tcp_ipv4.c: tcp_sock_ipv4_specific - ipv6/tcp_ipv6.c: tcp_request_sock_ipv6_ops - net/ipv{4,6}/udplite.c: remove inline's from static functions (gcc should know best when to inline them) Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:31:51 -08:00
Patrick McHardy	76592584be	[NETFILTER]: Fix PROC_FS=n warnings Fix some unused function/variable warnings. Signed-off-by: Patrick McHardy <kaber@trash.net>	2006-12-02 21:31:34 -08:00
Patrick McHardy	baf7b1e112	[NETFILTER]: x_tables: add NFLOG target Add new NFLOG target to allow use of nfnetlink_log for both IPv4 and IPv6. Currently we have two (unsupported by userspace) hacks in the LOG and ULOG targets to optionally call to the nflog API. They lack a few features, namely the IPv4 and IPv6 LOG targets can not specify a number of arguments related to nfnetlink_log, while the ULOG target is only available for IPv4. Remove those hacks and add a clean way to use nfnetlink_log. Signed-off-by: Patrick McHardy <kaber@trash.net>	2006-12-02 21:31:31 -08:00
Patrick McHardy	933a41e7e1	[NETFILTER]: nf_conntrack: move conntrack protocol sysctls to individual modules Signed-off-by: Patrick McHardy <kaber@trash.net>	2006-12-02 21:31:18 -08:00
Patrick McHardy	f8eb24a89a	[NETFILTER]: nf_conntrack: move extern declaration to header files Using extern in a C file is a bad idea because the compiler can't catch type errors. Signed-off-by: Patrick McHardy <kaber@trash.net>	2006-12-02 21:31:16 -08:00
Martin Josefsson	605dcad6c8	[NETFILTER]: nf_conntrack: rename struct nf_conntrack_protocol Rename 'struct nf_conntrack_protocol' to 'struct nf_conntrack_l4proto' in order to help distinguish it from 'struct nf_conntrack_l3proto'. It gets rather confusing with 'nf_conntrack_protocol'. Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se> Signed-off-by: Patrick McHardy <kaber@trash.net>	2006-12-02 21:31:09 -08:00
Gerrit Renker	4c0a6cb0db	[UDP(-Lite)]: consolidate v4 and v6 get\|setsockopt code This patch consolidates set/getsockopt code between UDP(-Lite) v4 and 6. The justification is that UDP(-Lite) is a transport-layer protocol and therefore the socket option code (at least in theory) should be AF-independent. Furthermore, there is the following code reduplication: * do_udp{,v6}_getsockopt is 100% identical between v4 and v6 * do_udp{,v6}_setsockopt is identical up to the following differerence --v4 in contrast to v4 additionally allows the experimental encapsulation types UDP_ENCAP_ESPINUDP and UDP_ENCAP_ESPINUDP_NON_IKE --the remainder is identical between v4 and v6 I believe that this difference is of little relevance. The advantages in not duplicating twice almost completely identical code. The patch further simplifies the interface of udp{,v6}_push_pending_frames, since for the second argument (struct udp_sock *up) it always holds that up = udp_sk(sk); where sk is the first function argument. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:30:45 -08:00
Thomas Graf	e3703b3de1	[RTNETLINK]: Add rtnl_put_cacheinfo() to unify some code IPv4, IPv6, and DECNet all use struct rta_cacheinfo in a similiar way, therefore rtnl_put_cacheinfo() is added to reuse code. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:30:44 -08:00
Ville Nuorvala	107a5fe619	[IPV6]: Improve IPv6 tunnel error reporting Log an error if the remote tunnel endpoint is unable to handle tunneled packets. Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:30:27 -08:00
Ville Nuorvala	6fb32ddeb2	[IPV6]: Don't allocate memory for Tunnel Encapsulation Limit Option Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:30:26 -08:00
Ville Nuorvala	305d4b3ce8	[IPV6]: Allow link-local tunnel endpoints Allow link-local tunnel endpoints if the underlying link is defined. Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:30:25 -08:00
Ville Nuorvala	09c6bbf090	[IPV6]: Do mandatory IPv6 tunnel endpoint checks in realtime Doing the mandatory tunnel endpoint checks when the tunnel is set up isn't enough as interfaces can go up or down and addresses can be added or deleted after this. The checks need to be done realtime when the tunnel is processing a packet. Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:30:24 -08:00
Ville Nuorvala	567131a722	[IPV6]: Fix SIOCCHGTUNNEL bug in IPv6 tunnels A logic bug in tunnel lookup could result in duplicate tunnels when changing an existing device. Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:30:24 -08:00
Al Viro	ff1dcadb1b	[NET]: Split skb->csum ... into anonymous union of __wsum and __u32 (csum and csum_offset resp.) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:27:18 -08:00
Al Viro	8e5200f540	[NET]: Fix assorted misannotations (from md5 and udplite merges). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:27:16 -08:00
Adrian Bunk	89c8945815	[IPV6] net/ipv6/sit.c: make 2 functions static This patch makes two needlessly global functions static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:26:15 -08:00
Arnaldo Carvalho de Melo	af879cc704	[IPV6]: Use kmemdup Code diff stats: [acme@newtoy net-2.6.20]$ codiff /tmp/ipv6.ko.before /tmp/ipv6.ko.after /pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv6/ip6_output.c: ip6_output \| -52 ip6_append_data \| +2 2 functions changed, 2 bytes added, 52 bytes removed /pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv6/addrconf.c: addrconf_sysctl_register \| -27 1 function changed, 27 bytes removed /pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv6/tcp_ipv6.c: tcp_v6_syn_recv_sock \| -32 tcp_v6_parse_md5_keys \| -24 2 functions changed, 56 bytes removed /tmp/ipv6.ko.after: 5 functions changed, 2 bytes added, 135 bytes removed [acme@newtoy net-2.6.20]$ Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:23:58 -08:00
David S. Miller	7d9e9b3df4	[IPV6]: udp.c build fix Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:23:46 -08:00
Al Viro	f6ab028804	[NET]: Make mangling a checksum (0 -> 0xffff on the wire) explicit. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:23:39 -08:00
Al Viro	b51655b958	[NET]: Annotate __skb_checksum_complete() and friends. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:23:38 -08:00
Al Viro	5f92a7388a	[NET]: Annotate callers of the reset of checksum.h stuff. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:23:34 -08:00
Al Viro	868c86bcb5	[NET]: annotate csum_ipv6_magic() callers in net/* Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:23:31 -08:00
Al Viro	e69a4adc66	[IPV6]: Misc endianness annotations. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:22:52 -08:00
Gerrit Renker	ba4e58eca8	[NET]: Supporting UDP-Lite (RFC 3828) in Linux This is a revision of the previously submitted patch, which alters the way files are organized and compiled in the following manner: * UDP and UDP-Lite now use separate object files * source file dependencies resolved via header files net/ipv{4,6}/udp_impl.h * order of inclusion files in udp.c/udplite.c adapted accordingly [NET/IPv4]: Support for the UDP-Lite protocol (RFC 3828) This patch adds support for UDP-Lite to the IPv4 stack, provided as an extension to the existing UDPv4 code: * generic routines are all located in net/ipv4/udp.c * UDP-Lite specific routines are in net/ipv4/udplite.c * MIB/statistics support in /proc/net/snmp and /proc/net/udplite * shared API with extensions for partial checksum coverage [NET/IPv6]: Extension for UDP-Lite over IPv6 It extends the existing UDPv6 code base with support for UDP-Lite in the same manner as per UDPv4. In particular, * UDPv6 generic and shared code is in net/ipv6/udp.c * UDP-Litev6 specific extensions are in net/ipv6/udplite.c * MIB/statistics support in /proc/net/snmp6 and /proc/net/udplite6 * support for IPV6_ADDRFORM * aligned the coding style of protocol initialisation with af_inet6.c * made the error handling in udpv6_queue_rcv_skb consistent; to return `-1' on error on all error cases * consolidation of shared code [NET]: UDP-Lite Documentation and basic XFRM/Netfilter support The UDP-Lite patch further provides * API documentation for UDP-Lite * basic xfrm support * basic netfilter support for IPv4 and IPv6 (LOG target) Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:22:46 -08:00
Thomas Graf	6051e2f4fb	[IPv6] prefix: Convert RTM_NEWPREFIX notifications to use the new netlink api RTM_GETPREFIX is completely unused and is thus removed. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:22:45 -08:00
Thomas Graf	04561c1fe7	[IPv6] iflink: Convert IPv6's RTM_GETLINK to use the new netlink api By replacing the current method of exporting the device configuration which included allocating a temporary buffer, copying ipv6_devconf into it and copying that buffer into the message with a method that uses nla_reserve() allowing to copy the device configuration directly into the skb data buffer, a GFP_ATOMIC allocation could be removed. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:22:44 -08:00
David S. Miller	a928630a2f	[TCP]: Fix some warning when MD5 is disabled. Just some mis-placed ifdefs: net/ipv4/tcp_minisocks.c: In function ‘tcp_twsk_destructor’: net/ipv4/tcp_minisocks.c:364: warning: unused variable ‘twsk’ net/ipv6/tcp_ipv6.c:1846: warning: ‘tcp_sock_ipv6_specific’ defined but not used net/ipv6/tcp_ipv6.c:1877: warning: ‘tcp_sock_ipv6_mapped_specific’ defined but not used Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:22:43 -08:00
YOSHIFUJI Hideaki	cfb6eeb4c8	[TCP]: MD5 Signature Option (RFC2385) support. Based on implementation by Rick Payne. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:22:39 -08:00
Gerrit Renker	b9df3cb8cf	[TCP/DCCP]: Introduce net_xmit_eval Throughout the TCP/DCCP (and tunnelling) code, it often happens that the return code of a transmit function needs to be tested against NET_XMIT_CN which is a value that does not indicate a strict error condition. This patch uses a macro for these recurring situations which is consistent with the already existing macro net_xmit_errno, saving on duplicated code. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2006-12-02 21:22:27 -08:00
Brian Haley	d3a1be9cba	[IPv6]: Only modify checksum for UDP Only change upper-layer checksum from 0 to 0xFFFF for UDP (as RFC 768 states), not for others as RFC 4443 doesn't require it. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:22:13 -08:00
Thomas Graf	f465e489c4	[IPv6] rules: Remove bogus tos validation check Noticed by Al Viro: (frh->tos & ~IPV6_FLOWINFO_MASK)) where IPV6_FLOWINFO_MASK is htonl(0xfffffff) and frh->tos is u8, which makes no sense here... Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:22:12 -08:00
Thomas Graf	339bf98ffc	[NETLINK]: Do precise netlink message allocations where possible Account for the netlink message header size directly in nlmsg_new() instead of relying on the caller calculate it correctly. Replaces error handling of message construction functions when constructing notifications with bug traps since a failure implies a bug in calculating the size of the skb. Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:22:11 -08:00
Gerrit Renker	a94f723d59	[TCP]: Remove dead code in init_sequence This removes two redundancies: 1) The test (skb->protocol == htons(ETH_P_IPV6) in tcp_v6_init_sequence() is always true, due to * tcp_v6_conn_request() is the only function calling this one * tcp_v6_conn_request() redirects all skb's with ETH_P_IP protocol to tcp_v4_conn_request() [ cf. top of tcp_v6_conn_request()] 2) The first argument, `struct sock *sk' of tcp_v{4,6}_init_sequence() is never used. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:22:10 -08:00
YOSHIFUJI Hideaki	a11d206d0f	[IPV6]: Per-interface statistics support. For IP MIB (RFC4293). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-12-02 21:22:08 -08:00
YOSHIFUJI Hideaki	7a3025b1b3	[IPV6]: Introduce ip6_dst_idev() to get inet6_dev{} stored in dst_entry{}. Otherwise, we will see a lot of casts... Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-12-02 21:22:07 -08:00
YOSHIFUJI Hideaki	40aa7b90a9	[IPV6] ROUTE: Use &rt->u.dst instead of cast. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-12-02 21:22:06 -08:00
YOSHIFUJI Hideaki	33e93c9699	[IPV6] ROUTE: Use macros to format /proc/net/ipv6_route. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-12-02 21:22:05 -08:00
David S. Miller	931731123a	[TCP]: Don't set SKB owner in tcp_transmit_skb(). The data itself is already charged to the SKB, doing the skb_set_owner_w() just generates a lot of noise and extra atomics we don't really need. Lmbench improvements on lat_tcp are minimal: before: TCP latency using localhost: 23.2701 microseconds TCP latency using localhost: 23.1994 microseconds TCP latency using localhost: 23.2257 microseconds after: TCP latency using localhost: 22.8380 microseconds TCP latency using localhost: 22.9465 microseconds TCP latency using localhost: 22.8462 microseconds Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:52 -08:00
David S. Miller	9ec75fe85c	[IPV6] tcp: Fix typo _read_mostly --> __read_mostly. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:46 -08:00
Eric Dumazet	72a3effaf6	[NET]: Size listen hash tables using backlog hint We currently allocate a fixed size (TCP_SYNQ_HSIZE=512) slots hash table for each LISTEN socket, regardless of various parameters (listen backlog for example) On x86_64, this means order-1 allocations (might fail), even for 'small' sockets, expecting few connections. On the contrary, a huge server wanting a backlog of 50000 is slowed down a bit because of this fixed limit. This patch makes the sizing of listen hash table a dynamic parameter, depending of : - net.core.somaxconn tunable (default is 128) - net.ipv4.tcp_max_syn_backlog tunable (default : 256, 1024 or 128) - backlog value given by user application (2nd parameter of listen()) For large allocations (bigger than PAGE_SIZE), we use vmalloc() instead of kmalloc(). We still limit memory allocation with the two existing tunables (somaxconn & tcp_max_syn_backlog). So for standard setups, this patch actually reduce RAM usage. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:44 -08:00
Thomas Graf	1f6c9557e8	[NET] rules: Share common attribute validation policy Move the attribute policy for the non-specific attributes into net/fib_rules.h and include it in the respective protocols. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:41 -08:00
Thomas Graf	b8964ed9fa	[NET] rules: Protocol independant mark selector Move mark selector currently implemented per protocol into the protocol independant part. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:41 -08:00
Thomas Graf	47dcf0cb10	[NET]: Rethink mark field in struct flowi Now that all protocols have been made aware of the mark field it can be moved out of the union thus simplyfing its usage. The config options in the IPv4/IPv6/DECnet subsystems to enable respectively disable mark based routing only obfuscate the code with ifdefs, the cost for the additional comparison in the flow key is insignificant, and most distributions have all these options enabled by default anyway. Therefore it makes sense to remove the config options and enable mark based routing by default. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:39 -08:00
Thomas Graf	82e91ffef6	[NET]: Turn nfmark into generic mark nfmark is being used in various subsystems and has become the defacto mark field for all kinds of packets. Therefore it makes sense to rename it to `mark' and remove the dependency on CONFIG_NETFILTER. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:38 -08:00
Al Viro	ae08e1f092	[IPV6]: ip6_output annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:26 -08:00
Al Viro	fede70b986	[IPV6]: annotate inet6_csk_search_req() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:22 -08:00
Al Viro	90bcaf7b4a	[IPV6]: flowlabels are net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:21 -08:00
Al Viro	8a74ff7770	[IPV6]: annotate ipv6 mcast Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:15 -08:00
Al Viro	04ce69093f	[IPV6]: 'info' argument of ipv6 ->err_handler() is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:12 -08:00
Al Viro	8c689a6eae	[XFRM]: misc annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:11 -08:00
Al Viro	d2ecd9ccd0	[IPV6]: annotate inet6_hashtables Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:21:10 -08:00
David S. Miller	d54a81d341	[IPV6] NDISC: Calculate packet length correctly for allocation. MAX_HEADER does not include the ipv6 header length in it, so we need to add it in explicitly. With help from YOSHIFUJI Hideaki. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-12-02 21:06:31 -08:00
YOSHIFUJI Hideaki	f2776ff047	[IPV6]: Fix address/interface handling in UDP and DCCP, according to the scoping architecture. TCP and RAW do not have this issue. Closes Bug #7432. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-11-21 17:41:56 -08:00
Yasuyuki Kozakai	53ab61c6d8	[IPV6] IP6TUNNEL: Add missing nf_reset() on input path. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-11-21 16:16:27 -08:00
Yasuyuki Kozakai	b3fdd9f115	[IPV6] IP6TUNNEL: Delete all tunnel device when unloading module. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-11-21 16:16:26 -08:00
YOSHIFUJI Hideaki	ea659e0775	[IPV6] ROUTE: Do not enable router reachability probing in router mode. RFC4191 explicitly states that the procedures are applicable to hosts only. We should not have changed behavior of routers. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-11-21 16:16:25 -08:00
YOSHIFUJI Hideaki	557e92efd4	[IPV6] ROUTE: Prefer reachable nexthop only if the caller requests. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-11-21 16:16:24 -08:00
YOSHIFUJI Hideaki	ea73ee23c4	[IPV6] ROUTE: Try to use router which is not known unreachable. Only routers in "FAILED" state should be considered unreachable. Otherwise, we do not try to use speicific routes unless all least specific routers are considered unreachable. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-11-21 16:16:23 -08:00
Patrick McHardy	337dde798d	[NETFILTER]: ip6_tables: use correct nexthdr value in ipv6_find_hdr() nexthdr is NEXTHDR_FRAGMENT, the nexthdr value from the fragment header is hp->nexthdr. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-11-15 21:18:50 -08:00
Patrick McHardy	d8a585d78e	[NETFILTER]: Use pskb_trim in {ip,ip6,nfnetlink}_queue Based on patch by James D. Nurmi: I've got some code very dependant on nfnetlink_queue, and turned up a large number of warns coming from skb_trim. While it's quite possibly my code, having not seen it on older kernels made me a bit suspect. Anyhow, based on some googling I turned up this thread: http://lkml.org/lkml/2006/8/13/56 And believe the issue to be related, so attached is a small patch to the kernel -- not sure if this is completely correct, but for anyone else hitting the WARN_ON(1) in skbuff.h, it might be helpful.. Signed-off-by: James D. Nurmi <jdnurmi@gmail.com> Ported to ip6_queue and nfnetlink_queue and added return value checks. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-11-15 21:18:48 -08:00
Patrick McHardy	daccff024f	[IPV6]: Give sit driver an appropriate module alias. It would be nice to keep things working even with this built as a module, it took me some time to realize my IPv6 tunnel was broken because of the missing sit module. This module alias fixes things until distributions have added an appropriate alias to modprobe.conf. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-11-05 15:47:04 -08:00
Dmitry Mishin	36f73d0c3b	[IPV6]: Add ndisc_netdev_notifier unregister. If inet6_init() fails later than ndisc_init() call, or IPv6 module is unloaded, ndisc_netdev_notifier call remains in the list and will follows in oops later. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-11-05 14:11:33 -08:00
Al Viro	5b1225454f	[IPV6]: File the fingerprints off ah6->spi/esp6->spi In theory these are opaque 32bit values. However, we end up allocating them sequentially in host-endian and stick unchanged on the wire. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-11-01 15:42:35 -08:00
James Morris	1b7c2dbc07	[IPV6]: fix flowlabel seqfile handling There's a bug in the seqfile show operation for flowlabel objects, where each hash chain is traversed cumulatively for each element. The following function is called for each element of each chain: static void ip6fl_fl_seq_show(struct seq_file seq, struct ip6_flowlabel fl) { while(fl) { seq_printf... fl = fl->next; } } Thus, objects can appear mutliple times when reading /proc/net/ip6_flowlabel, as the above is called for each element in the chain. The solution is to remove the while() loop from the above, and traverse each chain exactly once, per the patch below. This also removes the ip6fl_fl_seq_show() function, which does nothing else. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-31 00:43:44 -08:00
James Morris	c6817e4c32	[IPV6]: return EINVAL for invalid address with flowlabel lease request Currently, when an application requests a lease for a flowlabel via the IPV6_FLOWLABEL_MGR socket option, no error is returned if an invalid type of destination address is supplied as part of the request, leading to a silent failure. This patch ensures that EINVAL is returned to the application in this case. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-30 18:56:06 -08:00
Dmitry Mishin	590bdf7fd2	[NETFILTER]: Missed and reordered checks in {arp,ip,ip6}_tables There is a number of issues in parsing user-provided table in translate_table(). Malicious user with CAP_NET_ADMIN may crash system by passing special-crafted table to the _tables. The first issue is that mark_source_chains() function is called before entry content checks. In case of standard target, mark_source_chains() function uses t->verdict field in order to determine new position. But the check, that this field leads no further, than the table end, is in check_entry(), which is called later, than mark_source_chains(). The second issue, that there is no check that target_offset points inside entry. If so, _ITERATE_MATCH macro will follow further, than the entry ends. As a result, we'll have oops or memory disclosure. And the third issue, that there is no check that the target is completely inside entry. Results are the same, as in previous issue. Signed-off-by: Dmitry Mishin <dim@openvz.org> Acked-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-30 15:24:44 -08:00
Patrick McHardy	844dc7c880	[NETFILTER]: remove masq/NAT from ip6tables Kconfig help Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-30 15:24:43 -08:00
James Morris	bcd620757d	[IPV6]: fix lockup via /proc/net/ip6_flowlabel There's a bug in the seqfile handling for /proc/net/ip6_flowlabel, where, after finding a flowlabel, the code will loop forever not finding any further flowlabels, first traversing the rest of the hash bucket then just looping. This patch fixes the problem by breaking after the hash bucket has been traversed. Note that this bug can cause lockups and oopses, and is trivially invoked by an unpriveleged user. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-30 15:24:42 -08:00
Heiko Carstens	a27b58fed9	[NET]: fix uaccess handling Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-30 15:24:41 -08:00
Patrick McHardy	6d381634d2	[NETFILTER]: Fix ip6_tables extension header bypass bug As reported by Mark Dowd <Mark_Dowd@McAfee.com>, ip6_tables is susceptible to a fragmentation attack causing false negatives on extension header matches. When extension headers occur in the non-first fragment after the fragment header (possibly with an incorrect nexthdr value in the fragment header) a rule looking for this extension header will never match. Drop fragments that are at offset 0 and don't contain the final protocol header regardless of the ruleset, since this should not happen normally. Since all extension headers are before the protocol header this makes sure an extension header is either not present or in the first fragment, where we can properly parse it. With help from Yasuyuki KOZAKAI <yasuyuki.kozakai@toshiba.co.jp>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-24 16:15:10 -07:00
Patrick McHardy	51d8b1a652	[NETFILTER]: Fix ip6_tables protocol bypass bug As reported by Mark Dowd <Mark_Dowd@McAfee.com>, ip6_tables is susceptible to a fragmentation attack causing false negatives on protocol matches. When the protocol header doesn't follow the fragment header immediately, the fragment header contains the protocol number of the next extension header. When the extension header and the protocol header are sent in a second fragment a rule like "ip6tables .. -p udp -j DROP" will never match. Drop fragments that are at offset 0 and don't contain the final protocol header regardless of the ruleset, since this should not happen normally. With help from Yasuyuki KOZAKAI <yasuyuki.kozakai@toshiba.co.jp>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-24 16:14:04 -07:00
Thomas Graf	375216ad0c	[IPv6] fib: initialize tb6_lock in common place to give lockdep a key Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-21 20:20:54 -07:00
David S. Miller	6723ab549d	[IPV6]: Fix route.c warnings when multiple tables are disabled. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-18 21:20:57 -07:00
Thomas Graf	9ce8ade015	[IPv6] route: Fix prohibit and blackhole routing decision Lookups resolving to ip6_blk_hole_entry must result in silently discarding the packets whereas an ip6_pkt_prohibit_entry is supposed to cause an ICMPV6_ADM_PROHIBITED message to be sent. Thanks to Kim Nordlund <kim.nordlund@nokia.com> for noticing this bug. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-18 20:46:54 -07:00
Ville Nuorvala	22e1e4d8dc	[IPV6]: Always copy rt->u.dst.error when copying a rt6_info. Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-18 19:55:30 -07:00
Ville Nuorvala	264e91b68a	[IPV6]: Make IPV6_SUBTREES depend on IPV6_MULTIPLE_TABLES. As IPV6_SUBTREES can't work without IPV6_MULTIPLE_TABLES have IPV6_SUBTREES depend on it. Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-18 19:55:29 -07:00
Ville Nuorvala	e0eda7bbaa	[IPV6]: Clean up BACKTRACK(). The fn check is unnecessary as fn can never be NULL in BACKTRACK(). Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-18 19:55:28 -07:00
Ville Nuorvala	4251320fa2	[IPV6]: Make sure error handling is done when calling ip6_route_output(). As ip6_route_output() never returns NULL, error checking must be done by looking at dst->error in stead of comparing dst against NULL. Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-18 19:55:27 -07:00
Jan Dittmer	39c850863d	[IPV6] sit: Add missing MODULE_LICENSE This is missing the MODULE_LICENSE statements and taints the kernel upon loading. License is obvious from the beginning of the file. Signed-off-by: Jan Dittmer <jdi@l4x.org> Signed-off-by: Joerg Roedel <joro-lkml@zlug.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-15 23:14:21 -07:00
YOSHIFUJI Hideaki	f1a95859a8	[IPV6]: Remove bogus WARN_ON in Proxy-NA handling. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-15 23:14:20 -07:00
Thomas Graf	adaa70bbdf	[IPv6] rules: Use RT6_LOOKUP_F_HAS_SADDR and fix source based selectors Fixes rt6_lookup() to provide the source address in the flow and sets RT6_LOOKUP_F_HAS_SADDR whenever it is present in the flow. Avoids unnecessary prefix comparisons by checking for a prefix length first. Fixes the rule logic to not match packets if a source selector has been specified but no source address is available. Thanks to Kim Nordlund <kim.nordlund@nokia.com> for working on this patch with me. Signed-off-by: Thomas Graf <tgraf@suug.ch> Acked-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-15 23:14:19 -07:00
YOSHIFUJI Hideaki	9469c7b4aa	[NET]: Use typesafe inet_twsk() inline function instead of cast. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-11 23:59:58 -07:00
YOSHIFUJI Hideaki	4244f8a9f8	[TCP]: Use TCPOLEN_TSTAMP_ALIGNED macro instead of magic number. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-11 23:59:54 -07:00
Joerg Roedel	0be669bb37	[IPV6]: Seperate sit driver to extra module (addrconf.c changes) This patch contains the changes to net/ipv6/addrconf.c to remove sit specific code if the sit driver is not selected. Signed-off-by: Joerg Roedel <joro-lkml@zlug.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-11 23:59:52 -07:00
Joerg Roedel	989e5b96e1	[IPV6]: Seperate sit driver to extra module This patch removes the driver of the IPv6-in-IPv4 tunnel driver (sit) from the IPv6 module. It adds an option to Kconfig which makes it possible to compile it as a seperate module. Signed-off-by: Joerg Roedel <joro-lkml@zlug.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-11 23:59:50 -07:00
Venkat Yekkirala	5b368e61c2	IPsec: correct semantics for SELinux policy matching Currently when an IPSec policy rule doesn't specify a security context, it is assumed to be "unlabeled" by SELinux, and so the IPSec policy rule fails to match to a flow that it would otherwise match to, unless one has explicitly added an SELinux policy rule allowing the flow to "polmatch" to the "unlabeled" IPSec policy rules. In the absence of such an explicitly added SELinux policy rule, the IPSec policy rule fails to match and so the packet(s) flow in clear text without the otherwise applicable xfrm(s) applied. The above SELinux behavior violates the SELinux security notion of "deny by default" which should actually translate to "encrypt by default" in the above case. This was first reported by Evgeniy Polyakov and the way James Morris was seeing the problem was when connecting via IPsec to a confined service on an SELinux box (vsftpd), which did not have the appropriate SELinux policy permissions to send packets via IPsec. With this patch applied, SELinux "polmatching" of flows Vs. IPSec policy rules will only come into play when there's a explicit context specified for the IPSec policy rule (which also means there's corresponding SELinux policy allowing appropriate domains/flows to polmatch to this context). Secondly, when a security module is loaded (in this case, SELinux), the security_xfrm_policy_lookup() hook can return errors other than access denied, such as -EINVAL. We were not handling that correctly, and in fact inverting the return logic and propagating a false "ok" back up to xfrm_lookup(), which then allowed packets to pass as if they were not associated with an xfrm policy. The solution for this is to first ensure that errno values are correctly propagated all the way back up through the various call chains from security_xfrm_policy_lookup(), and handled correctly. Then, flow_cache_lookup() is modified, so that if the policy resolver fails (typically a permission denied via the security module), the flow cache entry is killed rather than having a null policy assigned (which indicates that the packet can pass freely). This also forces any future lookups for the same flow to consult the security module (e.g. SELinux) for current security policy (rather than, say, caching the error on the flow cache entry). This patch: Fix the selinux side of things. This makes sure SELinux polmatching of flow contexts to IPSec policy rules comes into play only when an explicit context is associated with the IPSec policy rule. Also, this no longer defaults the context of a socket policy to the context of the socket since the "no explicit context" case is now handled properly. Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: James Morris <jmorris@namei.org>	2006-10-11 23:59:37 -07:00
Linus Torvalds	fefd26b3b8	Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/configh * master.kernel.org:/pub/scm/linux/kernel/git/davej/configh: Remove all inclusions of <linux/config.h> Manually resolved trivial path conflicts due to removed files in the sound/oss/ subdirectory.	2006-10-04 09:59:57 -07:00
Dave Jones	038b0a6d8d	Remove all inclusions of <linux/config.h> kbuild explicitly includes this at build time. Signed-off-by: Dave Jones <davej@redhat.com>	2006-10-04 03:38:54 -04:00
Diego Beltrami	0a69452cb4	[XFRM]: BEET mode This patch introduces the BEET mode (Bound End-to-End Tunnel) with as specified by the ietf draft at the following link: http://www.ietf.org/internet-drafts/draft-nikander-esp-beet-mode-06.txt The patch provides only single family support (i.e. inner family = outer family). Signed-off-by: Diego Beltrami <diego.beltrami@gmail.com> Signed-off-by: Miika Komu <miika@iki.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Abhinav Pathak <abhinav.pathak@hiit.fi> Signed-off-by: Jeff Ahrenholz <ahrenholz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-04 00:31:09 -07:00
Herbert Xu	1e0c14f49d	[UDP]: Fix MSG_PROBE crash UDP tracks corking status through the pending variable. The IP layer also tracks it through the socket write queue. It is possible for the two to get out of sync when MSG_PROBE is used. This patch changes UDP to check the write queue to ensure that the two stay in sync. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-04 00:31:00 -07:00
Herbert Xu	132a55f3c5	[UDP6]: Fix flowi clobbering The udp6_sendmsg function uses a shared buffer to store the flow without taking any locks. This leads to races with SMP. This patch moves the flowi object onto the stack. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-10-04 00:30:59 -07:00
Herbert Xu	1a9e9ef684	[IPV6]: Disable SG for GSO unless we have checksum Because the system won't turn off the SG flag for us we need to do this manually on the IPv6 path. Otherwise we will throw IPv6 packets with bad checksums at the hardware. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:45 -07:00
Al Viro	a252cc2371	[XFRM]: xrfm_replay_check() annotations seq argument is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:40 -07:00
Al Viro	6067b2baba	[XFRM]: xfrm_parse_spi() annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:39 -07:00
Al Viro	a94cfd1974	[XFRM]: xfrm_state_lookup() annotations spi argument of xfrm_state_lookup() is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:37 -07:00
Al Viro	8f83f23e6d	[XFRM]: ports in struct xfrm_selector annotated Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:33 -07:00
Al Viro	82103232ed	[IPV4]: inet_rcv_saddr() annotations inet_rcv_saddr() returns net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:28 -07:00
Al Viro	4f765d842f	[IPV4]: INET_MATCH() annotations INET_MATCH() and friends depend on an interesting set of kludges: * there's a pair of adjacent fields in struct inet_sock - __be16 dport followed by __u16 num. We want to search by pair, so we combine the keys into a single 32bit value and compare with 32bit value read from &...->dport. * on 64bit targets we combine comparisons with pair of adjacent __be32 fields in the same way. Make sure that we don't mix those values with anything else and that pairs we form them from have correct types. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:02:25 -07:00
Al Viro	fd68322209	[IPV4]: inet_addr_type() annotations argument and inferred net-endian variables in callers annotated. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:07 -07:00
Fabio Olive Leite	293b9c4251	[IPV6]: bh_lock_sock_nested on tcp_v6_rcv A while ago Ingo patched tcp_v4_rcv on net/ipv4/tcp_ipv4.c to use bh_lock_sock_nested and silence a lock validator warning. This fixed it for IPv4, but recently I saw a report of the same warning on IPv6. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 17:53:54 -07:00
Noriaki TAKAMIYA	3b9f9a1c39	[IPV6] ADDRCONF: Mobile IPv6 Home Address support. IFA_F_HOMEADDRESS is introduced for Mobile IPv6 Home Addresses on Mobile Node. The IFA_F_HOMEADDRESS flag should be set for Mobile IPv6 Home Addresses for 2 purposes. 1) We need to check this on receipt of Type 2 Routing Header (RFC3775 Secion 6.4), 2) We prefer Home Address(es) in source address selection (RFC3484 Section 5 Rule 4). Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:29 -07:00
Noriaki TAKAMIYA	55ebaef1d5	[IPV6] ADDRCONF: Allow non-DAD'able addresses. IFA_F_NODAD flag, similar to IN6_IFF_NODAD in BSDs, is introduced to skip DAD. This flag should be set to Mobile IPv6 Home Address(es) on Mobile Node because DAD would fail if we should perform DAD; our Home Agent protects our Home Address(es). Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:28 -07:00
YOSHIFUJI Hideaki	fc26d0abd5	[IPV6] NDISC: Fix is_router flag setting. We did not send appropriate IsRouter flag if the forwarding setting is positive even value. Let's give 1/0 value to ndisc_send_na(). Also, existing users of ndisc_send_na() give 0/1 to override, we can omit redundant operation in that function. Bug hinted by Nicolas Dichtel <nicolas.dichtel@6wind.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:27 -07:00
YOSHIFUJI Hideaki	8814c4b533	[IPV6] ADDRCONF: Convert addrconf_lock to RCU. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:26 -07:00
YOSHIFUJI Hideaki	fbea49e1e2	[IPV6] NDISC: Add proxy_ndp sysctl. We do not always need proxy NDP functionality even we enable forwarding. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:20:25 -07:00
Ville Nuorvala	62dd93181a	[IPV6] NDISC: Set per-entry is_router flag in Proxy NA. We have sent NA with router flag from the node-wide forwarding configuration. This is not appropriate for proxy NA, and it should be set according to each proxy entry's configuration. This is used by Mobile IPv6 home agent to support physical home link in acting as a proxy router for mobile node which is not a router, for example. Based on MIPL2 kernel patch. Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-09-22 15:20:24 -07:00
Ville Nuorvala	5f3e6e9e19	[IPV6] NDISC: Avoid updating neighbor cache for proxied address in receiving NA. This aims at proxying router not updating neighbor cache entry for proxied address when it receives NA because either the proxied node is off link or it has already sent a NA to the proxied router. Based on MIPL2 kernel patch. Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-09-22 15:20:23 -07:00
Ville Nuorvala	74553b09dc	[IPV6]: Don't forward packets to proxied link-local address. Proxying router can't forward traffic sent to link-local address, so signal the sender and discard the packet. This behavior is clarified by Mobile IPv6 specification (RFC3775) but might be required for all proxying router. Based on MIPL2 kernel patch. Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-09-22 15:20:22 -07:00
Ville Nuorvala	e21e0b5f19	[IPV6] NDISC: Handle NDP messages to proxied addresses. It is required to respond to NDP messages sent directly to the "target" unicast address. Proxying node (router) is required to handle such messages. To achieve this, check if the packet in forwarding patch is NDP message. With this patch, the proxy neighbor entries are always looked up in forwarding path. We may want to optimize further. Based on MIPL2 kernel patch. Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-09-22 15:20:21 -07:00
Brian Haley	1192e403e9	[NETFILTER]: make some netfilter globals __read_mostly Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:58 -07:00
Patrick McHardy	7cf73936fe	[NETFILTER]: ip6t_HL: remove write-only variable Noticed by Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:55 -07:00
Dmitry Mishin	90d47db4a0	[NETFILTER]: x_tables: small check_entry & module_refcount cleanup While standard_target has target->me == NULL, module_put() should be called for it as for others, because there were try_module_get() before. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:51 -07:00
Patrick McHardy	9123de2c04	[NETFILTER]: ip6table_mangle: reroute when nfmark changes in NF_IP6_LOCAL_OUT Now that IPv6 supports policy routing we need to reroute in NF_IP6_LOCAL_OUT when the mark value changes. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:51 -07:00
Patrick McHardy	df0933dcb0	[NETFILTER]: kill listhelp.h Kill listhelp.h and use the list.h functions instead. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:45 -07:00
Patrick McHardy	a1e59abf82	[XFRM]: Fix wildcard as tunnel source Hashing SAs by source address breaks templates with wildcards as tunnel source since the source address used for hashing/lookup is still 0/0. Move source address lookup to xfrm_tmpl_resolve_one() so we can use the real address in the lookup. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:06 -07:00
Thomas Graf	7198f8cec1	[IPV6] address: Support NLM_F_EXCL when adding addresses iproute2 doesn't provide the NLM_F_CREATE flag when adding addresses, it is assumed to be implied. The existing code issues a check on said flag when the modify operation fails (likely due to ENOENT) before continueing to create it, this leads to a hard to predict result, therefore the NLM_F_CREATE check is removed. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:02 -07:00
Thomas Graf	680a27a23a	[IPV6] address: Allow address changes while device is administrative down Same behaviour as IPv4, using IFF_UP is a no-no anyway. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:01 -07:00
Thomas Graf	0ab6803bc9	[IPV6] address: Convert address dumping to new netlink api Replaces INET6_IFADDR_RTA_SPACE with a new function calculating the total required message size for all address messages. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:19:00 -07:00
Thomas Graf	101bb22969	[IPV6] address: Add put_ifaddrmsg() and rt_scope() Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:59 -07:00
Thomas Graf	85486af00b	[IPV6] address: Add put_cacheinfo() to dump struct cacheinfo Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:58 -07:00
Thomas Graf	1b29fc2c8b	[IPV6] address: Convert address lookup to new netlink api Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:57 -07:00
Thomas Graf	b933f7166b	[IPV6] address: Convert address deletion to new netlink api Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:56 -07:00
Thomas Graf	461d8837fa	[IPV6] address: Convert address addition to new netlink api Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:55 -07:00
Brian Haley	94aec08ea4	[NETFILTER]: Change tunables to __read_mostly Change some netfilter tunables to __read_mostly. Also fixed some incorrect file reference comments while I was in there. (this will be my last __read_mostly patch unless someone points out something else that needs it) Signed-off-by: Brian Haley <brian.haley@hp.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:54 -07:00
Jamal Hadi Salim	eb878e8457	[IPSEC]: output mode to take an xfrm state as input param Expose IPSEC modes output path to take an xfrm state as input param. This makes it consistent with the input mode processing (which already takes the xfrm state as a param). Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:48 -07:00
Dmitry Mishin	fda9ef5d67	[NET]: Fix sk->sk_filter field access Function sk_filter() is called from tcp_v{4,6}_rcv() functions with arg needlock = 0, while socket is not locked at that moment. In order to avoid this and similar issues in the future, use rcu for sk->sk_filter field read protection. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Kirill Korotaev <dev@openvz.org>	2006-09-22 15:18:47 -07:00
Masahide NAKAMURA	dc435e6dac	[IPV6] MIP6: Fix to update IP6CB when cloned skbuff is received at HAO. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:46 -07:00
YOSHIFUJI Hideaki	33cc489668	[IPV6] ROUTE: Fix dst reference counting in ip6_pol_route_lookup(). In ip6_pol_route_lookup(), when we finish backtracking at the top-level root entry, we need to hold it. Bug noticed by Mitsuru Chinen <CHINEN@jp.ibm.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:26 -07:00
Thomas Graf	5176f91ea8	[NETLINK]: Make use of NLA_STRING/NLA_NUL_STRING attribute validation Converts existing NLA_STRING attributes to use the new validation features, saving a couple of temporary buffers. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:25 -07:00
Gerrit Renker	25030a7f9e	[UDP]: Unify UDPv4 and UDPv6 ->get_port() This patch creates one common function which is called by udp_v4_get_port() and udp_v6_get_port(). As a result, * duplicated code is removed * udp_port_rover and local port lookup can now be removed from udp.h * further savings follow since the same function will be used by UDP-Litev4 and UDP-Litev6 In contrast to the patch sent in response to Yoshifujis comments (fixed by this variant), the code below also removes the EXPORT_SYMBOL(udp_port_rover), since udp_port_rover can now remain local to net/ipv4/udp.c. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:21 -07:00
Alexey Dobriyan	e5d679f339	[NET]: Use SLAB_PANIC Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:19 -07:00
YOSHIFUJI Hideaki	ef047f5e10	[NET]: Use BUILD_BUG_ON() for checking size of skb->cb. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:15 -07:00
Patrick McHardy	366e4adc0f	[IPV6]: Fix routing by fwmark Fix mark comparison, also dump the mask to userspace when the mask is zero, but the mark is not (in which case the mark is dumped, so the mask is needed to make sense of it). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:14 -07:00
David S. Miller	267935b197	[IPV6]: Fix build with fwmark disabled. Based upon a patch by Brian Haley. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:09 -07:00
YOSHIFUJI Hideaki	cd9d742622	[IPV6] ROUTE: Add support for fwmask in routing rules. Add support for fwmark masks. A mask of 0xFFFFFFFF is used when a mark value != 0 is sent without a mask. Based on patch for net/ipv4/fib_rules.c by Patrick McHardy <kaber@trash.net>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:08 -07:00
YOSHIFUJI Hideaki	2613aad5ab	[IPV6] ROUTE: Fix size of fib6_rule_policy. It should not be RTA_MAX+1 but FRA_MAX+1. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:07 -07:00
YOSHIFUJI Hideaki	6c5eb6a507	[IPV6] ROUTE: Fix FWMARK support. - Add missing nla_policy entry. - type of fwmark is u32, not u8. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:18:06 -07:00
YOSHIFUJI Hideaki	75bff8f023	[IPV6] ROUTE: Routing by FWMARK. Based on patch by Jean Lorchat <lorchat@sfc.wide.ad.jp>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-09-22 15:18:00 -07:00
YOSHIFUJI Hideaki	2cc67cc731	[IPV6] ROUTE: Routing by Traffic Class. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-09-22 15:17:59 -07:00
YOSHIFUJI Hideaki	e731c248ba	[IPV6] MIP6: Several obvious clean-ups. - Remove redundant code. Pointed out by Brian Haley <brian.haley@hp.com>. - Unify code paths with/without CONFIG_IPV6_MIP. - Use NIP6_FMT for IPv6 address textual presentation. - Fold long line. Pointed out by David Miller <davem@davemloft.net>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-09-22 15:17:58 -07:00
David S. Miller	e4bec827fe	[IPSEC] esp: Defer output IV initialization to first use. First of all, if the xfrm_state only gets used for input packets this entropy is a complete waste. Secondly, it is often the case that a configuration loads many rules (perhaps even dynamically) and they don't all necessarily ever get used. This get_random_bytes() call was showing up in the profiles for xfrm_state inserts which is how I noticed this. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:17:35 -07:00
David S. Miller	9d4a706d85	[XFRM]: Add generation count to xfrm_state and xfrm_dst. Each xfrm_state inserted gets a new generation counter value. When a bundle is created, the xfrm_dst objects get the current generation counter of the xfrm_state they will attach to at dst->xfrm. xfrm_bundle_ok() will return false if it sees an xfrm_dst with a generation count different from the generation count of the xfrm_state that dst points to. This provides a facility by which to passively and cheaply invalidate cached IPSEC routes during SA database changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:08:42 -07:00
David S. Miller	edcd582152	[XFRM]: Pull xfrm_state_by{spi,src} hash table knowledge out of afinfo. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:08:39 -07:00
David S. Miller	2770834c9f	[XFRM]: Pull xfrm_state_bydst hash table knowledge out of afinfo. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:08:38 -07:00
Masahide NAKAMURA	64d9fdda8e	[XFRM] IPV6: Support Mobile IPv6 extension headers sorting. Support Mobile IPv6 extension headers sorting for two transformation policies. Mobile IPv6 extension headers should be placed after IPsec transport mode, but before transport AH when outbound. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:08:37 -07:00
Masahide NAKAMURA	58c949d1b9	[XFRM] IPV6: Add sort functions to combine templates/states for IPsec. Add sort functions to combine templates/states for IPsec. Think of outbound transformation order we should be careful with transport AH which must be the last of all transport ones. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:08:36 -07:00
Masahide NAKAMURA	01be8e5d59	[IPV6] MIP6: Ignore to report if mobility headers is rejected. Ignore to report user-space for known mobility headers rejected by destination options header transformation. Mobile IPv6 specification (RFC3775) says that mobility header is used with destination options header carrying home address option only for binding update message. Other type message cannot be used and node must drop it silently (and must not send binding error) if receving such packet. To achieve it, (1) application should use transformation policy and wild-card states to catch binding update message prior other packets (2) kernel doesn't report the reject to user-space not to send binding error message by application. This patch is for (2). Based on MIPL2 kernel patch. This patch was also written by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:08:32 -07:00
Masahide NAKAMURA	70182ed23d	[IPV6] MIP6: Report to user-space when home address option is rejected. Report to user-space when home address option is rejected. In receiving this message user-space application will send Mobile IPv6 binding error. It is rate-limited by kernel. Based on MIPL2 kernel patch. This patch was also written by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:08:31 -07:00
Masahide NAKAMURA	2ce4272a69	[IPV6] MIP6: Transformation support mobility header. Transformation support mobility header. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:07:03 -07:00
Masahide NAKAMURA	6e8f4d48b2	[IPV6] MIP6: Add sending mobility header functions through raw socket. Mobility header is built by user-space and sent through raw socket. Kernel just extracts its type to flow. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:07:02 -07:00
Masahide NAKAMURA	7be96f7628	[IPV6] MIP6: Add receiving mobility header functions through raw socket. Like ICMPv6, mobility header is handled through raw socket. In inbound case, check only whether ICMPv6 error should be sent as a reply or not by kernel. Based on MIPL2 kernel patch. This patch was also written by: Ville Nuorvala <vnuorval@tcs.hut.fi> This patch was also written by: Antti Tuominen <anttit@tcs.hut.fi> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:07:01 -07:00
Noriaki TAKAMIYA	3d126890dd	[IPV6] MIP6: Add destination options header transformation. Add destination options header transformation for Mobile IPv6. Based on MIPL2 kernel patch. This patch was also written by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:58 -07:00
Noriaki TAKAMIYA	2c8d7ca0f7	[IPV6] MIP6: Add routing header type 2 transformation. Add routing header type 2 transformation for Mobile IPv6. Based on MIPL2 kernel patch. Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:57 -07:00
Masahide NAKAMURA	27637df92e	[IPV6] IPSEC: Support sending with Mobile IPv6 extension headers. Mobile IPv6 defines home address option as an option of destination options header. It is placed before fragment header then ip6_find_1stfragopt() is fixed to know about it. Home address option also carries final source address of the flow, then outbound AH calculation should take care of it like routing header case. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:56 -07:00
Masahide NAKAMURA	793832361f	[IPV6] MIP6: Revert address to send ICMPv6 error. IPv6 source address is replaced in receiving packet with home address option carried by destination options header. To send ICMPv6 error back, original address which is received one on wire should be used. This function checks such header is included and reverts them. Based on MIPL2 kernel patch. This patch was also written by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:55 -07:00
Masahide NAKAMURA	a831f5bbc8	[IPV6] MIP6: Add inbound interface of home address option. Add inbound function of home address option by registering it to TLV table for destination options header. Based on MIPL2 kernel patch. This patch was also written by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:53 -07:00
Masahide NAKAMURA	a80ff03e05	[IPV6]: Allow to replace skbuff by TLV parser. In receiving Mobile IPv6 home address option which is a TLV carried by destination options header, kernel will try to mangle source adderss of packet. Think of cloned skbuff it is required to replace it by the parser just like routing header case. This is a framework to achieve that to allow TLV parser to replace inbound skbuff pointer. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:51 -07:00
Masahide NAKAMURA	c61a404325	[IPV6]: Find option offset by type. This is a helper to search option offset from extension header which can carry TLV option like destination options header. Mobile IPv6 home address option will use it. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:50 -07:00
Masahide NAKAMURA	280a9d3400	[IPV6] MIP6: Add socket option and ancillary data interface of routing header type 2. Add socket option and ancillary data interface of routing header type 2. Mobile IPv6 application will use this to send binding acknowledgement with the header without relation of confirmed route optimization (binding). Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:49 -07:00
Masahide NAKAMURA	65d4ed9221	[IPV6] MIP6: Add inbound interface of routing header type 2. Add inbound interface of routing header type 2 for Mobile IPv6. Based on MIPL2 kernel patch. This patch was also written by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:48 -07:00
Masahide NAKAMURA	ee53826801	[IPV6]: Add Kconfig to enable Mobile IPv6. Add Kconfig to enable Mobile IPv6. Based on MIPL2 kernel patch. Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-09-22 15:06:46 -07:00
Masahide NAKAMURA	e53820de0f	[XFRM] IPV6: Restrict bundle reusing For outbound transformation, bundle is checked whether it is suitable for current flow to be reused or not. In such IPv6 case as below, transformation may apply incorrect bundle for the flow instead of creating another bundle: - The policy selector has destination prefix length < 128 (Two or more addresses can be matched it) - Its bundle holds dst entry of default route whose prefix length < 128 (Previous traffic was used such route as next hop) - The policy and the bundle were used a transport mode state and this time flow address is not matched the bundled state. This issue is found by Mobile IPv6 usage to protect mobility signaling by IPsec, but it is not a Mobile IPv6 specific. This patch adds strict check to xfrm_bundle_ok() for each state mode and address when prefix length is less than 128. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:44 -07:00
Masahide NAKAMURA	9afaca0579	[XFRM] IPV6: Update outbound state timestamp for each sending. With this patch transformation state is updated last used time for each sending. Xtime is used for it like other state lifetime expiration. Mobile IPv6 enabled nodes will want to know traffic status of each binding (e.g. judgement to request binding refresh by correspondent node, or to keep home/care-of nonce alive by mobile node). The last used timestamp is an important hint about it. Based on MIPL2 kernel patch. This patch was also written by: Henrik Petander <petander@tcs.hut.fi> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:43 -07:00
Masahide NAKAMURA	1b5c229987	[XFRM] STATE: Support non-fragment outbound transformation headers. For originated outbound IPv6 packets which will fragment, ip6_append_data() should know length of extension headers before sending them and the length is carried by dst_entry. IPv6 IPsec headers fragment then transformation was designed to place all headers after fragment header. OTOH Mobile IPv6 extension headers do not fragment then it is a good idea to make dst_entry have non-fragment length to tell it to ip6_append_data(). Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:41 -07:00
Masahide NAKAMURA	99505a8436	[XFRM] STATE: Add a hook to obtain local/remote outbound address. Outbound transformation replaces both source and destination address with state's end-point addresses at the same time when IPsec tunnel mode. It is also required to change them for Mobile IPv6 route optimization, but we should care about the following differences: - changing result is not end-point but care-of address - either source or destination is replaced for each state This hook is a common platform to change outbound address. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:41 -07:00
Masahide NAKAMURA	fbd9a5b47e	[XFRM] STATE: Common receive function for route optimization extension headers. XFRM_STATE_WILDRECV flag is introduced; the last resort state is set it and receives packet which is not route optimized but uses such extension headers i.e. Mobile IPv6 signaling (binding update and acknowledgement). A node enabled Mobile IPv6 adds the state. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:39 -07:00
Masahide NAKAMURA	1d71627d69	[XFRM] STATE: Introduce route optimization mode. Route optimization is used with routing header and destination options header for Mobile IPv6. At outbound it makes header space like IPsec transport. At inbound it does nothing because exhdrs.c functions have responsibility to update skbuff information for these headers. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:37 -07:00
Masahide NAKAMURA	aee5adb430	[XFRM] STATE: Add a hook to find offset to be inserted header in outbound. On current kernel, ip6_find_1stfragopt() is used by IPv6 IPsec to find offset to be inserted header in outbound for transport mode. (BTW, no usage may be needed for IPv4 case.) Mobile IPv6 requires another logic for routing header and destination options header respectively. This patch is common platform for the offset and adopts it to IPsec. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:36 -07:00
Masahide NAKAMURA	eb2971b68a	[XFRM] STATE: Search by address using source address list. This is a support to search transformation states by its addresses by using source address list for Mobile IPv6 usage. To use it from user-space, it is also added a message type for source address as a xfrm state option. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:35 -07:00
Masahide NAKAMURA	6c44e6b7ab	[XFRM] STATE: Add source address list. Support source address based searching. Mobile IPv6 will use it. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:06:34 -07:00
Masahide NAKAMURA	7e49e6de30	[XFRM]: Add XFRM_MODE_xxx for future use. Transformation mode is used as either IPsec transport or tunnel. It is required to add two more items, route optimization and inbound trigger for Mobile IPv6. Based on MIPL2 kernel patch. This patch was also written by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 15:05:15 -07:00
YOSHIFUJI Hideaki	77d16f450a	[IPV6] ROUTE: Unify RT6_F_xxx and RT6_SELECT_F_xxx flags Unify RT6_F_xxx and RT6_SELECT_F_xxx flags into RT6_LOOKUP_F_xxx flags, and put them into ip6_route.h Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Ville Nuorvala <vnuorval@tcs.hut.fi Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:56 -07:00
YOSHIFUJI Hideaki	4e96c2b418	[IPV6] KCONFIG: Add subtrees support. This is for developers only. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:55 -07:00
YOSHIFUJI Hideaki	c0bece9f2a	[IPV6] ROUTE: Add credits about subtree fixes. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:55 -07:00
YOSHIFUJI Hideaki	cb15d9c224	[IPV6] NDISC: Search subtrees when backtracking on receipt of redirects. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Ville Nuorvala <vnuorval@tcs.hut.fi Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:54 -07:00
YOSHIFUJI Hideaki	150730d5a5	[IPV6] ROUTE: Purge clones on other trees when deleting a route. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:53 -07:00
YOSHIFUJI Hideaki	982f56f3a9	[IPV6] ROUTE: Search subtree when backtracking. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:52 -07:00
YOSHIFUJI Hideaki	7fc33165a7	[IPV6] ROUTE: Put SUBTREE() as FIB6_SUBTREE() into ip6_fib.h for future use. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:51 -07:00
YOSHIFUJI Hideaki	fefc2a6c20	[IPV6] ROUTE: Allow searching subtree only. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:50 -07:00
YOSHIFUJI Hideaki	825e288ef4	[IPV6] ROUTE: Make sure we do not exceed args in fib6_lookup_1(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:49 -07:00
YOSHIFUJI Hideaki	3fc5e0440b	[IPV6] ROUTE: Fix looking up a route on subtree. Even on RTN_ROOT node, we need to process its subtree first. Fix NULL pointer dereference in fib6_locate(). Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:48 -07:00
YOSHIFUJI Hideaki	2285adc1e6	[IPV6] ROUTE: Prune clones from main tree as well. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:47 -07:00
YOSHIFUJI Hideaki	66729e18df	[IPV6] ROUTE: Make sure we have fn->leaf when adding a node on subtree. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:46 -07:00
YOSHIFUJI Hideaki	8e1ef0a95b	[IPV6]: Cache source address as well in ipv6_pinfo{}. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:45 -07:00
YOSHIFUJI Hideaki	cf6b198259	[IPV6] ROUTE: Introduce a helper to check route validity. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:44 -07:00
YOSHIFUJI Hideaki	af18476584	[IPV6] NDISC: Initialize fl with outbound interface to lookup rules properly. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:43 -07:00
YOSHIFUJI Hideaki	a6279458c5	[IPV6] NDISC: Search over all possible rules on receipt of redirect. Split up function for finding routes for redirects. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:42 -07:00
YOSHIFUJI Hideaki	5e032e32ec	[IPV6] NDISC: Take source address into account for redirects. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:41 -07:00
Patrick McHardy	5fa2a7601f	[NETFILTER]: ip6_tables: consolidate dst and hbh matches The matches are identical besides one looking for NEXTHDR_HOP, the other for NEXTHDR_DEST. Remove ip6t_dst.c and handle both in ip6t_hbh.c. Signed-off-by: Patrick McHardy <kaber@trash,net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:37 -07:00
Patrick McHardy	efa741656e	[NETFILTER]: x_tables: remove unused size argument to check/destroy functions The size is verified by x_tables and isn't needed by the modules anymore. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:34 -07:00
Patrick McHardy	fe1cb10873	[NETFILTER]: x_tables: remove unused argument to target functions Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:33 -07:00
Patrick McHardy	4470bbc749	[NETFILTER]: x_tables: make use of mass registation helpers Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:32 -07:00
David S. Miller	72d3b2c970	[IPV6]: Fixup ip6_del_rt() call for new args. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:15 -07:00
Thomas Graf	ab364a6f96	[IPv6] route: Convert GETROUTE to use new netlink api Fixes various unvalidated netlink attributes causing memory corruptions when left empty by userspace applications. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:14 -07:00
Thomas Graf	2d7202bfdd	[IPv6] route: Convert FIB6 dumping to use new netlink api Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:13 -07:00
Thomas Graf	86872cb579	[IPv6] route: FIB6 configuration using struct fib6_config Replaces the struct in6_rtmsg based interface orignating from the ioctl interface with a struct fib6_config based on. Allows changing the interface without breaking the ioctl interface and avoids passing on tons of parameters. The recently introduced struct nl_info is used to pass on netlink authorship information for notifications. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:12 -07:00
Thomas Graf	40e22e8f3d	[IPv6] route: Simplify ip6_ins_rt() Provide a simple ip6_ins_rt() for the majority of users and an alternative for the exception via netlink. Avoids code obfuscation. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:11 -07:00
Thomas Graf	e0a1ad73d3	[IPv6] route: Simplify ip6_del_rt() Provide a simple ip6_del_rt() for the majority of users and an alternative for the exception via netlink. Avoids code obfuscation. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:11 -07:00
Brian Haley	ab32ea5d8a	[NET/IPV4/IPV6]: Change some sysctl variables to __read_mostly Change net/core, ipv4 and ipv6 sysctl variables to __read_mostly. Couldn't actually measure any performance increase while testing (.3% I consider noise), but seems like the right thing to do. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:55:03 -07:00
Thomas Graf	8c384bfa36	[IPv6] prefix: Convert prefix notifications to use rtnl_notify() Fixes a wrong use of current->pid as netlink pid. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:58 -07:00
Thomas Graf	8d7a76c9b1	[IPv6] link: Convert link notifications to use rtnl_notify() Fixes a wrong use of current->pid as netlink pid. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:57 -07:00
Thomas Graf	21713ebc4f	[IPv6] route: Convert route notifications to use rtnl_notify() Fixes a wrong use of current->pid as netlink pid. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:56 -07:00
Thomas Graf	5d62026643	[IPv6] address: Convert address notification to use rtnl_notify() Fixes a wrong use of current->pid as netlink pid. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:55 -07:00
Thomas Graf	2942e90050	[RTNETLINK]: Use rtnl_unicast() for rtnetlink unicasts Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:48 -07:00
David S. Miller	f8d8fda54a	[IPV6] udp: Fix type in previous change. UDPv6 stats are UDP6_foo not UDP_foo. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:47 -07:00
David S. Miller	a18135eb93	[IPV6]: Add UDP_MIB_{SND,RCV}BUFERRORS handling. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:42 -07:00
Adrian Bunk	90d41122f7	[IPV6] ip6_fib.c: make code static Make the following needlessly global code static: - fib6_walker_lock - struct fib6_walker_list - fib6_walk_continue() - fib6_walk() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:38 -07:00
Patrick McHardy	1b43af5480	[IPV6]: Increase number of possible routing tables to 2^32 Increase number of possible routing tables to 2^32 by replacing iterations over all possible table IDs by hash table walking. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:27 -07:00
Patrick McHardy	9e762a4a89	[NET]: Introduce RTA_TABLE/FRA_TABLE attributes Introduce RTA_TABLE route attribute and FRA_TABLE routing rule attribute to hold 32 bit routing table IDs. Usespace compatibility is provided by continuing to accept and send the rtm_table field, but because of its limited size it can only carry the low 8 bits of the table ID. This implies that if larger IDs are used, _all_ userspace programs using them need to use RTA_TABLE. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:25 -07:00
Ville Nuorvala	b142955324	[IPV6]: Make sure fib6_rule_lookup doesn't return NULL The callers of fib6_rule_lookup don't expect it to return NULL, therefore it must return ip6_null_entry whenever fib_rule_lookup fails. Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:12 -07:00
David S. Miller	8423a9aadf	[IPV6]: Protect RTM_GETRULE table entry with IPV6_MULTIPLE_TABLES ifdef This is how IPv4 handles this case too. Based upon a patch from Andrew Morton. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:08 -07:00
Adrian Bunk	8ce11e6a9f	[NET]: Make code static. This patch makes needlessly global code static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:54:07 -07:00
Patrick McHardy	3226f68817	[IPV6]: Fix policy routing lookup When the lookup in a table returns ip6_null_entry the policy routing lookup returns it instead of continuing in the next table, which effectively means it only searches the local table. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:58 -07:00
Patrick McHardy	6c813a7297	[IPV6]: Fix crash in ip6_del_rt ip6_null_entry doesn't have rt6i_table set, when trying to delete it the kernel crashes dereferencing table->tb6_lock. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:57 -07:00
Patrick McHardy	d7aba67f81	[IPV6]: Fix thinko in rt6_fill_node This looks like a mistake, the table ID is overwritten again. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:56 -07:00
Patrick McHardy	84fa7933a3	[NET]: Replace CHECKSUM_HW by CHECKSUM_PARTIAL/CHECKSUM_COMPLETE Replace CHECKSUM_HW by CHECKSUM_PARTIAL (for outgoing packets, whose checksum still needs to be completed) and CHECKSUM_COMPLETE (for incoming packets, device supplied full checksum). Patch originally from Herbert Xu, updated by myself for 2.6.18-rc3. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:53 -07:00
Thomas Graf	1823730fbc	[IPv4]: Move interface address bits to linux/if_addr.h Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:47 -07:00
Thomas Graf	101367c2f8	[IPV6]: Policy Routing Rules Adds support for policy routing rules including a new local table for routes with a local destination. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:41 -07:00
Thomas Graf	c71099acce	[IPV6]: Multiple Routing Tables Adds the framework to support multiple IPv6 routing tables. Currently all automatically generated routes are put into the same table. This could be changed at a later point after considering the produced locking overhead. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:39 -07:00
Thomas Graf	5d0bbeeb14	[IPV6]: Remove ndiscs rt6_lock dependency (Ab)using rt6_lock wouldn't work anymore if rt6_lock is converted into a per table lock. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:38 -07:00
Venkat Yekkirala	4237c75c0a	[MLSXFRM]: Auto-labeling of child sockets This automatically labels the TCP, Unix stream, and dccp child sockets as well as openreqs to be at the same MLS level as the peer. This will result in the selection of appropriately labeled IPSec Security Associations. This also uses the sock's sid (as opposed to the isec sid) in SELinux enforcement of secmark in rcv_skb and postroute_last hooks. Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:29 -07:00
Venkat Yekkirala	beb8d13bed	[MLSXFRM]: Add flow labeling This labels the flows that could utilize IPSec xfrms at the points the flows are defined so that IPSec policy and SAs at the right label can be used. The following protos are currently not handled, but they should continue to be able to use single-labeled IPSec like they currently do. ipmr ip_gre ipip igmp sit sctp ip6_tunnel (IPv6 over IPv6 tunnel device) decnet Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-22 14:53:27 -07:00
Herbert Xu	e4d5b79c66	[CRYPTO] users: Use crypto_comp and crypto_has_* This patch converts all users to use the new crypto_comp type and the crypto_has_* functions. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2006-09-21 11:46:22 +10:00
Herbert Xu	07d4ee583e	[IPSEC]: Use HMAC template and hash interface This patch converts IPsec to use the new HMAC template. The names of existing simple digest algorithms may still be used to refer to their HMAC composites. The same structure can be used by other MACs such as AES-XCBC-MAC. This patch also switches from the digest interface to hash. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-21 11:46:18 +10:00
Herbert Xu	6b7326c849	[IPSEC] ESP: Use block ciphers where applicable This patch converts IPSec/ESP to use the new block cipher type where applicable. Similar to the HMAC conversion, existing algorithm names have been kept for compatibility. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2006-09-21 11:46:14 +10:00
Remi Denis-Courmont	d0ee011f72	[IPV6]: Accept -1 for IPV6_TCLASS This patch should add support for -1 as "default" IPv6 traffic class, as specified in IETF RFC3542 §6.5. Within the kernel, it seems tclass < 0 is already handled, but setsockopt, getsockopt and recvmsg calls won't accept it from userland. Signed-off-by: Remi Denis-Courmont <rdenis@simphalempin.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-17 23:21:08 -07:00
YOSHIFUJI Hideaki	e012d51cbc	[IPV6]: Fix tclass setting for raw sockets. np->cork.tclass is used only in cork'ed context. Otherwise, np->tclass should be used. Bug#7096 reported by Remi Denis-Courmont <rdenis@simphalempin.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-17 23:21:07 -07:00
YOSHIFUJI Hideaki	99c7bc0133	[IPV6]: Fix kernel OOPs when setting sticky socket options. Bug noticed by Remi Denis-Courmont <rdenis@simphalempin.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-31 14:52:17 -07:00
Keir Fraser	57f5f544f5	[IPV6]: ipv6_add_addr should install dstentry earlier ipv6_add_addr allocates a struct inet6_ifaddr and a dstentry, but it doesn't install the dstentry in ifa->rt until after it releases the addrconf_hash_lock. This means other CPUs will be able to see the new address while it hasn't been initialized completely yet. One possible fix would be to grab the ifp->lock spinlock when creating the address struct; a simpler fix is to just move the assignment. Acked-by: jbeulich@novell.com Acked-by: okir@suse.de Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-29 21:22:18 -07:00
Lv Liangying	76d0cc1b64	[IPV6]: SNMPv2 "ipv6IfStatsInAddrErrors" counter error When I tested Linux kernel 2.6.17.7 about statistics "ipv6IfStatsInAddrErrors", found that this counter couldn't increase correctly. The criteria is RFC2465: ipv6IfStatsInAddrErrors OBJECT-TYPE SYNTAX Counter32 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of input datagrams discarded because the IPv6 address in their IPv6 header's destination field was not a valid address to be received at this entity. This count includes invalid addresses (e.g., ::0) and unsupported addresses (e.g., addresses with unallocated prefixes). For entities which are not IPv6 routers and therefore do not forward datagrams, this counter includes datagrams discarded because the destination address was not a local address." ::= { ipv6IfStatsEntry 5 } When I send packet to host with destination that is ether invalid address(::0) or unsupported addresses(1::1), the Linux kernel just discard the packet, and the counter doesn't increase(in the function ip6_pkt_discard). Signed-off-by: Lv Liangying <lvly@nanjing-fnst.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-29 21:22:15 -07:00
Stephen Hemminger	59eed279c5	[IPV6]: Segmentation offload not set correctly on TCP children TCP over IPV6 would incorrectly inherit the GSO settings. This would cause kernel to send Tcp Segmentation Offload packets for IPV6 data to devices that can't handle it. It caused the sky2 driver to lock http://bugzilla.kernel.org/show_bug.cgi?id=7050 and the e1000 would generate bogus packets. I can't blame the hardware for gagging if the upper layers feed it garbage. This was a new bug in 2.6.18 introduced with GSO support. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-26 18:42:01 -07:00
David L Stevens	acd6e00b8e	[MCAST]: Fix filter leak on device removal. This fixes source filter leakage when a device is removed and a process leaves the group thereafter. This also includes corresponding fixes for IPv6 multicast source filters on device removal. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-17 16:29:57 -07:00
Ingo Molnar	640c41c77a	[IPV6] lockdep: annotate __icmpv6_socket Split off __icmpv6_socket's sk->sk_dst_lock class, because it gets used from softirqs, which is safe for __icmpv6_sockets (because they never get directly used via userspace syscalls), but unsafe for normal sockets. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-17 16:29:48 -07:00
Herbert Xu	e9fa4f7bd2	[INET]: Use pskb_trim_unique when trimming paged unique skbs The IPv4/IPv6 datagram output path was using skb_trim to trim paged packets because they know that the packet has not been cloned yet (since the packet hasn't been given to anything else in the system). This broke because skb_trim no longer allows paged packets to be trimmed. Paged packets must be given to one of the pskb_trim functions instead. This patch adds a new pskb_trim_unique function to cover the IPv4/IPv6 datagram output path scenario and replaces the corresponding skb_trim calls with it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-13 20:12:58 -07:00
Patrick McHardy	0eff66e625	[NETFILTER]: {arp,ip,ip6}_tables: proper error recovery in init path Neither of {arp,ip,ip6}_tables cleans up behind itself when something goes wrong during initialization. Noticed by Rennie deGraaf <degraaf@cpsc.ucalgary.ca> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-13 18:57:28 -07:00
Herbert Xu	06aebfb7fa	[IPV6]: The ifa lock is a BH lock The ifa lock is expected to be taken in BH context (by addrconf timers) so we must disable BH when accessing it from user context. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-09 16:52:04 -07:00
Wei Dong	dafee49085	[IPV6]: SNMPv2 "ipv6IfStatsOutFragCreates" counter error When I tested linux kernel 2.6.71.7 about statistics "ipv6IfStatsOutFragCreates", and found that it couldn't increase correctly. The criteria is RFC 2465: ipv6IfStatsOutFragCreates OBJECT-TYPE SYNTAX Counter32 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of output datagram fragments that have been generated as a result of fragmentation at this output interface." ::= { ipv6IfStatsEntry 15 } I think there are two issues in Linux kernel. 1st: RFC2465 specifies the counter is "The number of output datagram fragments...". I think increasing this counter after output a fragment successfully is better. And it should not be increased even though a fragment is created but failed to output. 2nd: If we send a big ICMP/ICMPv6 echo request to a host, and receive ICMP/ICMPv6 echo reply consisted of some fragments. As we know that in Linux kernel first fragmentation occurs in ICMP layer(maybe saying transport layer is better), but this is not the "real" fragmentation,just do some "pre-fragment" -- allocate space for date, and form a frag_list, etc. The "real" fragmentation happens in IP layer -- set offset and MF flag and so on. So I think in "fast path" for ip_fragment/ip6_fragment, if we send a fragment which "pre-fragment" by upper layer we should also increase "ipv6IfStatsOutFragCreates". Signed-off-by: Wei Dong <weid@nanjing-fnst.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:41:21 -07:00
Wei Dong	32c524d1c4	[IPV6]: SNMPv2 "ipv6IfStatsInHdrErrors" counter error When I tested Linux kernel 2.6.17.7 about statistics "ipv6IfStatsInHdrErrors", found that this counter couldn't increase correctly. The criteria is RFC2465: ipv6IfStatsInHdrErrors OBJECT-TYPE SYNTAX Counter3 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of input datagrams discarded due to errors in their IPv6 headers, including version number mismatch, other format errors, hop count exceeded, errors discovered in processing their IPv6 options, etc." ::= { ipv6IfStatsEntry 2 } When I send TTL=0 and TTL=1 a packet to a router which need to be forwarded, router just sends an ICMPv6 message to tell the sender that TIME_EXCEED and HOPLIMITS, but no increments for this counter(in the function ip6_forward). Signed-off-by: Wei Dong <weid@nanjing-fnst.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:39:57 -07:00
Tom Tucker	8d71740c56	[NET]: Core net changes to generate netevents Generate netevents for: - neighbour changes - routing redirects - pmtu changes Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:38:21 -07:00
Wei Yongjun	3687b1dc6f	[TCP]: SNMPv2 tcpAttemptFails counter error Refer to RFC2012, tcpAttemptFails is defined as following: tcpAttemptFails OBJECT-TYPE SYNTAX Counter32 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of times TCP connections have made a direct transition to the CLOSED state from either the SYN-SENT state or the SYN-RCVD state, plus the number of times TCP connections have made a direct transition to the LISTEN state from the SYN-RCVD state." ::= { tcp 7 } When I lookup into RFC793, I found that the state change should occured under following condition: 1. SYN-SENT -> CLOSED a) Received ACK,RST segment when SYN-SENT state. 2. SYN-RCVD -> CLOSED b) Received SYN segment when SYN-RCVD state(came from LISTEN). c) Received RST segment when SYN-RCVD state(came from SYN-SENT). d) Received SYN segment when SYN-RCVD state(came from SYN-SENT). 3. SYN-RCVD -> LISTEN e) Received RST segment when SYN-RCVD state(came from LISTEN). In my test, those direct state transition can not be counted to tcpAttemptFails. Signed-off-by: Wei Yongjun <yjwei@nanjing-fnst.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:38:19 -07:00
Herbert Xu	497c615aba	[IPV6]: Audit all ip6_dst_lookup/ip6_dst_store calls The current users of ip6_dst_lookup can be divided into two classes: 1) The caller holds no locks and is in user-context (UDP). 2) The caller does not want to lookup the dst cache at all. The second class covers everyone except UDP because most people do the cache lookup directly before calling ip6_dst_lookup. This patch adds ip6_sk_dst_lookup for the first class. Similarly ip6_dst_store users can be divded into those that need to take the socket dst lock and those that don't. This patch adds __ip6_dst_store for those (everyone except UDP/datagram) that don't need an extra lock. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:38:14 -07:00
Patrick McHardy	679e898a47	[XFRM]: Fix protocol field value for outgoing IPv6 GSO packets Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:38:13 -07:00
Noriaki TAKAMIYA	081bba5b3a	[IPV6] ADDRCONF: NLM_F_REPLACE support for RTM_NEWADDR Based on MIPL2 kernel patch. Signed-off-by: Noriaki YAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-08-02 13:38:12 -07:00
Noriaki TAKAMIYA	6c22382805	[IPV6] ADDRCONF: Support get operation of single address Based on MIPL2 kernel patch. Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-08-02 13:38:11 -07:00
YOSHIFUJI Hideaki	8f27ebb982	[IPV6] ADDRCONF: Do not verify an address with infinity lifetime We also do not try regenarating new temporary address corresponding to an address with infinite preferred lifetime. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-08-02 13:38:10 -07:00
Noriaki TAKAMIYA	0778769d39	[IPV6] ADDRCONF: Allow user-space to specify address lifetime Based on MIPL2 kernel patch. Signed-off-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-08-02 13:38:09 -07:00
YOSHIFUJI Hideaki	643162258e	[IPV6] ADDRCONF: Check payload length for IFA_LOCAL attribute in RTM_{ADD,DEL}MSG message Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2006-08-02 13:38:08 -07:00
Tetsuo Handa	f59fc7f30b	[IPV4/IPV6]: Setting 0 for unused port field in RAW IP recvmsg(). From: Tetsuo Handa from-linux-kernel@i-love.sakura.ne.jp The recvmsg() for raw socket seems to return random u16 value from the kernel stack memory since port field is not initialized. But I'm not sure this patch is correct. Does raw socket return any information stored in port field? [ BSD defines RAW IP recvmsg to return a sin_port value of zero. This is described in Steven's TCP/IP Illustrated Volume 2 on page 1055, which is discussing the BSD rip_input() implementation. ] Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-25 17:05:35 -07:00
Guillaume Chazarain	6b7fdc3ae1	[IPV6]: Clean skb cb on IPv6 input. Clear the accumulated junk in IP6CB when starting to handle an IPV6 packet. Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 23:44:44 -07:00
David S. Miller	a922ba5510	[IPV6] xfrm6_tunnel: Delete debugging code. It doesn't compile, and it's dubious in several regards: 1) is enabled by non-Kconfig controlled CONFIG_* value (noted by Randy Dunlap) 2) XFRM6_TUNNEL_SPI_MAGIC is defined after it's first use 3) the debugging messages print object pointer addresses which have no meaning without context So let's just get rid of it. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-24 13:49:06 -07:00
Panagiotis Issaris	0da974f4f3	[NET]: Conversions from kmalloc+memset to k(z\|c)alloc. Signed-off-by: Panagiotis Issaris <takis@issaris.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-21 14:51:30 -07:00
Herbert Xu	5d9c5a3292	[IPV4]: Get rid of redundant IPCB->opts initialisation Now that we always zero the IPCB->opts in ip_rcv, it is no longer necessary to do so before calling netif_rx for tunneled packets. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-21 14:29:53 -07:00
Herbert Xu	da952315c9	[IPCOMP]: Fix truesize after decompression The truesize check has uncovered the fact that we forgot to update truesize after pskb_expand_head. Unfortunately pskb_expand_head can't update it for us because it's used in all sorts of different contexts, some of which would not allow truesize to be updated by itself. So the solution for now is to simply update it in IPComp. This patch also changes skb_put to __skb_put since we've just expanded tailroom by exactly that amount so we know it's there (but gcc does not). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-12 13:58:55 -07:00
YOSHIFUJI Hideaki	8a6ce0c083	[IPV6]: Use ipv6_addr_src_scope for link address sorting. In the source address selection, the address must be sorted from global to node-local. But, ifp->scope is different from the scope for source address selection. 2001::1 fe80::1 ::1 ifp->scope 0 0x02 0x01 ipv6_addr_src_scope(&ifp->addr) 0x0e 0x02 0x01 So, we need to use ipv6_addr_src_scope(&ifp->addr) for sorting. And, for backward compatibility, addresses should be sorted from new one to old one. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-12 13:58:53 -07:00
Brian Haley	e55ffac601	[IPV6]: order addresses by scope If IPv6 addresses are ordered by scope, then ipv6_dev_get_saddr() can break-out of the device addr_list for() loop when the candidate source address scope is less than the destination address scope. Signed-off-by: Brian Haley <brian.haley@hp.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-12 13:58:37 -07:00
Herbert Xu	a430a43d08	[NET] gso: Fix up GSO packets with broken checksums Certain subsystems in the stack (e.g., netfilter) can break the partial checksum on GSO packets. Until they're fixed, this patch allows this to work by recomputing the partial checksums through the GSO mechanism. Once they've all been converted to update the partial checksum instead of clearing it, this workaround can be removed. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-08 13:34:56 -07:00
Herbert Xu	89114afd43	[NET] gso: Add skb_is_gso This patch adds the wrapper function skb_is_gso which can be used instead of directly testing skb_shinfo(skb)->gso_size. This makes things a little nicer and allows us to change the primary key for indicating whether an skb is GSO (if we ever want to do that). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-08 13:34:32 -07:00
Randy Dunlap	4bdbf6c033	[NET]: add+use poison defines Add and use poison defines in net/. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-03 19:47:27 -07:00
Michael Chan	6703931c54	[IPV6]: Fix ipv6 GSO payload length Fix ipv6 GSO payload length calculation. The ipv6 payload length excludes the ipv6 base header length and so must be subtracted. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-03 19:41:11 -07:00
Herbert Xu	bbcf467dab	[NET]: Verify gso_type too in gso_segment We don't want nasty Xen guests to pass a TCPv6 packet in with gso_type set to TCPv4 or even UDP (or a packet that's both TCP and UDP). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-07-03 19:38:35 -07:00
Linus Torvalds	e37a72de84	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [IPV6]: Added GSO support for TCPv6 [NET]: Generalise TSO-specific bits from skb_setup_caps [IPV6]: Added GSO support for TCPv6 [IPV6]: Remove redundant length check on input [NETFILTER]: SCTP conntrack: fix crash triggered by packet without chunks [TG3]: Update version and reldate [TG3]: Add TSO workaround using GSO [TG3]: Turn on hw fix for ASF problems [TG3]: Add rx BD workaround [TG3]: Add tg3_netif_stop() in vlan functions [TCP]: Reset gso_segs if packet is dodgy	2006-06-30 15:40:17 -07:00
Herbert Xu	f83ef8c0b5	[IPV6]: Added GSO support for TCPv6 This patch adds GSO support for IPv6 and TCPv6. This is based on a patch by Ananda Raju <Ananda.Raju@neterion.com>. His original description is: This patch enables TSO over IPv6. Currently Linux network stacks restricts TSO over IPv6 by clearing of the NETIF_F_TSO bit from "dev->features". This patch will remove this restriction. This patch will introduce a new flag NETIF_F_TSO6 which will be used to check whether device supports TSO over IPv6. If device support TSO over IPv6 then we don't clear of NETIF_F_TSO and which will make the TCP layer to create TSO packets. Any device supporting TSO over IPv6 will set NETIF_F_TSO6 flag in "dev->features" along with NETIF_F_TSO. In case when user disables TSO using ethtool, NETIF_F_TSO will get cleared from "dev->features". So even if we have NETIF_F_TSO6 we don't get TSO packets created by TCP layer. SKB_GSO_TCPV4 renamed to SKB_GSO_TCP to make it generic GSO packet. SKB_GSO_UDPV4 renamed to SKB_GSO_UDP as UFO is not a IPv4 feature. UFO is supported over IPv6 also The following table shows there is significant improvement in throughput with normal frames and CPU usage for both normal and jumbo. -------------------------------------------------- \| \| 1500 \| 9600 \| \| ------------------\|-------------------\| \| \| thru CPU \| thru CPU \| -------------------------------------------------- \| TSO OFF \| 2.00 5.5% id \| 5.66 20.0% id \| -------------------------------------------------- \| TSO ON \| 2.63 78.0 id \| 5.67 39.0% id \| -------------------------------------------------- Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-30 14:12:10 -07:00
Herbert Xu	adcfc7d0b4	[IPV6]: Added GSO support for TCPv6 This patch adds GSO support for IPv6 and TCPv6. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-30 14:12:06 -07:00
Herbert Xu	2889139a6a	[IPV6]: Remove redundant length check on input We don't need to check skb->len when we're just about to call pskb_may_pull since that checks it for us. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-30 14:12:04 -07:00
Jörn Engel	6ab3d5624e	Remove obsolete #include <linux/config.h> Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-06-30 19:25:36 +02:00
Sridhar Samudrala	47da8ee681	[TCP]: Export accept queue len of a TCP listening socket via rx_queue While debugging a TCP server hang issue, we noticed that currently there is no way for a user to get the acceptq backlog value for a TCP listen socket. All the standard networking utilities that display socket info like netstat, ss and /proc/net/tcp have 2 fields called rx_queue and tx_queue. These fields do not mean much for listening sockets. This patch uses one of these unused fields(rx_queue) to export the accept queue len for listening sockets. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:57 -07:00
Darrel Goeddel	c7bdb545d2	[NETLINK]: Encapsulate eff_cap usage within security framework. This patch encapsulates the usage of eff_cap (in netlink_skb_params) within the security framework by extending security_netlink_recv to include a required capability parameter and converting all direct usage of eff_caps outside of the lsm modules to use the interface. It also updates the SELinux implementation of the security_netlink_send and security_netlink_recv hooks to take advantage of the sid in the netlink_skb_params struct. This also enables SELinux to perform auditing of netlink capability checks. Please apply, for 2.6.18 if possible. Signed-off-by: Darrel Goeddel <dgoeddel@trustedcs.com> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:55 -07:00
Patrick McHardy	da298d3a4f	[NETFILTER]: x_tables: fix xt_register_table error propagation When xt_register_table fails the error is not properly propagated back. Based on patch by Lepton Wu <ytht.net@gmail.com>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-29 16:57:40 -07:00
Ingo Molnar	34af946a22	[PATCH] spin/rwlock init cleanups locking init cleanups: - convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK() - convert rwlocks in a similar manner this patch was generated automatically. Motivation: - cleanliness - lockdep needs control of lock initialization, which the open-coded variants do not give - it's also useful for -rt and for lock debugging in general Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-27 17:32:39 -07:00
Herbert Xu	09b8f7a93e	[IPSEC]: Handle GSO packets This patch segments GSO packets received by the IPsec stack. This can happen when a NIC driver injects GSO packets into the stack which are then forwarded to another host. The primary application of this is going to be Xen where its backend driver may inject GSO packets into dom0. Of course this also can be used by other virtualisation schemes such as VMWare or UML since the tap device could be modified to inject GSO packets received through splice. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-23 02:07:38 -07:00
Herbert Xu	7967168cef	[NET]: Merge TSO/UFO fields in sk_buff Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not going to scale if we add any more segmentation methods (e.g., DCCP). So let's merge them. They were used to tell the protocol of a packet. This function has been subsumed by the new gso_type field. This is essentially a set of netdev feature bits (shifted by 16 bits) that are required to process a specific skb. As such it's easy to tell whether a given device can process a GSO skb: you just have to and the gso_type field and the netdev's features field. I've made gso_type a conjunction. The idea is that you have a base type (e.g., SKB_GSO_TCPV4) that can be modified further to support new features. For example, if we add a hardware TSO type that supports ECN, they would declare NETIF_F_TSO \| NETIF_F_TSO_ECN. All TSO packets with CWR set would have a gso_type of SKB_GSO_TCPV4 \| SKB_GSO_TCPV4_ECN while all other TSO packets would be SKB_GSO_TCPV4. This means that only the CWR packets need to be emulated in software. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-23 02:07:29 -07:00
YOSHIFUJI Hideaki	5e2707fa3a	[IPV6] ADDRCONF: Fix default source address selection without CONFIG_IPV6_PRIVACY We need to update hiscore.rule even if we don't enable CONFIG_IPV6_PRIVACY, because we have more less significant rule; longest match. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-23 02:07:24 -07:00
Łukasz Stelmach	102128e3a2	[IPV6]: Fix source address selection. Two additional labels (RFC 3484, sec. 10.3) for IPv6 addreses are defined to make a distinction between global unicast addresses and Unique Local Addresses (fc00::/7, RFC 4193) and Teredo (2001::/32, RFC 4380). It is necessary to avoid attempts of connection that would either fail (eg. fec0:: to 2001:feed::) or be sub-optimal (2001:0:: to 2001:feed::). Signed-off-by: Łukasz Stelmach <stlman@poczta.fm> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-23 02:07:22 -07:00
YOSHIFUJI Hideaki	c5396a31b2	[IPV6]: Sum real space for RTAs. This patch fixes RTNLGRP_IPV6_IFINFO netlink notifications. Issue pointed out by Patrick McHardy <kaber@trash.net>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 22:48:48 -07:00
Herbert Xu	b38dfee3d6	[NET]: skb_trim audit I found a few more spots where pskb_trim_rcsum could be used but were not. This patch changes them to use it. Also, sk_filter can get paged skb data. Therefore we must use pskb_trim instead of skb_trim. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:30:20 -07:00
Herbert Xu	364c6badde	[NET]: Clean up skb_linearize The linearisation operation doesn't need to be super-optimised. So we can replace __skb_linearize with __pskb_pull_tail which does the same thing but is more general. Also, most users of skb_linearize end up testing whether the skb is linear or not so it helps to make skb_linearize do just that. Some callers of skb_linearize also use it to copy cloned data, so it's useful to have a new function skb_linearize_cow to copy the data if it's either non-linear or cloned. Last but not least, I've removed the gfp argument since nobody uses it anymore. If it's ever needed we can easily add it back. Misc bugs fixed by this patch: * via-velocity error handling (also, no SG => no frags) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:30:16 -07:00
James Morris	984bc16cc9	[SECMARK]: Add secmark support to core networking. Add a secmark field to the skbuff structure, to allow security subsystems to place security markings on network packets. This is similar to the nfmark field, except is intended for implementing security policy, rather than than networking policy. This patch was already acked in principle by Dave Miller. Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:29:57 -07:00
Patrick McHardy	39a27a35c5	[NETFILTER]: conntrack: add sysctl to disable checksumming Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:57 -07:00
Patrick McHardy	6442f1cf89	[NETFILTER]: conntrack: don't call helpers for related ICMP messages None of the existing helpers expects to get called for related ICMP packets and some even drop them if they can't parse them. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:55 -07:00
Herbert Xu	31a4ab9302	[IPSEC] proto: Move transport mode input path into xfrm_mode_transport Now that we have xfrm_mode objects we can move the transport mode specific input decapsulation code into xfrm_mode_transport. This removes duplicate code as well as unnecessary header movement in case of tunnel mode SAs since we will discard the original IP header immediately. This also fixes a minor bug for transport-mode ESP where the IP payload length is set to the correct value minus the header length (with extension headers for IPv6). Of course the other neat thing is that we no longer have to allocate temporary buffers to hold the IP headers for ESP and IPComp. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:41 -07:00
Herbert Xu	b59f45d0b2	[IPSEC] xfrm: Abstract out encapsulation modes This patch adds the structure xfrm_mode. It is meant to represent the operations carried out by transport/tunnel modes. By doing this we allow additional encapsulation modes to be added without clogging up the xfrm_input/xfrm_output paths. Candidate modes include 4-to-6 tunnel mode, 6-to-4 tunnel mode, and BEET modes. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:39 -07:00
Herbert Xu	546be2405b	[IPSEC] xfrm: Undo afinfo lock proliferation The number of locks used to manage afinfo structures can easily be reduced down to one each for policy and state respectively. This is based on the observation that the write locks are only held by module insertion/removal which are very rare events so there is no need to further differentiate between the insertion of modules like ipv6 versus esp6. The removal of the read locks in xfrm4_policy.c/xfrm6_policy.c might look suspicious at first. However, after you realise that nobody ever takes the corresponding write lock you'll feel better :) As far as I can gather it's an attempt to guard against the removal of the corresponding modules. Since neither module can be unloaded at all we can leave it to whoever fixes up IPv6 unloading :) Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:28:37 -07:00
Chris Leech	1a2449a87b	[I/OAT]: TCP recv offload to I/OAT Locks down user pages and sets up for DMA in tcp_recvmsg, then calls dma_async_try_early_copy in tcp_v4_do_rcv Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-06-17 21:25:56 -07:00
YOSHIFUJI Hideaki	4d0c591166	[IPV6] ROUTE: Don't try less preferred routes for on-link routes. In addition to the real on-link routes, NONEXTHOP routes should be considered on-link. Problem reported by Meelis Roos <mroos@linux.ee>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Meelis Roos <mroos@linux.ee> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-26 13:23:41 -07:00
Alexey Dobriyan	4195f81453	[NET]: Fix "ntohl(ntohs" bugs Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-22 16:53:22 -07:00
Solar Designer	2c8ac66bb2	[NETFILTER]: Fix do_add_counters race, possible oops or info leak (CVE-2006-0039) Solar Designer found a race condition in do_add_counters(). The beginning of paddc is supposed to be the same as tmp which was sanity-checked above, but it might not be the same in reality. In case the integer overflow and/or the race condition are triggered, paddc->num_counters might not match the allocation size for paddc. If the check below (t->private->number != paddc->num_counters) nevertheless passes (perhaps this requires the race condition to be triggered), IPT_ENTRY_ITERATE() would read kernel memory beyond the allocation size, potentially causing an oops or leaking sensitive data (e.g., passwords from host system or from another VPS) via counter increments. This requires CAP_NET_ADMIN. Signed-off-by: Solar Designer <solar@openwall.com> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-19 02:16:52 -07:00
Philip Craig	5c170a09d9	[NETFILTER]: fix format specifier for netfilter log targets The prefix argument for nf_log_packet is a format specifier, so don't pass the user defined string directly to it. Signed-off-by: Philip Craig <philipc@snapgear.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-19 02:15:47 -07:00
Alexey Dobriyan	d8fd0a7316	[IPV6]: Endian fix in net/ipv6/netfilter/ip6t_eui64.c:match(). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-16 15:24:41 -07:00
Alexey Kuznetsov	b0013fd47b	[IPV6]: skb leakage in inet6_csk_xmit inet6_csk_xit does not free skb when routing fails. Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-05-10 13:24:38 -07:00
YOSHIFUJI Hideaki	c302e6d54e	[IPV6]: Fix race in route selection. We eliminated rt6_dflt_lock (to protect default router pointer) at 2.6.17-rc1, and introduced rt6_select() for general router selection. The function is called in the context of rt6_lock read-lock held, but this means, we have some race conditions when we do round-robin. Signed-off-by; YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-29 18:33:22 -07:00
Patrick McHardy	e4a79ef811	[NETFILTER]: ip6_tables: remove broken comefrom debugging The introduction of x_tables broke comefrom debugging, remove it from ip6_tables as well (ip_tables already got removed). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-24 17:27:32 -07:00
YOSHIFUJI Hideaki	b809739a1b	[IPV6]: Clean up hop-by-hop options handler. - Removed unused argument (nhoff) for ipv6_parse_hopopts(). - Make ipv6_parse_hopopts() to align with other extension header handlers. - Removed pointless assignment (hdr), which is not used afterwards. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-18 15:57:53 -07:00
YOSHIFUJI Hideaki	e5d25a9088	[IPV6] XFRM: Fix decoding session with preceding extension header(s). We did not correctly decode session with preceding extension header(s). This was because we had already pulled preceding headers, skb->nh.raw + 40 + 1 - skb->data was minus, and pskb_may_pull() failed. We now have IP6CB(skb)->nhoff and skb->h.raw, and we can start parsing / decoding upper layer protocol from current position. Tracked down by Noriaki TAKAMIYA <takamiya@po.ntts.co.jp> and tested by Kazunori Miyazawa <kazunori@miyazawa.org>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-18 15:57:52 -07:00
YOSHIFUJI Hideaki	e3cae904d7	[IPV6] XFRM: Don't use old copy of pointer after pskb_may_pull(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-18 15:57:51 -07:00
YOSHIFUJI Hideaki	ec6700958a	[IPV6]: Ensure to have hop-by-hop options in our header of &sk_buff. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-18 15:57:50 -07:00
Zach Brown	f6596f9d2b	[IPv6] reassembly: Always compute hash under the fragment lock. This closes a race where an ipq6hashfn() caller could get a hash value and race with the cycling of the random seed. By the time they got to the read_lock they'd have a stale hash value and might not find previous fragments of their datagram. This matches the previous patch to IPv4. Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-11 17:21:05 -07:00
KAMEZAWA Hiroyuki	6f91204225	[PATCH] for_each_possible_cpu: network codes for_each_cpu() actually iterates across all possible CPUs. We've had mistakes in the past where people were using for_each_cpu() where they should have been iterating across only online or present CPUs. This is inefficient and possibly buggy. We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the future. This patch replaces for_each_cpu with for_each_possible_cpu under /net Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:31 -07:00
Denis Vlasenko	b1a7ffcb7a	[IPV6]: Deinline few large functions in inet6 code Deinline a few functions which produce 200+ bytes of code. Size Uses Wasted Name and definition ===== ==== ====== ================================================ 429 3 818 __inet6_lookup include/net/inet6_hashtables.h 404 2 384 __inet6_lookup_established include/net/inet6_hashtables.h 206 3 372 __inet6_hash include/net/inet6_hashtables.h Signed-off-by: Denis Vlasenko <vda@ilport.com.ua> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:48:59 -07:00
Brian Haley	503e4faad1	[NETFILTER]: Fix build with CONFIG_NETFILTER=y/m on IA64 Can't build with CONFIG_NETFILTER=y/m on IA64, there's a missing #include in net/ipv6/netfilter.c net/ipv6/netfilter.c: In function `nf_ip6_checksum': net/ipv6/netfilter.c:92: warning: implicit declaration of function `csum_ipv6_magic' Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:49 -07:00
Patrick McHardy	96f6bf82ea	[NETFILTER]: Convert conntrack/ipt_REJECT to new checksumming functions Besides removing lots of duplicate code, all converted users benefit from improved HW checksum error handling. Tested with and without HW checksums in almost all combinations. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:42 -07:00
Patrick McHardy	422c346fad	[NETFILTER]: Add address family specific checksum helpers Add checksum operation which takes care of verifying the checksum and dealing with HW checksum errors and avoids multiple checksum operations by setting ip_summed to CHECKSUM_UNNECESSARY after successful verification. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:41 -07:00
Patrick McHardy	bce8032ef3	[NETFILTER]: Introduce infrastructure for address family specific operations Change the queue rerouter intrastructure to a generic usable infrastructure for address family specific operations as a base for some cleanups. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:40 -07:00
Patrick McHardy	32292a7ff1	[NETFILTER]: Fix section mismatch warnings Fix section mismatch warnings caused by netfilter's init_or_cleanup functions used in many places by splitting the init from the cleanup parts. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:34 -07:00
Patrick McHardy	964ddaa10d	[NETFILTER]: Clean up hook registration Clean up hook registration by makeing use of the new mass registration and unregistration helpers. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:33 -07:00
Herbert Xu	45af08be6d	[INET]: Use port unreachable instead of proto for tunnels This patch changes GRE and SIT to generate port unreachable instead of protocol unreachable errors when we can't find a matching tunnel for a packet. This removes the ambiguity as to whether the error is caused by no tunnel being found or by the lack of support for the given tunnel type. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:29 -07:00
Herbert Xu	50fba2aa7c	[INET]: Move no-tunnel ICMP error to tunnel4/tunnel6 This patch moves the sending of ICMP messages when there are no IPv4/IPv6 tunnels present to tunnel4/tunnel6 respectively. Please note that for now if xfrm4_tunnel/xfrm6_tunnel is loaded then no ICMP messages will ever be sent. This is similar to how we handle AH/ESP/IPCOMP. This move fixes the bug where we always send an ICMP message when there is no ip6_tunnel device present for a given packet even if it is later handled by IPsec. It also causes ICMP messages to be sent when no IPIP tunnel is present. I've decided to use the "port unreachable" ICMP message over the current value of "address unreachable" (and "protocol unreachable" by GRE) because it is not ambiguous unlike the other ones which can be triggered by other conditions. There seems to be no standard specifying what value must be used so this change should be OK. In fact we should change GRE to use this value as well. Incidentally, this patch also fixes a fairly serious bug in xfrm6_tunnel where we don't check whether the embedded IPv6 header is present before dereferencing it for the inside source address. This patch is inspired by a previous patch by Hugo Santos <hsantos@av.it.pt>. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-09 22:25:25 -07:00
Yasuyuki Kozakai	a89ecb6a2e	[NETFILTER]: x_tables: unify IPv4/IPv6 multiport match This unifies ipt_multiport and ip6t_multiport to xt_multiport. As a result, this addes support for inversion and port range match to IPv6 packets. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-01 02:22:54 -08:00
Yasuyuki Kozakai	dc5ab2faec	[NETFILTER]: x_tables: unify IPv4/IPv6 esp match This unifies ipt_esp and ip6t_esp to xt_esp. Please note that now a user program needs to specify IPPROTO_ESP as protocol to use esp match with IPv6. This means that ip6tables requires '-p esp' like iptables. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-01 02:22:30 -08:00
Herbert Xu	dbe5b4aaaf	[IPSEC]: Kill unused decap state structure This patch removes the *_decap_state structures which were previously used to share state between input/post_input. This is no longer needed. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-01 00:54:16 -08:00
Herbert Xu	e695633e21	[IPSEC]: Kill unused decap state argument This patch removes the decap_state argument from the xfrm input hook. Previously this function allowed the input hook to share state with the post_input hook. The latter has since been removed. The only purpose for it now is to check the encap type. However, it is easier and better to move the encap type check to the generic xfrm_rcv function. This allows us to get rid of the decap state argument altogether. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-04-01 00:52:46 -08:00
Andrew Morton	65b4b4e81a	[NETFILTER]: Rename init functions. Every netfilter module uses `init' for its module_init() function and `fini' or `cleanup' for its module_exit() function. Problem is, this creates uninformative initcall_debug output and makes ctags rather useless. So go through and rename them all to $(filename)_init and $(filename)_fini. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-28 17:02:48 -08:00
Herbert Xu	d2acc3479c	[INET]: Introduce tunnel4/tunnel6 Basically this patch moves the generic tunnel protocol stuff out of xfrm4_tunnel/xfrm6_tunnel and moves it into the new files of tunnel4.c and tunnel6 respectively. The reason for this is that the problem that Hugo uncovered is only the tip of the iceberg. The real problem is that when we removed the dependency of ipip on xfrm4_tunnel we didn't really consider the module case at all. For instance, as it is it's possible to build both ipip and xfrm4_tunnel as modules and if the latter is loaded then ipip simply won't load. After considering the alternatives I've decided that the best way out of this is to restore the dependency of ipip on the non-xfrm-specific part of xfrm4_tunnel. This is acceptable IMHO because the intention of the removal was really to be able to use ipip without the xfrm subsystem. This is still preserved by this patch. So now both ipip/xfrm4_tunnel depend on the new tunnel4.c which handles the arbitration between the two. The order of processing is determined by a simple integer which ensures that ipip gets processed before xfrm4_tunnel. The situation for ICMP handling is a little bit more complicated since we may not have enough information to determine who it's for. It's not a big deal at the moment since the xfrm ICMP handlers are basically no-ops. In future we can deal with this when we look at ICMP caching in general. The user-visible change to this is the removal of the TUNNEL Kconfig prompts. This makes sense because it can only be used through IPCOMP as it stands. The addition of the new modules shouldn't introduce any problems since module dependency will cause them to be loaded. Oh and I also turned some unnecessary pskb's in IPv6 related to this patch to skb's. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-28 17:02:46 -08:00
Linus Torvalds	fdccffc6b7	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [NET]: drop duplicate assignment in request_sock [IPSEC]: Fix tunnel error handling in ipcomp6	2006-03-27 08:47:29 -08:00
Alan Stern	e041c68341	[PATCH] Notifier chain update: API changes The kernel's implementation of notifier chains is unsafe. There is no protection against entries being added to or removed from a chain while the chain is in use. The issues were discussed in this thread: http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2 We noticed that notifier chains in the kernel fall into two basic usage classes: "Blocking" chains are always called from a process context and the callout routines are allowed to sleep; "Atomic" chains can be called from an atomic context and the callout routines are not allowed to sleep. We decided to codify this distinction and make it part of the API. Therefore this set of patches introduces three new, parallel APIs: one for blocking notifiers, one for atomic notifiers, and one for "raw" notifiers (which is really just the old API under a new name). New kinds of data structures are used for the heads of the chains, and new routines are defined for registration, unregistration, and calling a chain. The three APIs are explained in include/linux/notifier.h and their implementation is in kernel/sys.c. With atomic and blocking chains, the implementation guarantees that the chain links will not be corrupted and that chain callers will not get messed up by entries being added or removed. For raw chains the implementation provides no guarantees at all; users of this API must provide their own protections. (The idea was that situations may come up where the assumptions of the atomic and blocking APIs are not appropriate, so it should be possible for users to handle these things in their own way.) There are some limitations, which should not be too hard to live with. For atomic/blocking chains, registration and unregistration must always be done in a process context since the chain is protected by a mutex/rwsem. Also, a callout routine for a non-raw chain must not try to register or unregister entries on its own chain. (This did happen in a couple of places and the code had to be changed to avoid it.) Since atomic chains may be called from within an NMI handler, they cannot use spinlocks for synchronization. Instead we use RCU. The overhead falls almost entirely in the unregister routine, which is okay since unregistration is much less frequent that calling a chain. Here is the list of chains that we adjusted and their classifications. None of them use the raw API, so for the moment it is only a placeholder. ATOMIC CHAINS ------------- arch/i386/kernel/traps.c: i386die_chain arch/ia64/kernel/traps.c: ia64die_chain arch/powerpc/kernel/traps.c: powerpc_die_chain arch/sparc64/kernel/traps.c: sparc64die_chain arch/x86_64/kernel/traps.c: die_chain drivers/char/ipmi/ipmi_si_intf.c: xaction_notifier_list kernel/panic.c: panic_notifier_list kernel/profile.c: task_free_notifier net/bluetooth/hci_core.c: hci_notifier net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_chain net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_expect_chain net/ipv6/addrconf.c: inet6addr_chain net/netfilter/nf_conntrack_core.c: nf_conntrack_chain net/netfilter/nf_conntrack_core.c: nf_conntrack_expect_chain net/netlink/af_netlink.c: netlink_chain BLOCKING CHAINS --------------- arch/powerpc/platforms/pseries/reconfig.c: pSeries_reconfig_chain arch/s390/kernel/process.c: idle_chain arch/x86_64/kernel/process.c idle_notifier drivers/base/memory.c: memory_chain drivers/cpufreq/cpufreq.c cpufreq_policy_notifier_list drivers/cpufreq/cpufreq.c cpufreq_transition_notifier_list drivers/macintosh/adb.c: adb_client_list drivers/macintosh/via-pmu.c sleep_notifier_list drivers/macintosh/via-pmu68k.c sleep_notifier_list drivers/macintosh/windfarm_core.c wf_client_list drivers/usb/core/notify.c usb_notifier_list drivers/video/fbmem.c fb_notifier_list kernel/cpu.c cpu_chain kernel/module.c module_notify_list kernel/profile.c munmap_notifier kernel/profile.c task_exit_notifier kernel/sys.c reboot_notifier_list net/core/dev.c netdev_chain net/decnet/dn_dev.c: dnaddr_chain net/ipv4/devinet.c: inetaddr_chain It's possible that some of these classifications are wrong. If they are, please let us know or submit a patch to fix them. Note that any chain that gets called very frequently should be atomic, because the rwsem read-locking used for blocking chains is very likely to incur cache misses on SMP systems. (However, if the chain's callout routines may sleep then the chain cannot be atomic.) The patch set was written by Alan Stern and Chandra Seetharaman, incorporating material written by Keith Owens and suggestions from Paul McKenney and Andrew Morton. [jes@sgi.com: restructure the notifier chain initialization macros] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-27 08:44:50 -08:00
Herbert Xu	6abaaaae6d	[IPSEC]: Fix tunnel error handling in ipcomp6 The error handling in ipcomp6_tunnel_create is broken in two ways: 1) If we fail to allocate an SPI (this should never happen in practice since there are plenty of 32-bit SPI values for us to use), we will still go ahead and create the SA. 2) When xfrm_init_state fails, we first of all may trigger the BUG_TRAP in __xfrm_state_destroy because we didn't set the state to DEAD. More importantly we end up returning the freed state as if we succeeded! This patch fixes them both. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-26 17:37:54 -08:00
Patrick McHardy	b30bd282cb	[IPV6]: ip6_xmit: remove unnecessary NULL ptr check The sk argument to ip6_xmit is never NULL nowadays since the skb->priority assigment expects a valid socket. Coverity #354 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-23 01:17:25 -08:00
Pablo Neira Ayuso	b9f78f9fca	[NETFILTER]: nf_conntrack: support for layer 3 protocol load on demand x_tables matches and targets that require nf_conntrack_ipv[4\|6] to work don't have enough information to load on demand these modules. This patch introduces the following changes to solve this issue: o nf_ct_l3proto_try_module_get: try to load the layer 3 connection tracker module and increases the refcount. o nf_ct_l3proto_module put: drop the refcount of the module. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-22 13:56:08 -08:00
Pablo Neira Ayuso	a45049c51c	[NETFILTER]: x_tables: set the protocol family in x_tables targets/matches Set the family field in xt_[matches\|targets] registered. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-22 13:55:40 -08:00
Patrick McHardy	443da0d527	[NETFILTER]: Fix ip6tables breakage from {get,set}sockopt compat layer do_ipv6_getsockopt returns -EINVAL for unknown options, not -ENOPROTOOPT as do_ipv6_setsockopt. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-22 13:53:20 -08:00
Ingo Oeser	322f74a432	[IPV6]: Cleanups for net/ipv6/addrconf.c (kzalloc, early exit) v2 Here are some possible (and trivial) cleanups. - use kzalloc() where possible - invert allocation failure test like if (object) { /* Rest of function here / } to if (object == NULL) return NULL; / Rest of function here */ Signed-off-by: Ingo Oeser <ioe-lkml@rameria.de> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 23:01:47 -08:00
Ingo Oeser	0c600eda4b	[IPV6]: Nearly complete kzalloc cleanup for net/ipv6 Stupidly use kzalloc() instead of kmalloc()/memset() everywhere where this is possible in net/ipv6/*.c . Signed-off-by: Ingo Oeser <ioe-lkml@rameria.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 23:01:32 -08:00
Ingo Oeser	78c784c47a	[IPV6]: Cleanup of net/ipv6/reassambly.c Two minor cleanups: 1. Using kzalloc() in fraq_alloc_queue() saves the memset() in ipv6_frag_create(). 2. Invert sense of if-statements to streamline code. Inverts the comment, too. Signed-off-by: Ingo Oeser <ioe-lkml@rameria.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 23:01:17 -08:00
Arnaldo Carvalho de Melo	543d9cfeec	[NET]: Identation & other cleanups related to compat_[gs]etsockopt cset No code changes, just tidying up, in some cases moving EXPORT_SYMBOLs to just after the function exported, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:48:35 -08:00
Dmitry Mishin	3fdadf7d27	[NET]: {get\|set}sockopt compatibility layer This patch extends {get\|set}sockopt compatibility layer in order to move protocol specific parts to their place and avoid huge universal net/compat.c file in the future. Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:45:21 -08:00
Dave Jones	c750360938	[IPV6]: remove useless test in ip6_append_data We've already dereferenced 'np' a dozen times at this point, so it's safe to say it's not null. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:44:52 -08:00
Ingo Molnar	57b47a53ec	[NET]: sem2mutex part 2 Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:35:41 -08:00
Arjan van de Ven	4a3e2f711a	[NET] sem2mutex: net/ Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:33:17 -08:00
Arnaldo Carvalho de Melo	c4d9390941	[ICSK]: Introduce inet_csk_ctl_sock_create Consolidating open coded sequences in tcp and dccp, v4 and v6. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 22:01:03 -08:00
David S. Miller	d76e60a5b5	[IPV6]: Fix some code/comment formatting in ip6_dst_output(). Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 21:35:50 -08:00
Jamal Hadi Salim	9500e8a81f	[IPSEC]: Sync series - fast path Fast path sequence updates that will generate ipsec async events Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 19:15:29 -08:00
Patrick McHardy	c4b8851392	[NETFILTER]: x_tables: replace IPv4/IPv6 policy match by address family independant version Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 18:03:40 -08:00
Patrick McHardy	f2ffd9eeda	[NETFILTER]: Move ip6_masked_addrcmp to include/net/ipv6.h Replace netfilter's ip6_masked_addrcmp by a more efficient version in include/net/ipv6.h to make it usable without module dependencies. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 18:03:16 -08:00
Patrick McHardy	c498673474	[NETFILTER]: x_tables: add xt_{match,target} arguments to match/target functions Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 18:02:56 -08:00
Patrick McHardy	1c524830d0	[NETFILTER]: x_tables: pass registered match/target data to match/target functions This allows to make decisions based on the revision (and address family with a follow-up patch) at runtime. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 18:02:15 -08:00
Patrick McHardy	7f9397138e	[NETFILTER]: Convert ip6_tables matches/targets to centralized error checking Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 18:01:43 -08:00
Patrick McHardy	3cdc7c953e	[NETFILTER]: Change {ip,ip6,arp}_tables to use centralized error checking Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 18:00:36 -08:00
Yasuyuki Kozakai	6ea46c9c12	[NETFILTER]: nf_conntrack: use ipv6_addr_equal in nf_ct_reasm Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:58:44 -08:00
Harald Welte	dc808fe28d	[NETFILTER] nf_conntrack: clean up to reduce size of 'struct nf_conn' This patch moves all helper related data fields of 'struct nf_conn' into a separate structure 'struct nf_conn_help'. This new structure is only present in conntrack entries for which we actually have a helper loaded. Also, this patch cleans up the nf_conntrack 'features' mechanism to resemble what the original idea was: Just glue the feature-specific data structures at the end of 'struct nf_conn', and explicitly re-calculate the pointer to it when needed rather than keeping pointers around. Saves 20 bytes per conntrack on my x86_64 box. A non-helped conntrack is 276 bytes. We still need to save another 20 bytes in order to fit into to target of 256bytes. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:56:32 -08:00
John Heffner	5d424d5a67	[TCP]: MTU probing Implementation of packetization layer path mtu discovery for TCP, based on the internet-draft currently found at <http://www.ietf.org/internet-drafts/draft-ietf-pmtud-method-05.txt>. Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:53:41 -08:00
Jesper Juhl	2b191befe2	[IPCOMP6]: don't check vfree() argument for NULL. vfree does it's own NULL checking, so checking a pointer before handing it to vfree is pointless. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:46:29 -08:00
YOSHIFUJI Hideaki	e843b9e1be	[IPV6]: ROUTE: Ensure to accept redirects from nexthop for the target. It is possible to get redirects from nexthop of "more-specific" routes. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:07:49 -08:00
YOSHIFUJI Hideaki	09c884d4c3	[IPV6]: ROUTE: Add accept_ra_rt_info_max_plen sysctl. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:07:03 -08:00
YOSHIFUJI Hideaki	e317da9622	[IPV6]: ROUTE: Flag RTF_DEFAULT for Route Infomation for ::/0. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:06:42 -08:00
YOSHIFUJI Hideaki	70ceb4f539	[IPV6]: ROUTE: Add experimental support for Route Information Option in RA (RFC4191). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:06:24 -08:00
YOSHIFUJI Hideaki	52e1635631	[IPV6]: ROUTE: Add router_probe_interval sysctl. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:05:47 -08:00
YOSHIFUJI Hideaki	930d6ff2e2	[IPV6]: ROUTE: Add accept_ra_rtr_pref sysctl. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:05:30 -08:00
YOSHIFUJI Hideaki	270972554c	[IPV6]: ROUTE: Add Router Reachability Probing (RFC4191). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:05:13 -08:00
YOSHIFUJI Hideaki	ebacaaa0fd	[IPV6]: ROUTE: Add support for Router Preference (RFC4191). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:04:53 -08:00
YOSHIFUJI Hideaki	8238dd0698	[IPV6]: ROUTE: Handle finding the next best route in reachability in BACKTRACK(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:04:35 -08:00
YOSHIFUJI Hideaki	bb133964e0	[IPV6]: ROUTE: Try finding the next best route. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:01:43 -08:00
YOSHIFUJI Hideaki	1ddef044ed	[IPV6]: ROUTE: Clean up rt6_select() code path in ip6_route_{intput,output}(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:01:24 -08:00
YOSHIFUJI Hideaki	118f8c1654	[IPV6]: ROUTE: Try selecting better route for non-default routes as well. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:01:06 -08:00
YOSHIFUJI Hideaki	045927ff84	[IPV6]: ROUTE: More strict check for default routers in rt6_get_dflt_router(). Check RTF_ADDRCONF\|RTF_DEFAULT in rt6_get_dflt_router(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:00:48 -08:00
YOSHIFUJI Hideaki	554cfb7ee5	[IPV6]: ROUTE: Eliminate lock for default route pointer. And prepare for more advanced router selection. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:00:26 -08:00
YOSHIFUJI Hideaki	519fbd8715	[IPV6]: ROUTE: Clean-up cow'ing in ip6_route_{intput,output}(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 17:00:05 -08:00
YOSHIFUJI Hideaki	e40cf3533c	[IPV6]: ROUTE: Convert rt6_cow() to rt6_alloc_cow(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:59:27 -08:00
YOSHIFUJI Hideaki	fb9de91ea8	[IPV6]: ROUTE: Clean up reference counting / unlocking for returning object. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:59:08 -08:00
YOSHIFUJI Hideaki	d5315b500b	[IPV6]: ROUTE: Unify two code paths for pmtu disc. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:58:48 -08:00
YOSHIFUJI Hideaki	299d993908	[IPV6]: ROUTE: Add rt6_alloc_clone() for cloning route allocation. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:58:32 -08:00
YOSHIFUJI Hideaki	76f9edd17d	[IPV6]: ROUTE: Copy u.dst.error for RTF_REJECT routes when cloning. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:56:50 -08:00
YOSHIFUJI Hideaki	a1e783634a	[IPV6]: ROUTE: Set appropriate information before inserting a route. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:56:32 -08:00
YOSHIFUJI Hideaki	95a9a5ba02	[IPV6]: ROUTE: Split up rt6_cow() for future changes. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:55:51 -08:00
YOSHIFUJI Hideaki	c4fd30eb18	[IPV6]: ADDRCONF: Add accept_ra_pinfo sysctl. This controls whether we accept Prefix Information in RAs. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:55:26 -08:00
YOSHIFUJI Hideaki	65f5c7c114	[IPV6]: ROUTE: Add accept_ra_defrtr sysctl. This controls whether we accept default router information in RAs. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:55:08 -08:00
YOSHIFUJI Hideaki	073a8e0e15	[IPV6]: ADDRCONF: Split up ipv6_generate_eui64() by device type. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:54:49 -08:00
YOSHIFUJI Hideaki	955189efb4	[IPV6]: ADDRCONF: Use our standard algorithm for randomized ifid. RFC 3041 describes an algorithm to generate random interface identifier. In RFC 3041bis, it is allowed to use different algorithm than one described in RFC 3041. So, let's use our standard pseudo random algorithm to simplify our implementation. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:54:09 -08:00
YOSHIFUJI Hideaki	74a3a0ed90	[IPV6]: TUNNEL6: Don't try to add multicast route twice. Since addrconf_add_dev() has already called addrconf_add_mroute() to added route for multicast prefix, there's no point to call it again in addrconf_ip6_tnl_config(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-20 16:51:48 -08:00
Herbert Xu	3759fa9c55	[TCP]: Fix zero port problem in IPv6 When we link a socket into the hash table, we need to make sure that we set the num/port fields so that it shows us with a non-zero port value in proc/netlink and on the wire. This code and comment is copied over from the IPv4 stack as is. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2006-03-13 14:26:12 -08:00
Patrick McHardy	baa829d892	[IPV4/6]: Fix UFO error propagation When ufo_append_data fails err is uninitialized, but returned back. Strangely gcc doesn't notice it. Coverity #901 and #902 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-12 20:39:40 -08:00
Patrick McHardy	f8dc01f543	[XFRM]: Fix leak in ah6_input tmp_hdr is not freed when ipv6_clear_mutable_options fails. Coverity #650 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-12 20:39:37 -08:00
Brian Haley	0d27b42739	[IPV6]: fix ipv6_saddr_score struct element The scope element in the ipv6_saddr_score struct used in ipv6_dev_get_saddr() is an unsigned integer, but __ipv6_addr_src_scope() returns a signed integer (and can return -1). Signed-off-by: Brian Haley <brian.haley@hp.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-11 18:50:14 -08:00
Thomas Graf	850a9a4e3c	[NETFILTER] ip_queue: Fix wrong skb->len == nlmsg_len assumption The size of the skb carrying the netlink message is not equivalent to the length of the actual netlink message due to padding. ip_queue matches the length of the payload against the original packet size to determine if packet mangling is desired, due to the above wrong assumption arbitary packets may not be mangled depening on their original size. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-03-07 14:56:12 -08:00
Patrick McHardy	bafac2a512	[NETFILTER]: Restore {ipt,ip6t,ebt}_LOG compatibility The nfnetlink_log infrastructure changes broke compatiblity of the LOG targets. They currently use whatever log backend was registered first, which means that if ipt_ULOG was loaded first, no messages will be printed to the ring buffer anymore. Restore compatiblity by using the old log functions by default and only use the nf_log backend if the user explicitly said so. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-27 13:04:17 -08:00
YOSHIFUJI Hideaki	d91675f9c7	[IPV6]: Do not ignore IPV6_MTU socket option. Based on patch by Hoerdt Mickael <hoerdt@clarinet.u-strasbg.fr>. Signed-off-by: YOSHIFUJI Hideaki <yosufuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-24 13:18:33 -08:00
Hugo Santos	0c0888908d	[IPV6] ip6_tunnel: release cached dst on change of tunnel params The included patch fixes ip6_tunnel to release the cached dst entry when the tunnel parameters (such as tunnel endpoints) are changed so they are used immediatly for the next encapsulated packets. Signed-off-by: Hugo Santos <hsantos@av.it.pt> Acked-by: Ville Nuorvala <vnuorval@tcs.hut.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-24 13:16:25 -08:00
Al Viro	cc6cdac0cf	[PATCH] missing ntohs() in ip6_tunnel ->payload_len is net-endian Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-02-18 16:02:18 -05:00
Yasuyuki Kozakai	763ecff187	[NETFILTER]: nf_conntrack: attach conntrack to locally generated ICMPv6 error Locally generated ICMPv6 errors should be associated with the conntrack of the original packet. Since the conntrack entry may not be in the hash tables (for the first packet), it must be manually attached. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-15 15:24:15 -08:00
Yasuyuki Kozakai	08857fa745	[NETFILTER]: nf_conntrack: attach conntrack to TCP RST generated by ip6t_REJECT TCP RSTs generated by the REJECT target should be associated with the conntrack of the original TCP packet. Since the conntrack entry is usually not is the hash tables, it must be manually attached. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-15 15:23:28 -08:00
Nicolas DICHTEL	6d3e85ecf2	[IPV6] Don't store dst_entry for RAW socket Signed-off-by: Nicolas DICHTEL <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-13 15:56:13 -08:00
Kristian Slavov	9908104935	[IPV6]: Address autoconfiguration does not work after device down/up cycle If you set network interface down and up again, the IPv6 address autoconfiguration does not work. 'ip addr' shows that the link-local address is in tentative state. We don't even react to periodical router advertisements. During NETDEV_DOWN we clear IF_READY, and we don't set it back in NETDEV_UP. While starting to perform DAD on the link-local address, we notice that the device is not in IF_READY, and we abort autoconfiguration process (which would eventually send router solicitations). Acked-by: Juha-Matti Tapio <jmtapio@verkkotelakka.net> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-08 16:13:28 -08:00
Al Viro	e80e28b6b6	[PATCH] net/ipv6/mcast.c NULL noise removal Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-02-07 20:58:56 -05:00
Al Viro	1b8623545b	[PATCH] remove bogus asm/bug.h includes. A bunch of asm/bug.h includes are both not needed (since it will get pulled anyway) and bogus (since they are done too early). Removed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-02-07 20:56:35 -05:00
Linus Torvalds	98bd0c07b6	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2006-02-05 11:10:29 -08:00
Eric Dumazet	88a2a4ac6b	[PATCH] percpu data: only iterate over possible CPUs percpu_data blindly allocates bootmem memory to store NR_CPUS instances of cpudata, instead of allocating memory only for possible cpus. As a preparation for changing that, we need to convert various 0 -> NR_CPUS loops to use for_each_cpu(). (The above only applies to users of asm-generic/percpu.h. powerpc has gone it alone and is presently only allocating memory for present CPUs, so it's currently corrupting memory). Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: James Bottomley <James.Bottomley@steeleye.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Jens Axboe <axboe@suse.de> Cc: Anton Blanchard <anton@samba.org> Acked-by: William Irwin <wli@holomorphy.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-02-05 11:06:51 -08:00
Patrick McHardy	0047c65a60	[NETFILTER]: Prepare {ipt,ip6t}_policy match for x_tables unification The IPv4 and IPv6 version of the policy match are identical besides address comparison and the data structure used for userspace communication. Unify the data structures to break compatiblity now (before it is released), so we can port it to x_tables in 2.6.17. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-04 23:51:28 -08:00
Patrick McHardy	878c41ce57	[NETFILTER]: Fix ip6t_policy address matching Fix two bugs in ip6t_policy address matching: - misorder arguments to ip6_masked_addrcmp, mask must be the second argument - inversion incorrectly applied to the entire expression instead of just the address comparison Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-04 23:51:27 -08:00
Patrick McHardy	e55f1bc5dc	[NETFILTER]: Check policy length in policy match strict mode Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-04 23:51:26 -08:00
Kirill Korotaev	ee4bb818ae	[NETFILTER]: Fix possible overflow in netfilters do_replace() netfilter's do_replace() can overflow on addition within SMP_ALIGN() and/or on multiplication by NR_CPUS, resulting in a buffer overflow on the copy_from_user(). In practice, the overflow on addition is triggerable on all systems, whereas the multiplication one might require much physical memory to be present due to the check above. Either is sufficient to overwrite arbitrary amounts of kernel memory. I really hate adding the same check to all 4 versions of do_replace(), but the code is duplicate... Found by Solar Designer during security audit of OpenVZ.org Signed-Off-By: Kirill Korotaev <dev@openvz.org> Signed-Off-By: Solar Designer <solar@openwall.com> Signed-off-by: Patrck McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-04 23:51:25 -08:00
Herbert Xu	6f4b6ec1cf	[IPV6]: Fix illegal dst locking in softirq context. On Tue, Jan 31, 2006 at 10:24:32PM +0100, Ingo Molnar wrote: > > [<c04de9e8>] _write_lock+0x8/0x10 > [<c0499015>] inet6_destroy_sock+0x25/0x100 > [<c04b8672>] tcp_v6_destroy_sock+0x12/0x20 > [<c046bbda>] inet_csk_destroy_sock+0x4a/0x150 > [<c047625c>] tcp_rcv_state_process+0xd4c/0xdd0 > [<c047d8e9>] tcp_v4_do_rcv+0xa9/0x340 > [<c047eabb>] tcp_v4_rcv+0x8eb/0x9d0 OK this is definitely broken. We should never touch the dst lock in softirq context. Since inet6_destroy_sock may be called from that context due to the asynchronous nature of sockets, we can't take the lock there. In fact this sk_dst_reset is totally redundant since all IPv6 sockets use inet_sock_destruct as their socket destructor which always cleans up the dst anyway. So the solution is to simply remove the call. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-02 17:01:13 -08:00
Herbert Xu	4641e7a334	[IPV6]: Don't hold extra ref count in ipv6_ifa_notify Currently the logic in ipv6_ifa_notify is to hold an extra reference count for addrconf dst's that get added to the routing table. Thus, when addrconf dst entries are taken out of the routing table, we need to drop that dst. However, addrconf dst entries may be removed from the routing table by means other than __ipv6_ifa_notify. So we're faced with the choice of either fixing up all places where addrconf dst entries are removed, or dropping the extra reference count altogether. I chose the latter because the ifp itself always holds a dst reference count of 1 while it's alive. This is dropped just before we kfree the ifp object. Therefore we know that in __ipv6_ifa_notify we will always hold that count. This bug was found by Eric W. Biederman. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-02-02 16:55:45 -08:00
Eric W. Biederman	78b910429e	[IPV6] tcp_v6_send_synack: release the destination This patch fix dst reference counting in tcp_v6_send_synack Analysis: Currently tcp_v6_send_synack is never called with a dst entry so dst always comes in as NULL. ip6_dst_lookup calls ip6_route_output which calls dst_hold before it returns the dst entry. Neither xfrm_lookup nor tcp_make_synack consume the dst entry so we still have a dst_entry with a bumped refrence count at the end of this function. Therefore we need to call dst_release just before we return just like tcp_v4_send_synack does. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-31 17:51:44 -08:00
David L Stevens	7add2a4398	[IPV6] MLDv2: fix change records when transitioning to/from inactive The following patch fixes these problems in MLDv2: 1) Add/remove "delete" records for sending change reports when addition of a filter results in that filter transitioning to/from inactive. [same as recent IPv4 IGMPv3 fix] 2) Remove 2 redundant "group_type" checks (can't be IPV6_ADDR_ANY within that loop, so checks are always true) 3) change an is_in() "return 0" to "return type == MLD2_MODE_IS_INCLUDE". It should always be "0" to get here, but it improves code locality to not assume it, and if some race allowed otherwise, doing the check would return the correct result. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-24 13:06:39 -08:00
Yasuyuki Kozakai	f0daaa654a	[NETFILTER] ip6tables: whitespace and indent cosmetic cleanup Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-17 02:39:39 -08:00
Yasuyuki Kozakai	6dd42af790	[NETFILTER] Makefile cleanup These are replaced with x_tables matches and no longer exist. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-17 02:38:56 -08:00
Benoit Boissinot	ccc91324a1	[NETFILTER] ip[6]t_policy: Fix compilation warnings ip[6]t_policy argument conversion slipped when merging with x_tables Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-17 02:26:34 -08:00
YOSHIFUJI Hideaki	9343e79a7b	[IPV6]: Preserve procfs IPV6 address output format Procfs always output IPV6 addresses without the colon characters, and we cannot change that. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-17 02:10:53 -08:00
Patrick McHardy	ee51b1b6ce	[XFRM]: IPsec tunnel wildcard address support When the source address of a tunnel is given as 0.0.0.0 do a routing lookup to get the real source address for the destination and fill that into the acquire message. This allows to specify policies like this: spdadd 172.16.128.13/32 172.16.0.0/20 any -P out ipsec esp/tunnel/0.0.0.0-x.x.x.x/require; spdadd 172.16.0.0/20 172.16.128.13/32 any -P in ipsec esp/tunnel/x.x.x.x-0.0.0.0/require; Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-13 14:34:36 -08:00
Joe Perches	46b86a2da0	[NET]: Use NIP6_FMT in kernel.h There are errors and inconsistency in the display of NIP6 strings. ie: net/ipv6/ip6_flowlabel.c There are errors and inconsistency in the display of NIPQUAD strings too. ie: net/netfilter/nf_conntrack_ftp.c This patch: adds NIP6_FMT to kernel.h changes all code to use NIP6_FMT fixes net/ipv6/ip6_flowlabel.c adds NIPQUAD_FMT to kernel.h fixes net/netfilter/nf_conntrack_ftp.c changes a few uses of "%u.%u.%u.%u" to NIPQUAD_FMT for symmetry to NIP6_FMT Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-13 14:29:07 -08:00
Harald Welte	2e4e6a17af	[NETFILTER] x_tables: Abstraction layer for {ip,ip6,arp}_tables This monster-patch tries to do the best job for unifying the data structures and backend interfaces for the three evil clones ip_tables, ip6_tables and arp_tables. In an ideal world we would never have allowed this kind of copy+paste programming... but well, our world isn't (yet?) ideal. o introduce a new x_tables module o {ip,arp,ip6}_tables depend on this x_tables module o registration functions for tables, matches and targets are only wrappers around x_tables provided functions o all matches/targets that are used from ip_tables and ip6_tables are now implemented as xt_FOOBAR.c files and provide module aliases to ipt_FOOBAR and ip6t_FOOBAR o header files for xt_matches are in include/linux/netfilter/, include/linux/netfilter_{ipv4,ipv6} contains compatibility wrappers around the xt_FOOBAR.h headers Based on this patchset we're going to further unify the code, gradually getting rid of all the layer 3 specific assumptions. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-12 14:06:43 -08:00
Randy Dunlap	4fc268d24c	[PATCH] capable/capability.h (net/) net: Use <linux/capability.h> where capable() is used. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-11 18:42:14 -08:00
Kris Katterjohn	8b3a70058b	[NET]: Remove more unneeded typecasts on *malloc() This removes more unneeded casts on the return value for kmalloc(), sock_kmalloc(), and vmalloc(). Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-11 16:32:14 -08:00
David Woodhouse	ae0f7d5f83	[IPV6]: Avoid calling ip6_xmit() with NULL sk The ip6_xmit() function now assumes that its sk argument is non-NULL, which isn't currently true when TCPv6 code is sending RST or ACK packets. This fixes that code to use a socket of its own for sending such packets, as TCPv4 does. (Thanks Andi for the pointer). Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-11 16:32:13 -08:00
David S. Miller	82bf7e97ac	[NET]: Some more missing include/etherdevice.h includes For compare_ether_addr() Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-11 16:32:11 -08:00
David S. Miller	5bf887f2ff	[IPV6]: Fix modular build with netfilter enabled. Also, drop __exit marker from ipv6_netfilter_fini() as this can be invoked from inet6_init() error handling paths. Based upon a report from Stephen Hemminger. Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-10 21:02:21 -08:00
Patrick McHardy	babbdb1a18	[NETFILTER]: Fix timeout sysctls on big-endian 64bit architectures The connection tracking timeout variables are unsigned long, but proc_dointvec_jiffies is used with sizeof(unsigned int) in the sysctl tables. Since there is no proc_doulongvec_jiffies function, change the timeout variables to unsigned int. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-10 12:54:35 -08:00
Patrick McHardy	bb94aa169e	[NETFILTER]: net/ipv[46]/netfilter.c cleanups Don't wrap entire file in #ifdef CONFIG_NETFILTER, remove a few unneccessary includes. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-10 12:54:29 -08:00
Kris Katterjohn	d3f4a687f6	[NET]: Change memcmp(,,ETH_ALEN) to compare_ether_addr() This changes some memcmp(one,two,ETH_ALEN) to compare_ether_addr(one,two). Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-10 12:54:28 -08:00
Patrick McHardy	a2c2064f7f	[IPV6]: Set skb->priority in ip6_output.c Set skb->priority = sk->sk_priority as in raw.c and IPv4. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-09 14:16:31 -08:00
Patrick McHardy	2941a48631	[NET]: Convert net/{ipv4,ipv6,sched} to netdev_priv Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-09 14:16:03 -08:00
Pekka Enberg	f9f7500521	[PATCH] slab: remove unused align parameter from alloc_percpu __alloc_percpu and alloc_percpu both take an 'align' argument which is completely ignored. snmp6_mib_init() in net/ipv6/af_inet6.c attempts to use it, but it will be ignored. Therefore, remove the 'align' argument and fixup the lone caller. Signed-off-by: Matthew Dobson <colpatch@us.ibm.com> Acked-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:12:39 -08:00
Adrian Bunk	9f5336e218	[IPV6]: small cleanups This patch contains the following cleanups: - addrconf.c: make addrconf_dad_stop() static - inet6_connection_sock.c should #include <net/inet6_connection_sock.h> for getting the prototypes of it's global functions Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 13:24:25 -08:00
Patrick McHardy	e16a8f0b8c	[NETFILTER]: Add ipt_policy/ip6t_policy matches Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:38 -08:00
Patrick McHardy	3e3850e989	[NETFILTER]: Fix xfrm lookup in ip_route_me_harder/ip6_route_me_harder ip_route_me_harder doesn't use the port numbers of the xfrm lookup and uses ip_route_input for non-local addresses which doesn't do a xfrm lookup, ip6_route_me_harder doesn't do a xfrm lookup at all. Use xfrm_decode_session and do the lookup manually, make sure both only do the lookup if the packet hasn't been transformed already. Makeing sure the lookup only happens once needs a new field in the IP6CB, which exceeds the size of skb->cb. The size of skb->cb is increased to 48b. Apparently the IPv6 mobile extensions need some more room anyway. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:33 -08:00
Patrick McHardy	8cdfab8a43	[IPV4]: reset IPCB flags when neccessary Reset IPSKB_XFRM_TUNNEL_SIZE flags in ipip and ip_gre hard_start_xmit function before the packet reenters IP. This is neccessary so the encapsulated packets are checked not to be oversized in xfrm4_output.c again. Reset all flags in sit when a packet changes its address family. Also remove some obsolete IPSKB flags. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:32 -08:00
Patrick McHardy	b05e106698	[IPV4/6]: Netfilter IPsec input hooks When the innermost transform uses transport mode the decapsulated packet is not visible to netfilter. Pass the packet through the PRE_ROUTING and LOCAL_IN hooks again before handing it to upper layer protocols to make netfilter-visibility symetrical to the output path. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:31 -08:00
Patrick McHardy	951dbc8ac7	[IPV6]: Move nextheader offset to the IP6CB Move nextheader offset to the IP6CB to make it possible to pass a packet to ip6_input_finish multiple times and have it skip already parsed headers. As a nice side effect this gets rid of the manual hopopts skipping in ip6_input_finish. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:29 -08:00
Patrick McHardy	16a6677fdf	[XFRM]: Netfilter IPsec output hooks Call netfilter hooks before IPsec transforms. Packets visit the FORWARD/LOCAL_OUT and POST_ROUTING hook before the first encapsulation and the LOCAL_OUT and POST_ROUTING hook before each following tunnel mode transform. Patch from Herbert Xu <herbert@gondor.apana.org.au>: Move the loop from dst_output into xfrm4_output/xfrm6_output since they're the only ones who need to it. xfrm{4,6}_output_one() processes the first SA all subsequent transport mode SAs and is called in a loop that calls the netfilter hooks between each two calls. In order to avoid the tail call issue, I've added the inline function nf_hook which is nf_hook_slow plus the empty list check. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-07 12:57:28 -08:00
Kris Katterjohn	46f25dffba	[NET]: Change 1500 to ETH_DATA_LEN in some files These patches add the header linux/if_ether.h and change 1500 to ETH_DATA_LEN in some files. Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-05 16:48:56 -08:00
Patrick McHardy	22dea562bb	[NETFILTER]: Export ip6_masked_addrcmp, don't pass IPv6 addresses on stack Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-05 12:21:34 -08:00
Patrick McHardy	b777e0ce74	[NETFILTER]: make ipv6_find_hdr() find transport protocol header The original ipv6_find_hdr() finds the specified header in IPv6 packets. This makes it possible to get transport header so that we can kill similar loop in ip6_match_packet(). Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-05 12:21:16 -08:00
Pablo Neira Ayuso	c1d10adb4a	[NETFILTER]: Add ctnetlink port for nf_conntrack Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-05 12:19:05 -08:00
YOSHIFUJI Hideaki	181a46a56e	[NETFILTER]: Use macro for spinlock_t/rwlock_t initializations/definition. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-04 13:56:54 -08:00
YOSHIFUJI Hideaki	196433c5b7	[IPV6]: Use macro for rwlock_t initialization. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-04 13:56:31 -08:00
Christoph Hellwig	b5e5fa5e09	[NET]: Add a dev_ioctl() fallback to sock_ioctl() Currently all network protocols need to call dev_ioctl as the default fallback in their ioctl implementations. This patch adds a fallback to dev_ioctl to sock_ioctl if the protocol returned -ENOIOCTLCMD. This way all the procotol ioctl handlers can be simplified and we don't need to export dev_ioctl. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 14:18:33 -08:00
Arnaldo Carvalho de Melo	14c850212e	[INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.h To help in reducing the number of include dependencies, several files were touched as they were getting needed headers indirectly for stuff they use. Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c had linux/dccp.h include twice. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:11:21 -08:00
Eric Dumazet	90ddc4f047	[NET]: move struct proto_ops to const I noticed that some of 'struct proto_ops' used in the kernel may share a cache line used by locks or other heavily modified data. (default linker alignement is 32 bytes, and L1_CACHE_LINE is 64 or 128 at least) This patch makes sure a 'struct proto_ops' can be declared as const, so that all cpus can share all parts of it without false sharing. This is not mandatory : a driver can still use a read/write structure if it needs to (and eventually a __read_mostly) I made a global stubstitute to change all existing occurences to make them const. This should reduce the possibility of false sharing on SMP, and speedup some socket system calls. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:11:15 -08:00
Arnaldo Carvalho de Melo	d83d8461f9	[IP_SOCKGLUE]: Remove most of the tcp specific calls As DCCP needs to be called in the same spots. Now we have a member in inet_sock (is_icsk), set at sock creation time from struct inet_protosw->flags (if INET_PROTOSW_ICSK is set, like for TCP and DCCP) to see if a struct sock instance is a inet_connection_sock for places like the ones in ip_sockglue.c (v4 and v6) where we previously were looking if sk_type was SOCK_STREAM, that is insufficient because we now use the same code for DCCP, that has sk_type SOCK_DCCP. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:58 -08:00
Arnaldo Carvalho de Melo	d8313f5ca2	[INET6]: Generalise tcp_v6_hash_connect Renaming it to inet6_hash_connect, making it possible to ditch dccp_v6_hash_connect and share the same code with TCP instead. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:56 -08:00
Arnaldo Carvalho de Melo	6d6ee43e0b	[TWSK]: Introduce struct timewait_sock_ops So that we can share several timewait sockets related functions and make the timewait mini sockets infrastructure closer to the request mini sockets one. Next changesets will take advantage of this, moving more code out of TCP and DCCP v4 and v6 to common infrastructure. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:54 -08:00
Arnaldo Carvalho de Melo	399c07def6	[IPV6]: Export ipv6_opt_accepted It was already non-TCP specific, will be used by DCCPv6. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:51 -08:00
Arnaldo Carvalho de Melo	3cf3dc6c2e	[IPV6]: Export some symbols for DCCPv6 Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:48 -08:00
Arnaldo Carvalho de Melo	0fa1a53e1f	[IPV6]: Introduce inet6_timewait_sock Out of tcp6_timewait_sock, that now is just an aggregation of inet_timewait_sock and inet6_timewait_sock, using tw_ipv6_offset in struct inet_timewait_sock, that is common to the IPv6 transport protocols that use timewait sockets, like DCCP and TCP. tw_ipv6_offset plays the struct inet_sock pinfo6 role, i.e. for the generic code to find the IPv6 area in a timewait sock. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:47 -08:00
Arnaldo Carvalho de Melo	b9750ce13c	[IPV6]: Generalise some functions Using sk->sk_protocol instead of IPPROTO_TCP. Will be used by DCCPv6 in the next changesets. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:46 -08:00
Herbert Xu	3305b80c21	[IP]: Simplify and consolidate MSG_PEEK error handling When a packet is obtained from skb_recv_datagram with MSG_PEEK enabled it is left on the socket receive queue. This means that when we detect a checksum error we have to be careful when trying to free the packet as someone could have dequeued it in the time being. Currently this delicate logic is duplicated three times between UDPv4, UDPv6 and RAWv6. This patch moves them into a one place and simplifies the code somewhat. This is based on a suggestion by Eric Dumazet. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:41 -08:00
Arnaldo Carvalho de Melo	8292a17a39	[ICSK]: Rename struct tcp_func to struct inet_connection_sock_af_ops And move it to struct inet_connection_sock. DCCP will use it in the upcoming changesets. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:38 -08:00
Arnaldo Carvalho de Melo	ca304b6104	[IPV6]: Introduce inet6_rsk() And inet6_rsk_offset in inet_request_sock, for the same reasons as inet_sock's pinfo6 member. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:37 -08:00
Arnaldo Carvalho de Melo	8129765ac0	[IPV6]: Generalise tcp_v6_search_req & tcp_v6_synq_add More work is needed tho to introduce inet6_request_sock from tcp6_request_sock, in the same layout considerations as ipv6_pinfo in inet_sock, next changeset will do that. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:36 -08:00
Arnaldo Carvalho de Melo	90b19d3169	[IPV6]: Generalise __tcp_v6_hash, renaming it to __inet6_hash Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:33 -08:00
Arnaldo Carvalho de Melo	971af18bbf	[IPV6]: Reuse inet_csk_get_port in tcp_v6_get_port Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:33 -08:00
Eric Dumazet	3183606469	[NETFILTER] ip_tables: NUMA-aware allocation Part of a performance problem with ip_tables is that memory allocation is not NUMA aware, but 'only' SMP aware (ie each CPU normally touch separate cache lines) Even with small iptables rules, the cost of this misplacement can be high on common workloads. Instead of using one vmalloc() area (located in the node of the iptables process), we now allocate an area for each possible CPU, using vmalloc_node() so that memory should be allocated in the CPU's node if possible. Port to arp_tables and ip6_tables by Harald Welte. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:10:29 -08:00
David L Stevens	5ab4a6c81e	[IPV6] mcast: Fix multiple issues in MLDv2 reports. The below "jumbo" patch fixes the following problems in MLDv2. 1) Add necessary "ntohs" to recent "pskb_may_pull" check [breaks all nonzero source queries on little-endian (!)] 2) Add locking to source filter list [resend of prior patch] 3) fix "mld_marksources()" to a) send nothing when all queried sources are excluded b) send full exclude report when source queried sources are not excluded c) don't schedule a timer when there's nothing to report NOTE: RFC 3810 specifies the source list should be saved and each source reported individually as an IS_IN. This is an obvious DOS path, requiring the host to store and then multicast as many sources as are queried (e.g., millions...). This alternative sends a full, relevant report that's limited to number of sources present on the machine. 4) fix "add_grec()" to send empty-source records when it should The original check doesn't account for a non-empty source list with all sources inactive; the new code keeps that short-circuit case, and also generates the group header with an empty list if needed. 5) fix mca_crcount decrement to be after add_grec(), which needs its original value These issues (other than item #1 ;-) ) were all found by Yan Zheng, much thanks! Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-27 14:03:00 -08:00
YOSHIFUJI Hideaki	6732badee0	[IPV6]: Fix addrconf dead lock. We need to release idev->lcok before we call addrconf_dad_stop(). It calls ipv6_addr_del(), which will hold idev->lock. Bug spotted by Yasuyuki KOZAKAI <yasuyuki.kozakai@toshiba.co.jp>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-27 13:35:15 -08:00
David L Stevens	6f4353d891	[IPV6]: Increase default MLD_MAX_MSF to 64. The existing default of 10 is just way too low. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-26 17:03:46 -08:00
Hiroyuki YAMAMORI	291d809ba5	[IPV6]: Fix Temporary Address Generation From: Hiroyuki YAMAMORI <h-yamamo@db3.so-net.ne.jp> Since regen_count is stored in the public address, we need to reset it when we start renewing temporary address. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-23 11:24:05 -08:00
YOSHIFUJI Hideaki	3dd3bf8357	[IPV6]: Fix dead lock. We need to relesae ifp->lock before we call addrconf_dad_stop(), which will hold ifp->lock. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-23 11:23:21 -08:00
David S. Miller	e6469297d4	Merge git://git.skbuff.net/gitroot/yoshfuji/linux-2.6.14+git+ipv6-fix-20051221a	2005-12-22 07:41:27 -08:00
Kristian Slavov	1d1428045c	[IPV6]: Fix address deletion If you add more than one IPv6 address belonging to the same prefix and delete the address that was last added, routing table entry for that prefix is also deleted. Tested on 2.6.14.4 To reproduce: ip addr add 3ffe::1/64 dev eth0 ip addr add 3ffe::2/64 dev eth0 /* wait DAD */ sleep 1 ip addr del 3ffe::2/64 dev eth0 ip -6 route (route to 3ffe::/64 should be gone) In ipv6_del_addr(), if ifa == ifp, we set ifa->if_next to NULL, and later assign ifap = &ifa->if_next, effectively terminating the for-loop. This prevents us from checking if there are other addresses using the same prefix that are valid, and thus resulting in deletion of the prefix. This applies only if the first entry in idev->addr_list is the address to be deleted. Signed-off-by: Kristian Slavov <kristian.slavov@nomadiclab.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-21 18:47:24 -08:00
YOSHIFUJI Hideaki	6b3ae80a63	[IPV6]: Don't select a tentative address as a source address. A tentative address is not considered "assigned to an interface" in the traditional sense (RFC2462 Section 4). Don't try to select such an address for the source address. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-12-21 22:58:01 +09:00
YOSHIFUJI Hideaki	c5e33bddd3	[IPV6]: Run DAD when the link becomes ready. If the link was not available when the interface was created, run DAD for pending tentative addresses when the link becomes ready. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-12-21 22:57:44 +09:00
YOSHIFUJI Hideaki	3c21edbd11	[IPV6]: Defer IPv6 device initialization until the link becomes ready. NETDEV_UP might be sent even if the link attached to the interface was not ready. DAD does not make sense in such case, so we won't do so. After interface Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-12-21 22:57:24 +09:00
YOSHIFUJI Hideaki	8de3351e6e	[IPV6]: Try not to send icmp to anycast address. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-12-21 22:57:06 +09:00
YOSHIFUJI Hideaki	58c4fb86ea	[IPV6]: Flag RTF_ANYCAST for anycast routes. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-12-21 22:56:42 +09:00
Patrick McHardy	9e999993c7	[XFRM]: Handle DCCP in xfrm{4,6}_decode_session Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-19 14:03:46 -08:00
YOSHIFUJI Hideaki	3dd4bc68fa	[IPV6]: Fix route lifetime. The route expiration time is stored in rt6i_expires in jiffies. The argument of rt6_route_add() for adding a route is not the expiration time in jiffies nor in clock_t, but the lifetime (or time left before expiration) in clock_t. Because of the confusion, we sometimes saw several strange errors (FAILs) in TAHI IPv6 Ready Logo Phase-2 Self Test. The symptoms were analyzed by Mitsuru Chinen <CHINEN@jp.ibm.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-19 14:02:45 -08:00
Patrick McHardy	31cb5bd4dc	[NETFILTER]: Fix incorrect dependency for IP6_NF_TARGET_NFQUEUE IP6_NF_TARGET_NFQUEUE depends on IP6_NF_IPTABLES, not IP_NF_IPTABLES. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-19 13:53:26 -08:00
David S. Miller	a1493d9cd1	[IPV6] addrconf: Do not print device pointer in privacy log message. Noticed by Andi Kleen, it is pointless to emit the device structure pointer in the kernel logs like this. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-13 22:59:36 -08:00
Arnaldo Carvalho de Melo	ecc51b6d5c	[TCPv6]: Fix skb leak Spotted by Francois Romieu, thanks! Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-12 14:38:10 -08:00
Kazunori MIYAZAWA	73d4f84fd0	[IPv6] IPsec: fix pmtu calculation of esp It is a simple bug which uses the wrong member. This bug does not seriously affect ordinary use of IPsec. But it is important to pass IPv6 ready logo phase-2 conformance test of IPsec SGW. Signed-off-by: Kazunori MIYAZAWA <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-08 23:11:42 -08:00
Yasuyuki Kozakai	f16c910724	[NETFILTER]: nf_conntrack: Fix missing check for ICMPv6 type This makes nf_conntrack_icmpv6 check that ICMPv6 type isn't < 128 to avoid accessing out of array valid_new[] and invmap[]. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-05 13:32:50 -08:00
YOSHIFUJI Hideaki	af1afe8662	[IPV6]: Load protocol module dynamically. [ Modified to match inet_create() bug fix by Herbert Xu -DaveM ] Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-02 20:56:57 -08:00
David Stevens	24c6927505	[IGMP]: workaround for IGMP v1/v2 bug From: David Stevens <dlstevens@us.ibm.com> As explained at: http://www.cs.ucsb.edu/~krishna/igmp_dos/ With IGMP version 1 and 2 it is possible to inject a unicast report to a client which will make it ignore multicast reports sent later by the router. The fix is to only accept the report if is was sent to a multicast or unicast address. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-12-02 20:32:59 -08:00
Adrian Bunk	34a0b3cdc0	[IPV6]: make two functions static This patch makes two needlessly global functions static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-29 16:28:56 -08:00
Arjan van de Ven	9b5b5cff9a	[NET]: Add const markers to various variables. the patch below marks various variables const in net/; the goal is to move them to the .rodata section so that they can't false-share cachelines with things that get written to, as well as potentially helping gcc a bit with optimisations. (these were found using a gcc patch to warn about such variables) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-29 16:21:38 -08:00
YOSHIFUJI Hideaki	220bbd7483	[IPV6]: Implement appropriate dummy rule 4 in ipv6_dev_get_saddr(). Ensure to update hiscore.rule in dummy rule 4 in ipv6_dev_get_saddr(). Pointed out by Yan Zheng <yanzheng@21cn.com>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-28 22:27:11 -08:00
David S. Miller	1ef43204f4	Merge git://git.skbuff.net/gitroot/yoshfuji/linux-2.6.14+advapi-fix/	2005-11-20 20:52:16 -08:00
Yan Zheng	5d5780df23	[IPV6]: Acquire addrconf_hash_lock for read in addrconf_verify(...) addrconf_verify(...) only traverse address hash table when addrconf_hash_lock is held for writing, and it may hold addrconf_hash_lock for a long time. So I think it's better to acquire addrconf_hash_lock for reading instead of writing Signed-off-by: Yan Zheng <yanzheng@21cn.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-20 13:42:20 -08:00
YOSHIFUJI Hideaki	df9890c31a	[IPV6]: Fix sending extension headers before and including routing header. Based on suggestion from Masahide Nakamura <nakam@linux-ipv6.org>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-11-20 12:23:18 +09:00
Ville Nuorvala	a305989386	[IPV6]: Fix calculation of AH length during filling ancillary data. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-11-20 12:21:59 +09:00
YOSHIFUJI Hideaki	8b8aa4b5a6	[IPV6]: Fix memory management error during setting up new advapi sockopts. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-11-20 12:18:17 +09:00
David S. Miller	9e147a1cfc	[IPV6]: Fib dump really needs GFP_ATOMIC. Revert: `8225ccbaf0` Based upon a report by Yan Zheng. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-17 16:52:51 -08:00
Yasuyuki Kozakai	e7c8a41e81	[IPV4,IPV6]: replace handmade list with hlist in IPv{4,6} reassembly Both of ipq and frag_queue have next and *prev, and they can be replaced with hlist. Thanks Arnaldo Carvalho de Melo for the suggestion. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-16 12:55:37 -08:00
Luiz Capitulino	cb422c464b	[IPV6]: Fixes sparse warning in ipv6/ipv6_sockglue.c The patch below fixes the following sparse warning: net/ipv6/ipv6_sockglue.c:291:13: warning: Using plain integer as NULL pointer Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-14 21:43:36 -08:00
Yan Zheng	12da2a435c	[IPV6]: small fix for ipv6_dev_get_saddr(...) The "score.rule++" doesn't make any sense for me. According to codes above, I think it should be "hiscore.rule++;" . Signed-off-by: Yan Zheng<yanzheng@21cn.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-14 21:42:46 -08:00
Yasuyuki Kozakai	302fe1758d	[NETFILTER] fix leak of fragment queue at unloading nf_conntrack_ipv6 This patch makes nf_conntrack_ipv6 free all IPv6 fragment queues at module unloading time. Also introduce a BUG_ON if we ever again have leaks in the memory accounting. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-14 15:28:45 -08:00
Yasuyuki Kozakai	1ba430bc3e	[NETFILTER] nf_conntrack: fix possibility of infinite loop while evicting nf_ct_frag6_queue This synchronizes nf_ct_reasm with ipv6 reassembly, and fixes a possibility of an infinite loop if CPUs evict and create nf_ct_frag6_queue in parallel. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-14 15:28:18 -08:00
Yasuyuki Kozakai	7686a02c0e	[NETFILTER]: fix type of sysctl variables in nf_conntrack_ipv6 These variables should be unsigned. This fixes sysctl handler for nf_ct_frag6_{low,high}_thresh. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-14 15:27:43 -08:00
Yasuyuki Kozakai	9bdf87d90b	[NETFILTER]: cleanup IPv6 Netfilter Kconfig This removes linux 2.4 configs in comments as TODO lists. And this also move the entry of nf_conntrack to top like IPv4 Netfilter Kconfig. Based on original patch by Krzysztof Piotr Oledzki <ole@ans.pl>. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-14 15:26:58 -08:00
Thomas Graf	8225ccbaf0	[IPV6]: Fix unnecessary GFP_ATOMIC allocation in fib6 dump Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-12 12:15:16 -08:00
Herbert Xu	efacfbcb6c	[IPV6]: Fix rtnetlink dump infinite loop The recent change to netlink dump "done" callback handling broke IPv6 which played dirty tricks with the "done" callback. This causes an infinite loop during a dump. The following patch fixes it. This bug was reported by Jeff Garzik. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-12 12:12:05 -08:00
David S. Miller	8eb5591052	[IPV6]: Fix inet6_init missing unregister. Based mostly upon a patch from Olaf Kirch <okir@suse.de> When initialization fails in inet6_init(), we should unregister the PF_INET6 socket ops. Also, check sock_register()'s return value for errors. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-11 15:05:47 -08:00
Herbert Xu	fb286bb299	[NET]: Detect hardware rx checksum faults correctly Here is the patch that introduces the generic skb_checksum_complete which also checks for hardware RX checksum faults. If that happens, it'll call netdev_rx_csum_fault which currently prints out a stack trace with the device name. In future it can turn off RX checksum. I've converted every spot under net/ that does RX checksum checks to use skb_checksum_complete or __skb_checksum_complete with the exceptions of: * Those places where checksums are done bit by bit. These will call netdev_rx_csum_fault directly. * The following have not been completely checked/converted: ipmr ip_vs netfilter dccp This patch is based on patches and suggestions from Stephen Hemminger and David S. Miller. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-10 13:01:24 -08:00
Thomas Graf	a8f74b2288	[NETLINK]: Make netlink_callback->done() optional Most netlink families make no use of the done() callback, making it optional gets rid of all unnecessary dummy implementations. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-10 02:26:40 +01:00
Yasuyuki Kozakai	9fb9cbb108	[NETFILTER]: Add nf_conntrack subsystem. The existing connection tracking subsystem in netfilter can only handle ipv4. There were basically two choices present to add connection tracking support for ipv6. We could either duplicate all of the ipv4 connection tracking code into an ipv6 counterpart, or (the choice taken by these patches) we could design a generic layer that could handle both ipv4 and ipv6 and thus requiring only one sub-protocol (TCP, UDP, etc.) connection tracking helper module to be written. In fact nf_conntrack is capable of working with any layer 3 protocol. The existing ipv4 specific conntrack code could also not deal with the pecularities of doing connection tracking on ipv6, which is also cured here. For example, these issues include: 1) ICMPv6 handling, which is used for neighbour discovery in ipv6 thus some messages such as these should not participate in connection tracking since effectively they are like ARP messages 2) fragmentation must be handled differently in ipv6, because the simplistic "defrag, connection track and NAT, refrag" (which the existing ipv4 connection tracking does) approach simply isn't feasible in ipv6 3) ipv6 extension header parsing must occur at the correct spots before and after connection tracking decisions, and there were no provisions for this in the existing connection tracking design 4) ipv6 has no need for stateful NAT The ipv4 specific conntrack layer is kept around, until all of the ipv4 specific conntrack helpers are ported over to nf_conntrack and it is feature complete. Once that occurs, the old conntrack stuff will get placed into the feature-removal-schedule and we will fully kill it off 6 months later. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-11-09 16:38:16 -08:00
Ken-ichirou MATSUZAWA	9f0ede52a0	[IPV6]: ip6ip6_lock is not unlocked in error path. From: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-09 13:08:29 -08:00
Peter Chubb	44fd0261d3	[IPV6]: Fix fallout from CONFIG_IPV6_PRIVACY Trying to build today's 2.6.14+git snapshot gives undefined references to use_tempaddr Looks like an ifdef got left out. Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-09 13:05:47 -08:00
Jesper Juhl	a51482bde2	[NET]: kfree cleanup From: Jesper Juhl <jesper.juhl@gmail.com> This is the net/ part of the big kfree cleanup patch. Remove pointless checks for NULL prior to calling kfree() in net/. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Arnaldo Carvalho de Melo <acme@conectiva.com.br> Acked-by: Marcel Holtmann <marcel@holtmann.org> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Andrew Morton <akpm@osdl.org>	2005-11-08 09:41:34 -08:00
YOSHIFUJI Hideaki	072047e4de	[IPV6]: RFC3484 compliant source address selection Choose more appropriate source address; e.g. - outgoing interface - non-deprecated - scope - matching label Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-08 09:38:30 -08:00
YOSHIFUJI Hideaki	b1cacb6820	[IPV6]: Make ipv6_addr_type() more generic so that we can use it for source address selection. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-08 09:38:12 -08:00
YOSHIFUJI Hideaki	971f359ddc	[IPV6]: Put addr_diff() into common header for future use. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-11-08 09:37:56 -08:00
Stephen Hemminger	6df716340d	[TCP/DCCP]: Randomize port selection This patch randomizes the port selected on bind() for connections to help with possible security attacks. It should also be faster in most cases because there is no need for a global lock. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-11-05 21:23:15 -02:00
Yan Zheng	979ad66312	[IPV6]: inet6_ifinfo_notify should use RTM_DELLINK in addrconf_ifdown Signed-off-by: Yan Zheng <yanzheng@21cn.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-11-03 01:03:05 -02:00
Yan Zheng	8713dbf057	[MCAST]: ip[6]_mc_add_src should be called when number of sources is zero And filter mode is exclude. Further explanation by David Stevens: Multicast source filters aren't widely used yet, and that's really the only feature that's affected if an application actually exercises this bug, as far as I can tell. An ordinary filter-less multicast join should still work, and only forwarded multicast traffic making use of filters and doing empty-source filters with the MSFILTER ioctl would be at risk of not getting multicast traffic forwarded to them because the reports generated would not be based on the correct counts. Signed-off-by: Yan Zheng <yanzheng@21cn.com Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-11-02 21:03:57 -02:00
Yan Zheng	97300b5fdf	[MCAST] IPv6: Check packet size when process Multicast Signed-off-by: Yan Zheng <yanzheng@21cn.com Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-31 22:52:03 -02:00
Yan Zheng	9d17f21893	[IPV6]: Fix behavior of ip6_route_input() for link local address I find that linux will reply echo request destined to an address which belongs to an interface other than the one from which the request received. This behavior doesn't make sense for link local address. YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> said: Please note that sender does need to setup neighbor entry by hand to reproduce this bug. (Link-local address on eth1 is not visible on eth0, from the point of view of neighbor discovery in IPv6.) +--------+ +--------+ \| sender \| \| router \| +---+----+ +-+----+-+ \|eth0 eth0\| \|eth1 -----+----------------------+- -+-------------- Signed-off-by: Yan Zheng <yanzheng@21cn.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Andrew Morton <akpm@osdl.org> (forwarded) Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-31 16:54:05 -02:00
Harald Welte	6b7d31fcdd	[NETFILTER]: Add "revision" support to arp_tables and ip6_tables Like ip_tables already has it for some time, this adds support for having multiple revisions for each match/target. We steal one byte from the name in order to accomodate a 8 bit version number. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-31 16:36:08 -02:00
David Hardeman	378f058cc4	[PATCH] Use sg_set_buf/sg_init_one where applicable This patch uses sg_set_buf/sg_init_one in some places where it was duplicated. Signed-off-by: David Hardeman <david@2gen.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Greg KH <greg@kroah.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2005-10-30 11:19:43 +11:00
Yan Zheng	f12baeab9d	[MCAST] IPv6: Fix algorithm to compute Querier's Query Interval 5.1.3. Maximum Response Code The Maximum Response Code field specifies the maximum time allowed before sending a responding Report. The actual time allowed, called the Maximum Response Delay, is represented in units of milliseconds, and is derived from the Maximum Response Code as follows: If Maximum Response Code < 32768, Maximum Response Delay = Maximum Response Code If Maximum Response Code >=32768, Maximum Response Code represents a floating-point value as follows: 0 1 2 3 4 5 6 7 8 9 A B C D E F +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ \|1\| exp \| mant \| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Maximum Response Delay = (mant \| 0x1000) << (exp+3) 5.1.9. QQIC (Querier's Query Interval Code) The Querier's Query Interval Code field specifies the [Query Interval] used by the Querier. The actual interval, called the Querier's Query Interval (QQI), is represented in units of seconds, and is derived from the Querier's Query Interval Code as follows: If QQIC < 128, QQI = QQIC If QQIC >= 128, QQIC represents a floating-point value as follows: 0 1 2 3 4 5 6 7 +-+-+-+-+-+-+-+-+ \|1\| exp \| mant \| +-+-+-+-+-+-+-+-+ QQI = (mant \| 0x10) << (exp + 3) -- rfc3810 #define MLDV2_QQIC(value) MLDV2_EXP(0x80, 4, 3, value) #define MLDV2_MRC(value) MLDV2_EXP(0x8000, 12, 3, value) Above macro are defined in mcast.c. but 1 << 4 == 0x10 and 1 << 12 == 0x1000. So the result computed by original Macro is larger. Signed-off-by: Yan Zheng <yanzheng@21cn.com> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-28 16:35:18 -02:00
Ananda Raju	e89e9cf539	[IPv4/IPv6]: UFO Scatter-gather approach Attached is kernel patch for UDP Fragmentation Offload (UFO) feature. 1. This patch incorporate the review comments by Jeff Garzik. 2. Renamed USO as UFO (UDP Fragmentation Offload) 3. udp sendfile support with UFO This patches uses scatter-gather feature of skb to generate large UDP datagram. Below is a "how-to" on changes required in network device driver to use the UFO interface. UDP Fragmentation Offload (UFO) Interface: ------------------------------------------- UFO is a feature wherein the Linux kernel network stack will offload the IP fragmentation functionality of large UDP datagram to hardware. This will reduce the overhead of stack in fragmenting the large UDP datagram to MTU sized packets 1) Drivers indicate their capability of UFO using dev->features \|= NETIF_F_UFO \| NETIF_F_HW_CSUM \| NETIF_F_SG NETIF_F_HW_CSUM is required for UFO over ipv6. 2) UFO packet will be submitted for transmission using driver xmit routine. UFO packet will have a non-zero value for "skb_shinfo(skb)->ufo_size" skb_shinfo(skb)->ufo_size will indicate the length of data part in each IP fragment going out of the adapter after IP fragmentation by hardware. skb->data will contain MAC/IP/UDP header and skb_shinfo(skb)->frags[] contains the data payload. The skb->ip_summed will be set to CHECKSUM_HW indicating that hardware has to do checksum calculation. Hardware should compute the UDP checksum of complete datagram and also ip header checksum of each fragmented IP packet. For IPV6 the UFO provides the fragment identification-id in skb_shinfo(skb)->ip6_frag_id. The adapter should use this ID for generating IPv6 fragments. Signed-off-by: Ananda Raju <ananda.raju@neterion.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (forwarded) Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-28 16:30:00 -02:00
John Hawkes	670c02c2bf	[NET]: Wider use of for_each_*cpu() In 'net' change the explicit use of for-loops and NR_CPUS into the general for_each_cpu() or for_each_online_cpu() constructs, as appropriate. This widens the scope of potential future optimizations of the general constructs, as well as takes advantage of the existing optimizations of first_cpu() and next_cpu(), which is advantageous when the true CPU count is much smaller than NR_CPUS. Signed-off-by: John Hawkes <hawkes@sgi.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-25 23:54:01 -02:00
Yan Zheng	4ea6a8046b	[IPV6]: Fix refcnt of struct ip6_flowlabel Signed-off-by: Yan Zheng <yanzheng@21cn.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>	2005-10-25 21:17:52 -02:00
Andrew Morton	e6850cce8f	[NETFILTER]: Fix ip6_table.c build with NETFILTER_DEBUG enabled. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-15 16:15:38 -07:00
David S. Miller	c8923c6b85	[NETFILTER]: Fix OOPSes on machines with discontiguous cpu numbering. Original patch by Harald Welte, with feedback from Herbert Xu and testing by S�bastien Bernard. EBTABLES, ARP tables, and IP/IP6 tables all assume that cpus are numbered linearly. That is not necessarily true. This patch fixes that up by calculating the largest possible cpu number, and allocating enough per-cpu structure space given that. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-13 14:41:23 -07:00
Herbert Xu	d4875b049b	[IPSEC] Fix block size/MTU bugs in ESP This patch fixes the following bugs in ESP: * Fix transport mode MTU overestimate. This means that the inner MTU is smaller than it needs be. Worse yet, given an input MTU which is a multiple of 4 it will always produce an estimate which is not a multiple of 4. For example, given a standard ESP/3DES/MD5 transform and an MTU of 1500, the resulting MTU for transport mode is 1462 when it should be 1464. The reason for this is because IP header lengths are always a multiple of 4 for IPv4 and 8 for IPv6. * Ensure that the block size is at least 4. This is required by RFC2406 and corresponds to what the esp_output function does. At the moment this only affects crypto_null as its block size is 1. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-10 21:11:34 -07:00
Herbert Xu	a02a64223e	[IPSEC]: Use ALIGN macro in ESP This patch uses the macro ALIGN in all the applicable spots for ESP. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-10 21:11:08 -07:00
YOSHIFUJI Hideaki	140e26fcd5	[IPV6]: Fix NS handing for proxy/anycast address Timer set up by pneigh_enqueue() ended up calling ndisc_rcv() via pndisc_redo(), which clears LOCALLY_ENQUEUED flag in NEIGH_CB(skb) and NS was queued again. Let's call ndisc_recv_ns() directly to avoid the loop. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-05 12:11:41 -07:00
Yan Zheng	fab10fe37a	[MCAST] ipv6: Fix address size in grec_size Signed-Off-By: Yan Zheng <yanzheng@21cn.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-05 12:08:13 -07:00
YOSHIFUJI Hideaki	87bf9c97b4	[IPV6]: Fix infinite loop in udp_v6_get_port(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-04 13:00:39 -07:00
Herbert Xu	e5ed639913	[IPV4]: Replace __in_dev_get with __in_dev_get_rcu/rtnl The following patch renames __in_dev_get() to __in_dev_get_rtnl() and introduces __in_dev_get_rcu() to cover the second case. 1) RCU with refcnt should use in_dev_get(). 2) RCU without refcnt should use __in_dev_get_rcu(). 3) All others must hold RTNL and use __in_dev_get_rtnl(). There is one exception in net/ipv4/route.c which is in fact a pre-existing race condition. I've marked it as such so that we remember to fix it. This patch is based on suggestions and prior work by Suzanne Wood and Paul McKenney. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-03 14:35:55 -07:00
David S. Miller	a5e7c210fe	[IPV6]: Fix leak added by udp connect dst caching fix. Based upon a patch from Mitsuru KANDA <mk@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-03 14:21:58 -07:00
Yan Zheng	f36d6ab182	[IPV6]: Fix ipv6 fragment ID selection at slow path Signed-Off-By: Yan Zheng <yanzheng@21cn.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-03 14:19:15 -07:00
Eric Dumazet	81c3d5470e	[INET]: speedup inet (tcp/dccp) lookups Arnaldo and I agreed it could be applied now, because I have other pending patches depending on this one (Thank you Arnaldo) (The other important patch moves skc_refcnt in a separate cache line, so that the SMP/NUMA performance doesnt suffer from cache line ping pongs) 1) First some performance data : -------------------------------- tcp_v4_rcv() wastes a lot of time in __inet_lookup_established() The most time critical code is : sk_for_each(sk, node, &head->chain) { if (INET_MATCH(sk, acookie, saddr, daddr, ports, dif)) goto hit; /* You sunk my battleship! / } The sk_for_each() does use prefetch() hints but only the begining of "struct sock" is prefetched. As INET_MATCH first comparison uses inet_sk(__sk)->daddr, wich is far away from the begining of "struct sock", it has to bring into CPU cache cold cache line. Each iteration has to use at least 2 cache lines. This can be problematic if some chains are very long. 2) The goal ----------- The idea I had is to change things so that INET_MATCH() may return FALSE in 99% of cases only using the data already in the CPU cache, using one cache line per iteration. 3) Description of the patch --------------------------- Adds a new 'unsigned int skc_hash' field in 'struct sock_common', filling a 32 bits hole on 64 bits platform. struct sock_common { unsigned short skc_family; volatile unsigned char skc_state; unsigned char skc_reuse; int skc_bound_dev_if; struct hlist_node skc_node; struct hlist_node skc_bind_node; atomic_t skc_refcnt; + unsigned int skc_hash; struct proto skc_prot; }; Store in this 32 bits field the full hash, not masked by (ehash_size - 1) Using this full hash as the first comparison done in INET_MATCH permits us immediatly skip the element without touching a second cache line in case of a miss. Suppress the sk_hashent/tw_hashent fields since skc_hash (aliased to sk_hash and tw_hash) already contains the slot number if we mask with (ehash_size - 1) File include/net/inet_hashtables.h 64 bits platforms : #define INET_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\ (((__sk)->sk_hash == (__hash)) ((((__u64 )&(inet_sk(__sk)->daddr)))== (__cookie)) && \ ((((__u32 )&(inet_sk(__sk)->dport))) == (__ports)) && \ (!((__sk)->sk_bound_dev_if) \|\| ((__sk)->sk_bound_dev_if == (__dif)))) 32bits platforms: #define TCP_IPV4_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\ (((__sk)->sk_hash == (__hash)) && \ (inet_sk(__sk)->daddr == (__saddr)) && \ (inet_sk(__sk)->rcv_saddr == (__daddr)) && \ (!((__sk)->sk_bound_dev_if) \|\| ((__sk)->sk_bound_dev_if == (__dif)))) - Adds a prefetch(head->chain.first) in __inet_lookup_established()/__tcp_v4_check_established() and __inet6_lookup_established()/__tcp_v6_check_established() and __dccp_v4_check_established() to bring into cache the first element of the list, before the {read\|write}_lock(&head->lock); Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-03 14:13:38 -07:00
Herbert Xu	325ed82393	[NET]: Fix packet timestamping. I've found the problem in general. It affects any 64-bit architecture. The problem occurs when you change the system time. Suppose that when you boot your system clock is forward by a day. This gets recorded down in skb_tv_base. You then wind the clock back by a day. From that point onwards the offset will be negative which essentially overflows the 32-bit variables they're stored in. In fact, why don't we just store the real time stamp in those 32-bit variables? After all, we're not going to overflow for quite a while yet. When we do overflow, we'll need a better solution of course. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-10-03 13:57:23 -07:00
Herbert Xu	c62dba9011	[IPV6]: Fix [Bug 5306] Oops on IPv6 route lookup > Steps to reproduce: > 1. Boot Linux, do NOT setup any IPv6 routes > 2. ip route get 2001::1 (or any unroutable address) Well caught. We never set rt6i_idev on ip6_null_entry. This patch should make the problem go away. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-26 15:10:16 -07:00
Harald Welte	d67b24c40f	[NETFILTER]: Fix ip[6]t_NFQUEUE Kconfig dependency We have to introduce a separate Kconfig menu entry for the NFQUEUE targets. They cannot "just" depend on nfnetlink_queue, since nfnetlink_queue could be linked into the kernel, whereas iptables can be a module. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-24 16:52:03 -07:00
Linus Torvalds	875bd5ab01	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2005-09-19 18:46:11 -07:00
Mark J Cox	6d1cfe3f17	[PATCH] raw_sendmsg DoS on 2.6 Fix unchecked __get_user that could be tricked into generating a memory read on an arbitrary address. The result of the read is not returned directly but you may be able to divine some information about it, or use the read to cause a crash on some architectures by reading hardware state. CAN-2004-2492. Fix from Al Viro, ack from Dave Miller. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-19 18:45:42 -07:00
Yasuyuki Kozakai	e674d0f38d	[NETFILTER] ip6tables: remove duplicate code Some IPv6 matches have very similar loops to find IPv6 extension header and we can unify them. This patch introduces ipv6_find_hdr() to do it. I just checked that it can find the target headers in the packet which has dst,hbh,rt,frag,ah,esp headers. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-19 15:34:40 -07:00
Mitsuru KANDA	987905ded3	[IPV6]: Check connect(2) status for IPv6 UDP socket (Re: xfrm_lookup) I think we should cache the per-socket route(dst_entry) only when the IPv6 UDP socket is connect(2)'ed. (which is same as IPv4 UDP send behavior) Signed-off-by: Mitsuru KANDA <mk@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-18 00:30:08 -07:00
David L Stevens	40796c5e8f	[IPV6]: Fix per-socket multicast filtering in sk_reuse case per-socket multicast filters were not being applied to all sockets in the case of an exact-match bound address, due to an over-exuberant "return" in the look-up code. Fix below. IPv4 does not have this problem. Thanks to Hoerdt Mickael for reporting the bug. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-14 21:10:20 -07:00
Denis Lukianov	de9daad90e	[MCAST]: Fix MCAST_EXCLUDE line dupes This patch fixes line dupes at /ipv4/igmp.c and /ipv6/mcast.c in the 2.6 kernel, where MCAST_EXCLUDE is mistakenly used instead of MCAST_INCLUDE. Signed-off-by: Denis Lukianov <denis@voxelsoft.com> Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-14 20:53:42 -07:00
Brian Haley	e6df439b89	[IPV6]: Bring Type 0 routing header in-line with rfc3542. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-10 00:15:06 -07:00
Ingo Molnar	8d06afab73	[PATCH] timer initialization cleanup: DEFINE_TIMER Clean up timer initialization by introducing DEFINE_TIMER a'la DEFINE_SPINLOCK. Build and boot-tested on x86. A similar patch has been been in the -RT tree for some time. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 14:03:48 -07:00
Dipankar Sarma	b835996f62	[PATCH] files: lock-free fd look-up With the use of RCU in files structure, the look-up of files using fds can now be lock-free. The lookup is protected by rcu_read_lock()/rcu_read_unlock(). This patch changes the readers to use lock-free lookup. Signed-off-by: Maneesh Soni <maneesh@in.ibm.com> Signed-off-by: Ravikiran Thirumalai <kiran_th@gmail.com> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 13:57:55 -07:00
Patrick McHardy	e104411b82	[XFRM]: Always release dst_entry on error in xfrm_lookup Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-08 15:11:55 -07:00
Patrick McHardy	a57ebc90f1	[IPV6]: Don't redo xfrm_lookup for cached dst entries The xfrm lookup is already done when the dst entry is looked up first and stored in the cache. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-08 14:27:47 -07:00
David S. Miller	2e66fc4116	Merge git://git.skbuff.net/gitroot/yoshfuji/linux-2.6-git-rfc3542	2005-09-08 12:59:43 -07:00
Stephen Hemminger	42ca89c18b	[IPV6]: Need to use pskb_trim_rcsum(). Fix pskb_trim usage in ipv6. Only the udp one is really a bug, other places are just doing equivalent code. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-08 12:57:43 -07:00
YOSHIFUJI Hideaki	41a1f8ea4f	[IPV6]: Support IPV6_{RECV,}TCLASS socket options / ancillary data. Based on patch from David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-09-08 10:19:03 +09:00
YOSHIFUJI Hideaki	333fad5364	[IPV6]: Support several new sockopt / ancillary data in Advanced API (RFC3542). Support several new socket options / ancillary data: IPV6_RECVPKTINFO, IPV6_PKTINFO, IPV6_RECVHOPOPTS, IPV6_HOPOPTS, IPV6_RECVDSTOPTS, IPV6_DSTOPTS, IPV6_RTHDRDSTOPTS, IPV6_RECVRTHDR, IPV6_RTHDR, IPV6_RECVHOPOPTS, IPV6_HOPOPTS Old semantics are preserved as IPV6_2292xxxx so that we can maintain backward compatibility. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>	2005-09-08 09:59:17 +09:00
YOSHIFUJI Hideaki	2dac4b96b9	[IPV6]: Repair Incoming Interface Handling for Raw Socket. Due to changes to enforce checking interface bindings, sockets did not see loopback packets bound for our local address on our interface. e.g.) When we ping6 fe80::1%eth0, skb->dev points loopback_dev while IP6CB(skb)->iif indicates eth0. This patch fixes the issue by using appropriate incoming interface, in the sense of scoping architecture. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-01 17:44:49 -07:00
Jesper Juhl	573dbd9596	[CRYPTO]: crypto_free_tfm() callers no longer need to check for NULL Since the patch to add a NULL short-circuit to crypto_free_tfm() went in, there's no longer any need for callers of that function to check for NULL. This patch removes the redundant NULL checks and also a few similar checks for NULL before calls to kfree() that I ran into while doing the crypto_free_tfm bits. I've succesfuly compile tested this patch, and a kernel with the patch applied boots and runs just fine. When I posted the patch to LKML (and other lists/people on Cc) it drew the following comments : J. Bruce Fields commented "I've no problem with the auth_gss or nfsv4 bits.--b." Sridhar Samudrala said "sctp change looks fine." Herbert Xu signed off on the patch. So, I guess this is ready to be dropped into -mm and eventually mainline. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-09-01 17:44:29 -07:00
Harald Welte	0ac4f893f2	[NETFILTER6]: Add new ip6tables HOPLIMIT target This target allows users to modify the hoplimit header field of the IPv6 header. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:13:29 -07:00
Eric Dumazet	ba89966c19	[NET]: use __read_mostly on kmem_cache_t , DEFINE_SNMP_STAT pointers This patch puts mostly read only data in the right section (read_mostly), to help sharing of these data between CPUS without memory ping pongs. On one of my production machine, tcp_statistics was sitting in a heavily modified cache line, so every SNMP update had to force a reload. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:11:18 -07:00
Patrick McHardy	05465343bf	[NETFILTER]: Add goto target Originally written by Henrik Nordstrom <hno@marasystems.com>, taken from netfilter patch-o-matic and added ip6_tables support. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:04:18 -07:00
Patrick McHardy	764d8a9f24	[NETFILTER]: Add IPv6 REJECT target Originally written by Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>, taken from netfilter patch-o-matic and fixed up to work with current kernels. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:04:12 -07:00
Arnaldo Carvalho de Melo	20380731bc	[NET]: Fix sparse warnings Of this type, mostly: CHECK net/ipv6/netfilter.c net/ipv6/netfilter.c:96:12: warning: symbol 'ipv6_netfilter_init' was not declared. Should it be static? net/ipv6/netfilter.c:101:6: warning: symbol 'ipv6_netfilter_fini' was not declared. Should it be static? Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:01:32 -07:00
Patrick McHardy	066286071d	[NETLINK]: Add "groups" argument to netlink_kernel_create Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:01:11 -07:00
Patrick McHardy	ac6d439d20	[NETLINK]: Convert netlink users to use group numbers instead of bitmasks Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:00:54 -07:00
Christoph Hellwig	34b4a4a624	[NETFILTER]: Remove tasklist_lock abuse in ipt{,6}owner Rip out cmd/sid/pid matching since its unfixable broken and stands in the way of locking changes to tasklist_lock. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:59:07 -07:00
Patrick McHardy	a61bbcf28a	[NET]: Store skb->timestamp as offset to a base timestamp Reduces skb size by 8 bytes on 64-bit. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:58:24 -07:00
Arnaldo Carvalho de Melo	5324a040cc	[INET6_HASHTABLES]: Move inet6_lookup functions to net/ipv6/inet6_hashtables.c Doing this we allow tcp_diag to support IPV6 even if tcp_diag is compiled statically and IPV6 is compiled as a module, removing the previous restriction while not building any IPV6 code if it is not selected. Now to work on the tcpdiag_register infrastructure and then to rename the whole thing to inetdiag, reflecting its by then completely generic nature. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:57:29 -07:00
Arnaldo Carvalho de Melo	505cbfc577	[IPV6]: Generalise the tcp_v6_lookup routines In the same way as was done with the v4 counterparts, this will be moved to inet6_hashtables.c. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:57:24 -07:00
Arnaldo Carvalho de Melo	6687e988d9	[ICSK]: Move TCP congestion avoidance members to icsk This changeset basically moves tcp_sk()->{ca_ops,ca_state,etc} to inet_csk(), minimal renaming/moving done in this changeset to ease review. Most of it is just changes of struct tcp_sock * to struct sock * parameters. With this we move to a state closer to two interesting goals: 1. Generalisation of net/ipv4/tcp_diag.c, becoming inet_diag.c, being used for any INET transport protocol that has struct inet_hashinfo and are derived from struct inet_connection_sock. Keeps the userspace API, that will just not display DCCP sockets, while newer versions of tools can support DCCP. 2. INET generic transport pluggable Congestion Avoidance infrastructure, using the current TCP CA infrastructure with DCCP. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:56:18 -07:00
Patrick McHardy	64ce207306	[NET]: Make NETDEBUG pure printk wrappers Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:56:08 -07:00
Arnaldo Carvalho de Melo	295ff7edb8	[TIMEWAIT]: Introduce inet_timewait_death_row That groups all of the tables and variables associated to the TCP timewait schedulling/recycling/killing code, that now can be isolated from the TCP specific code and used by other transport protocols, such as DCCP. Next changeset will move this code to net/ipv4/inet_timewait_sock.c Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:55:48 -07:00
Harald Welte	bbd86b9fc4	[NETFILTER]: add /proc/net/netfilter interface to nf_queue This patch adds a /proc/net/netfilter/nf_queue file, similar to the recently-added /proc/net/netfilter/nf_log. It indicates which queue handler is registered to which protocol family. This is useful since there are now multiple queue handlers in the treee (ip[6]_queue, nfnetlink_queue). Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:51:18 -07:00
Harald Welte	210a9ebef2	[NETFILTER]: ip{6}_queue: prevent unregistration race with nfnetlink_queue Since nfnetlink_queue can override ip{6}_queue as queue handlers, we can no longer blindly unregister whoever is registered for PF_INET[6], but only unregister ourselves. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:51:08 -07:00
Arnaldo Carvalho de Melo	0a5578cf8e	[ICSK]: Generalise tcp_listen_{start,stop} This also moved inet_iif from tcp to inet_hashtables.h, as it is needed by the inet_lookup callers, perhaps this needs a bit of polishing, but for now seems fine. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:49:24 -07:00
Arnaldo Carvalho de Melo	463c84b97f	[NET]: Introduce inet_connection_sock This creates struct inet_connection_sock, moving members out of struct tcp_sock that are shareable with other INET connection oriented protocols, such as DCCP, that in my private tree already uses most of these members. The functions that operate on these members were renamed, using a inet_csk_ prefix while not being moved yet to a new file, so as to ease the review of these changes. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:43:19 -07:00
Arnaldo Carvalho de Melo	8feaf0c0a5	[INET]: Generalise tcp_tw_bucket, aka TIME_WAIT sockets This paves the way to generalise the rest of the sock ID lookup routines and saves some bytes in TCPv4 TIME_WAIT sockets on distro kernels (where IPv6 is always built as a module): [root@qemu ~]# grep tw_sock /proc/slabinfo tw_sock_TCPv6 0 0 128 31 1 tw_sock_TCP 0 0 96 41 1 [root@qemu ~]# Now if a protocol wants to use the TIME_WAIT generic infrastructure it only has to set the sk_prot->twsk_obj_size field with the size of its inet_timewait_sock derived sock and proto_register will create sk_prot->twsk_slab, for now its only for INET sockets, but we can introduce timewait_sock later if some non INET transport protocolo wants to use this stuff. Next changesets will take advantage of this new infrastructure to generalise even more TCP code. [acme@toy net-2.6.14]$ grep built-in /tmp/before.size /tmp/after.size /tmp/before.size: 188646 11764 5068 205478 322a6 net/ipv4/built-in.o /tmp/after.size: 188144 11764 5068 204976 320b0 net/ipv4/built-in.o [acme@toy net-2.6.14]$ Tested with both IPv4 & IPv6 (::1 (localhost) & ::ffff:172.20.0.1 (qemu host)). Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:42:13 -07:00
Arnaldo Carvalho de Melo	c752f0739f	[TCP]: Move the tcp sock states to net/tcp_states.h Lots of places just needs the states, not even linux/tcp.h, where this enum was, needs it. This speeds up development of the refactorings as less sources are rebuilt when things get moved from net/tcp.h. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:41:54 -07:00
Arnaldo Carvalho de Melo	f3f05f7046	[INET]: Generalise the tcp_listen_ lock routines Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:41:49 -07:00
Arnaldo Carvalho de Melo	6e04e02165	[INET]: Move tcp_port_rover to inet_hashinfo Also expose all of the tcp_hashinfo members, i.e. killing those tcp_ehash, etc macros, this will more clearly expose already generic functions and some that need just a bit of work to become generic, as we'll see in the upcoming changesets. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:41:44 -07:00
Arnaldo Carvalho de Melo	2d8c4ce519	[INET]: Generalise tcp_bind_hash & tcp_inherit_port This required moving tcp_bucket_cachep to inet_hashinfo. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:40:29 -07:00
Arnaldo Carvalho de Melo	a55ebcc4c4	[INET]: Move bind_hash from tcp_sk to inet_sk This should really be in a inet_connection_sock, but I'm leaving it for a later optimization, when some more fields common to INET transport protocols now in tcp_sk or inet_sk will be chunked out into inet_connection_sock, for now its better to concentrate on getting the changes in the core merged to leave the DCCP tree with only DCCP specific code. Next changesets will take advantage of this move to generalise things like tcp_bind_hash, tcp_put_port, tcp_inherit_port, making the later receive a inet_hashinfo parameter, and even __tcp_tw_hashdance, etc in the future, when tcp_tw_bucket gets transformed into the struct timewait_sock hierarchy. tcp_destroy_sock also is eligible as soon as tcp_orphan_count gets moved to sk_prot. A cascade of incremental changes will ultimately make the tcp_lookup functions be fully generic. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:38:48 -07:00
Arnaldo Carvalho de Melo	0f7ff9274e	[INET]: Just rename the TCP hashtable functions/structs to inet_ This is to break down the complexity of the series of patches, making it very clear that this one just does: 1. renames tcp_ prefixed hashtable functions and data structures that were already mostly generic to inet_ to share it with DCCP and other INET transport protocols. 2. Removes not used functions (__tb_head & tb_head) 3. Removes some leftover prototypes in the headers (tcp_bucket_unlock & tcp_v4_build_header) Next changesets will move tcp_sk(sk)->bind_hash to inet_sock so that we can make functions such as tcp_inherit_port, __tcp_inherit_port, tcp_v4_get_port, __tcp_put_port, generic and get others like tcp_destroy_sock closer to generic (tcp_orphan_count will go to sk->sk_prot to allow this). Eventually most of these functions will be used passing the transport protocol inet_hashinfo structure. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:38:32 -07:00
Harald Welte	608c8e4f7b	[NETFILTER]: Extend netfilter logging API This patch is in preparation to nfnetlink_log: - loggers now have to register struct nf_logger instead of nf_logfn - nf_log_unregister() replaced by nf_log_unregister_pf() and nf_log_unregister_logger() - add comment to ip[6]t_LOG.h to assure nobody redefines flags - add /proc/net/netfilter/nf_log to tell user which logger is currently registered for which address family - if user has configured logging, but no logging backend (logger) is available, always spit a message to syslog, not just the first time. - split ip[6]t_LOG.c into two parts: Backend: Always try to register as logger for the respective address family Frontend: Always log via nf_log_packet() API - modify all users of nf_log_packet() to accomodate additional argument Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:38:07 -07:00
Arnaldo Carvalho de Melo	32519f11d3	[INET]: Introduce inet_sk_rebuild_header From tcp_v4_rebuild_header, that already was pretty generic, I only needed to use sk->sk_protocol instead of the hardcoded IPPROTO_TCP and establish the requirement that INET transport layer protocols that want to use this function map TCP_SYN_SENT to its equivalent state. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:37:55 -07:00
Arnaldo Carvalho de Melo	e6848976b7	[NET]: Cleanup INET_REFCNT_DEBUG code Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:37:29 -07:00
Patrick McHardy	d13964f449	[IPV4/6]: Check if packet was actually delivered to a raw socket to decide whether to send an ICMP unreachable Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:37:22 -07:00
Andrew McDonald	0bd1b59b15	[IPV6]: Check interface bindings on IPv6 raw socket reception Take account of whether a socket is bound to a particular device when selecting an IPv6 raw socket to receive a packet. Also perform this check when receiving IPv6 packets with router alert options. Signed-off-by: Andrew McDonald <andrew@mcdonald.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:37:06 -07:00
Harald Welte	7af4cc3fa1	[NETFILTER]: Add "nfnetlink_queue" netfilter queue handler over nfnetlink - Add new nfnetlink_queue module - Add new ipt_NFQUEUE and ip6t_NFQUEUE modules to access queue numbers 1-65535 - Mark ip_queue and ip6_queue Kconfig options as OBSOLETE - Update feature-removal-schedule to remove ip[6]_queue in December Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:36:56 -07:00
Harald Welte	0ab43f8499	[NETFILTER]: Core changes required by upcoming nfnetlink_queue code - split netfiler verdict in 16bit verdict and 16bit queue number - add 'queuenum' argument to nf_queue_outfn_t and its users ip[6]_queue - move NFNL_SUBSYS_ definitions from enum to #define - introduce autoloading for nfnetlink subsystem modules - add MODULE_ALIAS_NFNL_SUBSYS macro - add nf_unregister_queue_handlers() to register all handlers for a given nf_queue_outfn_t - add more verbose DEBUGP macro definition to nfnetlink.c - make nfnetlink_subsys_register fail if subsys already exists - add some more comments and debug statements to nfnetlink.c Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:36:49 -07:00
Harald Welte	2cc7d57309	[NETFILTER]: Move reroute-after-queue code up to the nf_queue layer. The rerouting functionality is required by the core, therefore it has to be implemented by the core and not in individual queue handlers. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:36:19 -07:00
Harald Welte	4fdb3bb723	[NETLINK]: Add properly module refcounting for kernel netlink sockets. - Remove bogus code for compiling netlink as module - Add module refcounting support for modules implementing a netlink protocol - Add support for autoloading modules that implement a netlink protocol as soon as someone opens a socket for that protocol Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:35:08 -07:00
Harald Welte	020b4c12db	[NETFILTER]: Move ipv4 specific code from net/core/netfilter.c to net/ipv4/netfilter.c Netfilter cleanup - Move ipv4 code from net/core/netfilter.c to net/ipv4/netfilter.c - Move ipv6 netfilter code from net/ipv6/ip6_output.c to net/ipv6/netfilter.c Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:35:01 -07:00
Harald Welte	089af26c70	[NETFILTER]: Rename skb_ip_make_writable() to skb_make_writable() There is nothing IPv4-specific in it. In fact, it was already used by IPv6, too... Upcoming nfnetlink_queue code will use it for any kind of packet. Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:34:40 -07:00
David S. Miller	f2ccd8fa06	[NET]: Kill skb->real_dev Bonding just wants the device before the skb_bond() decapsulation occurs, so simply pass that original device into packet_type->func() as an argument. It remains to be seen whether we can use this same exact thing to get rid of skb->input_dev as well. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:32:25 -07:00
Harald Welte	6869c4d8e0	[NETFILTER]: reduce netfilter sk_buff enlargement As discussed at netconf'05, we're trying to save every bit in sk_buff. The patch below makes sk_buff 8 bytes smaller. I did some basic testing on my notebook and it seems to work. The only real in-tree user of nfcache was IPVS, who only needs a single bit. Unfortunately I couldn't find some other free bit in sk_buff to stuff that bit into, so I introduced a separate field for them. Maybe the IPVS guys can resolve that to further save space. Initially I wanted to shrink pkt_type to three bits (PACKET_HOST and alike are only 6 values defined), but unfortunately the bluetooth code overloads pkt_type :( The conntrack-event-api (out-of-tree) uses nfcache, but Rusty just came up with a way how to do it without any skb fields, so it's safe to remove it. - remove all never-implemented 'nfcache' code - don't have ipvs code abuse 'nfcache' field. currently get's their own compile-conditional skb->ipvs_property field. IPVS maintainers can decide to move this bit elswhere, but nfcache needs to die. - remove skb->nfcache field to save 4 bytes - move skb->nfctinfo into three unused bits to save further 4 bytes Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:31:04 -07:00
David S. Miller	d5d283751e	[TCP]: Document non-trivial locking path in tcp_v{4,6}_get_port(). This trips up a lot of folks reading this code. Put an unlikely() around the port-exhaustion test for good measure. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-23 10:49:54 -07:00
Patrick McHardy	66a79a19a7	[NETFILTER]: Fix HW checksum handling in ip_queue/ip6_queue The checksum needs to be filled in on output, after mangling a packet ip_summed needs to be reset. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-23 10:10:35 -07:00
Herbert Xu	6fc8b9e7c6	[IPCOMP]: Fix false smp_processor_id warning This patch fixes a false-positive from debug_smp_processor_id(). The processor ID is only used to look up crypto_tfm objects. Any processor ID is acceptable here as long as it is one that is iterated on by for_each_cpu(). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-18 14:36:59 -07:00
Patrick McHardy	fad87acaea	[IPV6]: Fix SKB leak in ip6_input_finish() Changing it to how ip_input handles should fix it. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-16 21:03:41 -07:00
Patrick McHardy	793245eeb9	[IPV6]: Fix raw socket hardware checksum failures When packets hit raw sockets the csum update isn't done yet, do it manually. Packets can also reach rawv6_rcv on the output path through ip6_call_ra_chain, in this case skb->ip_summed is CHECKSUM_NONE and this codepath isn't executed. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-16 20:39:38 -07:00
Herbert Xu	6fc0b4a7a7	[IPSEC]: Restrict socket policy loading to CAP_NET_ADMIN. The interface needs much redesigning if we wish to allow normal users to do this in some way. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-06 06:33:15 -07:00
Alexey Kuznetsov	db44575f6f	[NET]: fix oops after tunnel module unload Tunnel modules used to obtain module refcount each time when some tunnel was created, which meaned that tunnel could be unloaded only after all the tunnels are deleted. Since killing old MOD_*_USE_COUNT macros this protection has gone. It is possible to return it back as module_get/put, but it looks more natural and practically useful to force destruction of all the child tunnels on module unload. Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-07-30 17:46:44 -07:00
Olaf Hering	44456d37b5	[PATCH] turn many #if $undefined_string into #ifdef $undefined_string turn many #if $undefined_string into #ifdef $undefined_string to fix some warnings after -Wno-def was added to global CFLAGS Signed-off-by: Olaf Hering <olh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-27 16:26:08 -07:00
Cal Peake	227510c7f1	[IPV6]: fix implicit declaration of function `xfrm6_tunnel_unregister' Signed-off-by: Cal Peake <cp@absolutedigital.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-07-24 19:30:06 -07:00
Patrick McHardy	d3984a6b6a	[NETFILTER]: Fix ip6t_LOG MAC format I broke this in the patch that consolidated MAC logging. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-07-22 12:52:47 -07:00
Patrick McHardy	4c1217deeb	[NETFILTER]: Fix deadlock in ip6_queue Already fixed in ip_queue, ip6_queue was missed. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-07-22 12:49:30 -07:00
Patrick McHardy	0303770deb	[NET]: Make ipip/ip6_tunnel independant of XFRM Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-07-19 14:03:34 -07:00
Sam Ravnborg	6a2e9b738c	[NET]: move config options out to individual protocols Move the protocol specific config options out to the specific protocols. With this change net/Kconfig now starts to become readable and serve as a good basis for further re-structuring. The menu structure is left almost intact, except that indention is fixed in most cases. Most visible are the INET changes where several "depends on INET" are replaced with a single ifdef INET / endif pair. Several new files were created to accomplish this change - they are small but serve the purpose that config options are now distributed out where they belongs. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-07-11 21:13:56 -07:00
David S. Miller	9c05989bb2	[IPV6]: Fix warning in ip6_mc_msfilter. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-07-08 21:44:39 -07:00
David L Stevens	9951f036fe	[IPV4]: (INCLUDE,empty)/leave-group equivalence for full-state MSF APIs & errno fix 1) Adds (INCLUDE, empty)/leave-group equivalence to the full-state multicast source filter APIs (IPv4 and IPv6) 2) Fixes an incorrect errno in the IPv6 leave-group (ENOENT should be EADDRNOTAVAIL) Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-07-08 17:47:28 -07:00
David L Stevens	917f2f105e	[IPV4]: multicast API "join" issues 1) In the full-state API when imsf_numsrc == 0 errno should be "0", but returns EADDRNOTAVAIL 2) An illegal filter mode change errno should be EINVAL, but returns EADDRNOTAVAIL 3) Trying to do an any-source option without IP_ADD_MEMBERSHIP errno should be EINVAL, but returns EADDRNOTAVAIL 4) Adds comments for the less obvious error return values Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-07-08 17:45:16 -07:00
David S. Miller	c1b4a7e695	[TCP]: Move to new TSO segmenting scheme. Make TSO segment transmit size decisions at send time not earlier. The basic scheme is that we try to build as large a TSO frame as possible when pulling in the user data, but the size of the TSO frame output to the card is determined at transmit time. This is guided by tp->xmit_size_goal. It is always set to a multiple of MSS and tells sendmsg/sendpage how large an SKB to try and build. Later, tcp_write_xmit() and tcp_push_one() chop up the packet if necessary and conditions warrant. These routines can also decide to "defer" in order to wait for more ACKs to arrive and thus allow larger TSO frames to be emitted. A general observation is that TSO elongates the pipe, thus requiring a larger congestion window and larger buffering especially at the sender side. Therefore, it is important that applications 1) get a large enough socket send buffer (this is accomplished by our dynamic send buffer expansion code) 2) do large enough writes. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-07-05 15:24:38 -07:00
Herbert Xu	e2ed4052aa	[IPV6]: Makes IPv6 rcv registration happen last during initialisation. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-07-05 14:41:20 -07:00
Thomas Graf	e176fe8954	[NET]: Remove unused security member in sk_buff Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-07-05 14:12:44 -07:00
YOSHIFUJI Hideaki	7fe40f73d7	[IPV6]: remove more unused IPV6_AUTHHDR things. Remove two more unused IPV6_AUTHHDR option things, which I failed to remove them last time, plus, mark IPV6_AUTHHDR obsolete. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-28 15:46:24 -07:00
YOSHIFUJI Hideaki	ae9cda5d65	[IPV6]: Don't dump temporary addresses twice Each IPv6 Temporary Address (w/ CONFIG_IPV6_PRIVACY) is dumped twice to netlink. Because temporary addresses are listed in idev->addr_list, there's no need to dump idev->tempaddr separately. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-28 13:00:30 -07:00
Patrick McHardy	8a47077a0b	[NETLINK]: Missing padding fields in dumped structures Plug holes with padding fields and initialized them to zero. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-28 12:56:45 -07:00
Patrick McHardy	9ef1d4c7c7	[NETLINK]: Missing initializations in dumped data Mostly missing initialization of padding fields of 1 or 2 bytes length, two instances of uninitialized nlmsgerr->msg of 16 bytes length. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-28 12:55:30 -07:00
Stephen Hemminger	5f8ef48d24	[TCP]: Allow choosing TCP congestion control via sockopt. Allow using setsockopt to set TCP congestion control to use on a per socket basis. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-23 20:37:36 -07:00
Stephen Hemminger	317a76f9a4	[TCP]: Add pluggable congestion control algorithm infrastructure. Allow TCP to have multiple pluggable congestion control algorithms. Algorithms are defined by a set of operations and can be built in or modules. The legacy "new RENO" algorithm is used as a starting point and fallback. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-23 12:19:55 -07:00
Paulo Marques	543537bd92	[PATCH] create a kstrdup library function This patch creates a new kstrdup library function and changes the "local" implementations in several places to use this function. Most of the changes come from the sound and net subsystems. The sound part had already been acknowledged by Takashi Iwai and the net part by David S. Miller. I left UML alone for now because I would need more time to read the code carefully before making changes there. Signed-off-by: Paulo Marques <pmarques@grupopie.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:18 -07:00
Patrick McHardy	047601bf7c	[NETFILTER]: Fix ip6t_LOG sit tunnel logging Sit tunnel logging is currently broken: MAC=01:23:45:67:89:ab->01:23:45:47:89:ac TUNNEL=123.123. 0.123-> 12.123. 6.123 Apart from the broken IP address, MAC addresses are printed differently for sit tunnels than for everything else. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-21 14:07:13 -07:00
Patrick McHardy	97216c799a	[NETFILTER]: Missing owner-field initialization in ip6table_raw I missed this one when fixing up iptable_raw. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-21 14:03:01 -07:00
Patrick McHardy	e98231858b	[NETFILTER]: Restore netfilter assumptions in IPv6 multicast Netfilter assumes that skb->data == skb->nh.ipv6h Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-21 14:02:15 -07:00
Patrick McHardy	18b8afc771	[NETFILTER]: Kill nf_debug Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-21 14:01:57 -07:00
Patrick McHardy	e45b1be8bc	[NETFILTER]: Kill lockhelp.h Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-21 14:01:30 -07:00
David L Stevens	c9e3e8b695	[IPV6]: multicast join and misc Here is a simplified version of the patch to fix a bug in IPv6 multicasting. It: 1) adds existence check & EADDRINUSE error for regular joins 2) adds an exception for EADDRINUSE in the source-specific multicast join (where a prior join is ok) 3) adds a missing/needed read_lock on sock_mc_list; would've raced with destroying the socket on interface down without 4) adds a "leave group" in the (INCLUDE, empty) source filter case. This frees unneeded socket buffer memory, but also prevents an inappropriate interaction among the 8 socket options that mess with this. Some would fail as if in the group when you aren't really. Item #4 had a locking bug in the last version of this patch; rather than removing the idev->lock read lock only, I've simplified it to remove all lock state in the path and treat it as a direct "leave group" call for the (INCLUDE,empty) case it covers. Tested on an MP machine. :-) Much thanks to HoerdtMickael <hoerdt@clarinet.u-strasbg.fr> who reported the original bug. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-21 13:58:25 -07:00
Jamal Hadi Salim	0d51aa80a9	[IPV6]: V6 route events reported with wrong netlink PID and seq number Essentially netlink at the moment always reports a pid and sequence of 0 always for v6 route activities. To understand the repurcassions of this look at: http://lists.quagga.net/pipermail/quagga-dev/2005-June/003507.html While fixing this, i took the liberty to resolve the outstanding issue of IPV6 routes inserted via ioctls to have the correct pids as well. This patch tries to behave as close as possible to the v4 routes i.e maintains whatever PID the socket issuing the command owns as opposed to the process. That made the patch a little bulky. I have tested against both netlink derived utility to add/del routes as well as ioctl derived one. The Quagga folks have tested against quagga. This fixes the problem and so far hasnt been detected to introduce any new issues. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-21 13:51:04 -07:00
Herbert Xu	72cb6962a9	[IPSEC]: Add xfrm_init_state This patch adds xfrm_init_state which is simply a wrapper that calls xfrm_get_type and subsequently x->type->init_state. It also gets rid of the unused args argument. Abstracting it out allows us to add common initialisation code, e.g., to set family-specific flags. The add_time setting in xfrm_user.c was deleted because it's already set by xfrm_state_alloc. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: James Morris <jmorris@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-20 13:18:08 -07:00
Herbert Xu	e0f9f8586a	[IPV4/IPV6]: Replace spin_lock_irq with spin_lock_bh In light of my recent patch to net/ipv4/udp.c that replaced the spin_lock_irq calls on the receive queue lock with spin_lock_bh, here is a similar patch for all other occurences of spin_lock_irq on receive/error queue locks in IPv4 and IPv6. In these stacks, we know that they can only be entered from user or softirq context. Therefore it's safe to disable BH only. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-18 22:56:18 -07:00
Jamal Hadi Salim	9ed19f339e	[NETLINK]: Set correct pid for ioctl originating netlink events This patch ensures that netlink events created as a result of programns using ioctls (such as ifconfig, route etc) contains the correct PID of those events. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-18 22:55:51 -07:00
Jamal Hadi Salim	e431b8c004	[NETLINK]: Explicit typing This patch converts "unsigned flags" to use more explict types like u16 instead and incrementally introduces NLMSG_NEW(). Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-18 22:55:31 -07:00
Jamal Hadi Salim	b6544c0b4c	[NETLINK]: Correctly set NLM_F_MULTI without checking the pid This patch rectifies some rtnetlink message builders that derive the flags from the pid. It is now explicit like the other cases which get it right. Also fixes half a dozen dumpers which did not set NLM_F_MULTI at all. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-18 22:54:12 -07:00
Arnaldo Carvalho de Melo	2ad69c55a2	[NET] rename struct tcp_listen_opt to struct listen_sock Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-18 22:48:55 -07:00
Arnaldo Carvalho de Melo	0e87506fcc	[NET] Generalise tcp_listen_opt This chunks out the accept_queue and tcp_listen_opt code and moves them to net/core/request_sock.c and include/net/request_sock.h, to make it useful for other transport protocols, DCCP being the first one to use it. Next patches will rename tcp_listen_opt to accept_sock and remove the inline tcp functions that just call a reqsk_queue_ function. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-18 22:47:59 -07:00
Arnaldo Carvalho de Melo	60236fdd08	[NET] Rename open_request to request_sock Ok, this one just renames some stuff to have a better namespace and to dissassociate it from TCP: struct open_request -> struct request_sock tcp_openreq_alloc -> reqsk_alloc tcp_openreq_free -> reqsk_free tcp_openreq_fastfree -> __reqsk_free With this most of the infrastructure closely resembles a struct sock methods subset. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-18 22:47:21 -07:00
Arnaldo Carvalho de Melo	2e6599cb89	[NET] Generalise TCP's struct open_request minisock infrastructure Kept this first changeset minimal, without changing existing names to ease peer review. Basicaly tcp_openreq_alloc now receives the or_calltable, that in turn has two new members: ->slab, that replaces tcp_openreq_cachep ->obj_size, to inform the size of the openreq descendant for a specific protocol The protocol specific fields in struct open_request were moved to a class hierarchy, with the things that are common to all connection oriented PF_INET protocols in struct inet_request_sock, the TCP ones in tcp_request_sock, that is an inet_request_sock, that is an open_request. I.e. this uses the same approach used for the struct sock class hierarchy, with sk_prot indicating if the protocol wants to use the open_request infrastructure by filling in sk_prot->rsk_prot with an or_calltable. Results? Performance is improved and TCP v4 now uses only 64 bytes per open request minisock, down from 96 without this patch :-) Next changeset will rename some of the structs, fields and functions mentioned above, struct or_calltable is way unclear, better name it struct request_sock_ops, s/struct open_request/struct request_sock/g, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-18 22:46:52 -07:00
R�mi Denis-Courmont	77bd91967a	[IPv6] Don't generate temporary for TUN devices Userland layer-2 tunneling devices allocated through the TUNTAP driver (drivers/net/tun.c) have a type of ARPHRD_NONE, and have no link-layer address. The kernel complains at regular interval when IPv6 Privacy extension are enabled because it can't find an hardware address : Dec 29 11:02:04 auguste kernel: __ipv6_regen_rndid(idev=cb3e0c00): cannot get EUI64 identifier; use random bytes. IPv6 Privacy extensions should probably be disabled on that sort of device. They won't work anyway. If userland wants a more usual Ethernet-ish interface with usual IPv6 autoconfiguration, it will use a TAP device with an emulated link-layer and a random hardware address rather than a TUN device. As far as I could fine, TUN virtual device from TUNTAP is the very only sort of device using ARPHRD_NONE as kernel device type. Signed-off-by: R�mi Denis-Courmont <rdenis@simphalempin.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-13 15:01:34 -07:00
YOSHIFUJI Hideaki	84427d5330	[IPV6]: Ensure to use icmpv6_socket in non-preemptive context. We saw following trace several times: \|BUG: using smp_processor_id() in preemptible [00000001] code: httpd/30137 \|caller is icmpv6_send+0x23/0x540 \| [<c01ad63b>] smp_processor_id+0x9b/0xb8 \| [<c02993e7>] icmpv6_send+0x23/0x540 This is because of icmpv6_socket, which is the only one user of smp_processor_id() in icmpv6_send(), AFAIK. Since it should be used in non-preemptive context, let's defer the dereference after disabling preemption (by icmpv6_xmit_lock()). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-13 14:59:44 -07:00
Gabor Fekete	8181b8c1f3	[IPV6]: Update parm.link in ip6ip6_tnl_change() Signed-off-by: Gabor Fekete <gfekete@cc.jyu.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-08 14:54:38 -07:00
Adrian Bunk	4fef0304ee	[IPV6]: Kill export of fl6_sock_lookup. There is no usage of this EXPORT_SYMBOL in the kernel. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-02 13:06:36 -07:00
David S. Miller	6c94d3611b	[IPV6]: Clear up user copy warning in flowlabel code. We are intentionally ignoring the copy_to_user() value, make it clear to the compiler too. Noted by Jeff Garzik. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-29 20:28:01 -07:00
Hideaki YOSHIFUJI	92d63decc0	From: Kazunori Miyazawa <kazunori@miyazawa.org> [XFRM] Call dst_check() with appropriate cookie This fixes infinite loop issue with IPv6 tunnel mode. Signed-off-by: Kazunori Miyazawa <kazunori@miyazawa.org> Signed-off-by: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-26 12:58:04 -07:00
Herbert Xu	180e425033	[IPV6]: Fix xfrm tunnel oops with large packets Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-23 13:11:07 -07:00
Herbert Xu	2fdba6b085	[IPV4/IPV6] Ensure all frag_list members have NULL sk Having frag_list members which holds wmem of an sk leads to nightmares with partially cloned frag skb's. The reason is that once you unleash a skb with a frag_list that has individual sk ownerships into the stack you can never undo those ownerships safely as they may have been cloned by things like netfilter. Since we have to undo them in order to make skb_linearize happy this approach leads to a dead-end. So let's go the other way and make this an invariant: For any skb on a frag_list, skb->sk must be NULL. That is, the socket ownership always belongs to the head skb. It turns out that the implementation is actually pretty simple. The above invariant is actually violated in the following patch for a short duration inside ip_fragment. This is OK because the offending frag_list member is either destroyed at the end of the slow path without being sent anywhere, or it is detached from the frag_list before being sent. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-18 22:52:33 -07:00
Herbert Xu	aabc9761b6	[IPSEC]: Store idev entries I found a bug that stopped IPsec/IPv6 from working. About a month ago IPv6 started using rt6i_idev->dev on the cached socket dst entries. If the cached socket dst entry is IPsec, then rt6i_idev will be NULL. Since we want to look at the rt6i_idev of the original route in this case, the easiest fix is to store rt6i_idev in the IPsec dst entry just as we do for a number of other IPv6 route attributes. Unfortunately this means that we need some new code to handle the references to rt6i_idev. That's why this patch is bigger than it would otherwise be. I've also done the same thing for IPv4 since it is conceivable that once these idev attributes start getting used for accounting, we probably need to dereference them for IPv4 IPsec entries too. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 16:27:10 -07:00
Herbert Xu	2a0a6ebee1	[NETLINK]: Synchronous message processing. Let's recap the problem. The current asynchronous netlink kernel message processing is vulnerable to these attacks: 1) Hit and run: Attacker sends one or more messages and then exits before they're processed. This may confuse/disable the next netlink user that gets the netlink address of the attacker since it may receive the responses to the attacker's messages. Proposed solutions: a) Synchronous processing. b) Stream mode socket. c) Restrict/prohibit binding. 2) Starvation: Because various netlink rcv functions were written to not return until all messages have been processed on a socket, it is possible for these functions to execute for an arbitrarily long period of time. If this is successfully exploited it could also be used to hold rtnl forever. Proposed solutions: a) Synchronous processing. b) Stream mode socket. Firstly let's cross off solution c). It only solves the first problem and it has user-visible impacts. In particular, it'll break user space applications that expect to bind or communicate with specific netlink addresses (pid's). So we're left with a choice of synchronous processing versus SOCK_STREAM for netlink. For the moment I'm sticking with the synchronous approach as suggested by Alexey since it's simpler and I'd rather spend my time working on other things. However, it does have a number of deficiencies compared to the stream mode solution: 1) User-space to user-space netlink communication is still vulnerable. 2) Inefficient use of resources. This is especially true for rtnetlink since the lock is shared with other users such as networking drivers. The latter could hold the rtnl while communicating with hardware which causes the rtnetlink user to wait when it could be doing other things. 3) It is still possible to DoS all netlink users by flooding the kernel netlink receive queue. The attacker simply fills the receive socket with a single netlink message that fills up the entire queue. The attacker then continues to call sendmsg with the same message in a loop. Point 3) can be countered by retransmissions in user-space code, however it is pretty messy. In light of these problems (in particular, point 3), we should implement stream mode netlink at some point. In the mean time, here is a patch that implements synchronous processing. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 14:55:09 -07:00
Folkert van Heusden	c3924c70dd	[TCP]: Optimize check in port-allocation code, v6 version. Signed-off-by: Folkert van Heusden <folkert@vanheusden.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 14:36:45 -07:00
Thomas Graf	db46edc6d3	[RTNETLINK] Cleanup rtnetlink_link tables Converts remaining rtnetlink_link tables to use c99 designated initializers to make greping a little bit easier. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 14:29:39 -07:00
Herbert Xu	679a873824	[IPV6]: Fix raw socket checksums with IPsec I made a mistake in my last patch to the raw socket checksum code. I used the value of inet->cork.length as the length of the payload. While this works with normal packets, it breaks down when IPsec is present since the cork length includes the extension header length. So here is a patch to fix the length calculations. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 14:24:36 -07:00
Dave Jones	89c8b3a110	[IPV6]: Incorrect permissions on route flush sysctl On Mon, Apr 25, 2005 at 12:01:13PM -0400, Dave Jones wrote: > This has been brought up before.. http://lkml.org/lkml/2000/1/21/116 > but didnt seem to get resolved. This morning I got someone > file a bugzilla about it breaking sysctl(8). And here's its ipv6 counterpart. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-04-28 12:11:49 -07:00
Al Viro	b453257f05	[PATCH] kill gratitious includes of major.h under net/* A lot of places in there are including major.h for no reason whatsoever. Removed. And yes, it still builds. The history of that stuff is often amusing. E.g. for net/core/sock.c the story looks so, as far as I've been able to reconstruct it: we used to need major.h in net/socket.c circa 1.1.early. In 1.1.13 that need had disappeared, along with register_chrdev(SOCKET_MAJOR, "socket", &net_fops) in sock_init(). Include had not. When 1.2 -> 1.3 reorg of net/* had moved a lot of stuff from net/socket.c to net/core/sock.c, this crap had followed... Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-25 18:32:13 -07:00
Arnaldo Carvalho de Melo	edec231a8a	[IPV6]: export inet6_sock_nr Please apply, SCTP/DCCP needs this when INET_REFCNT_DEBUG is set. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-04-24 20:22:28 -07:00
Herbert Xu	0d3d077cd4	[SELINUX]: Fix ipv6_skip_exthdr() invocation causing OOPS. The SELinux hooks invoke ipv6_skip_exthdr() with an incorrect length final argument. However, the length argument turns out to be superfluous. I was just reading ipv6_skip_exthdr and it occured to me that we can get rid of len altogether. The only place where len is used is to check whether the skb has two bytes for ipv6_opt_hdr. This check is done by skb_header_pointer/skb_copy_bits anyway. Now it might appear that we've made the code slower by deferring the check to skb_copy_bits. However, this check should not trigger in the common case so this is OK. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-04-24 20:16:19 -07:00
Herbert Xu	3320da8906	[IPV6]: Replace bogus instances of inet->recverr While looking at this problem I noticed that IPv6 was sometimes looking at inet->recverr which is bogus. Here is a patch to correct that and use np->recverr. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-04-19 22:32:22 -07:00
Herbert Xu	357b40a18b	[IPV6]: IPV6_CHECKSUM socket option can corrupt kernel memory So here is a patch that introduces skb_store_bits -- the opposite of skb_copy_bits, and uses them to read/write the csum field in rawv6. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-04-19 22:30:14 -07:00
YOSHIFUJI Hideaki	fd92833a52	[IPV6]: Fix a branch prediction From: Tushar Gohad <tgohad@mvista.com> Signed-off-by: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-04-19 22:27:09 -07:00
Linus Torvalds	1da177e4c3	Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!	2005-04-16 15:20:36 -07:00

... 104 105 106 107 108 ...

6291 Commits