linux

Author	SHA1	Message	Date
Michal Marek	72c7664f92	ipvs: Kconfig cleanup IP_VS_PROTO_AH_ESP should be set iff either of IP_VS_PROTO_{AH,ESP} is selected. Express this with standard kconfig syntax. Signed-off-by: Michal Marek <mmarek@suse.cz> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-07-05 10:42:37 +02:00
David S. Miller	e490c1defe	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6	2010-07-02 22:42:06 -07:00
David S. Miller	8244132ea8	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/ipv4/ip_output.c	2010-06-23 18:26:27 -07:00
Nick Chalk	26ec037f98	IPVS: one-packet scheduling Allow one-packet scheduling for UDP connections. When the fwmark-based or normal virtual service is marked with '-o' or '--ops' options all connections are created only to schedule one packet. Useful to schedule UDP packets from same client port to different real servers. Recommended with RR or WRR schedulers (the connections are not visible with ipvsadm -L). Signed-off-by: Nick Chalk <nick@loadbalancer.org> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-06-22 08:07:01 +02:00
Changli Gao	d8d1f30b95	net-next: remove useless union keyword remove useless union keyword in rtable, rt6_info and dn_route. Since there is only one member in a union, the union keyword isn't useful. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-06-10 23:31:35 -07:00
Sven Wegener	aea9d711f3	ipvs: Add missing locking during connection table hashing and unhashing The code that hashes and unhashes connections from the connection table is missing locking of the connection being modified, which opens up a race condition and results in memory corruption when this race condition is hit. Here is what happens in pretty verbose form: CPU 0 CPU 1 ------------ ------------ An active connection is terminated and we schedule ip_vs_conn_expire() on this CPU to expire this connection. IRQ assignment is changed to this CPU, but the expire timer stays scheduled on the other CPU. New connection from same ip:port comes in right before the timer expires, we find the inactive connection in our connection table and get a reference to it. We proper lock the connection in tcp_state_transition() and read the connection flags in set_tcp_state(). ip_vs_conn_expire() gets called, we unhash the connection from our connection table and remove the hashed flag in ip_vs_conn_unhash(), without proper locking! While still holding proper locks we write the connection flags in set_tcp_state() and this sets the hashed flag again. ip_vs_conn_expire() fails to expire the connection, because the other CPU has incremented the reference count. We try to re-insert the connection into our connection table, but this fails in ip_vs_conn_hash(), because the hashed flag has been set by the other CPU. We re-schedule execution of ip_vs_conn_expire(). Now this connection has the hashed flag set, but isn't actually hashed in our connection table and has a dangling list_head. We drop the reference we held on the connection and schedule the expire timer for timeouting the connection on this CPU. Further packets won't be able to find this connection in our connection table. ip_vs_conn_expire() gets called again, we think it's already hashed, but the list_head is dangling and while removing the connection from our connection table we write to the memory location where this list_head points to. The result will probably be a kernel oops at some other point in time. This race condition is pretty subtle, but it can be triggered remotely. It needs the IRQ assignment change or another circumstance where packets coming from the same ip:port for the same service are being processed on different CPUs. And it involves hitting the exact time at which ip_vs_conn_expire() gets called. It can be avoided by making sure that all packets from one connection are always processed on the same CPU and can be made harder to exploit by changing the connection timeouts to some custom values. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Cc: stable@kernel.org Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-06-09 16:10:57 +02:00
Patrick McHardy	1e4b105712	Merge branch 'master' of /repos/git/net-next-2.6 Conflicts: net/bridge/br_device.c net/bridge/br_forward.c Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-05-10 18:39:28 +02:00
Eric Dumazet	aa39514516	net: sk_sleep() helper Define a new function to return the waitqueue of a "struct sock". static inline wait_queue_head_t sk_sleep(struct sock sk) { return sk->sk_sleep; } Change all read occurrences of sk_sleep by a call to this function. Needed for a future RCU conversion. sk_sleep wont be a field directly available. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-20 16:37:13 -07:00
Patrick McHardy	6291055465	Merge branch 'master' of /repos/git/net-next-2.6 Conflicts: Documentation/feature-removal-schedule.txt net/ipv6/netfilter/ip6t_REJECT.c net/netfilter/xt_limit.c Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-20 16:02:01 +02:00
Patrick McHardy	3d91c1a848	IPVS: fix potential stack overflow with overly long protocol names When protocols use very long names, the sprintf calls might overflow the on-stack buffer. No protocol in the kernel does this however. Print the protocol name in the pr_debug statement directly to avoid this. Based on patch by Zhitong Wang <zhitong.wangzt@alibaba-inc.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-04-08 13:35:47 +02:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Jan Engelhardt	7911b5c75b	netfilter: ipvs: use NFPROTO values for NF_HOOK invocation Semantic patch: // <smpl> @@ @@ IP_VS_XMIT( -PF_INET6, +NFPROTO_IPV6, ...) @@ @@ IP_VS_XMIT( -PF_INET, +NFPROTO_IPV4, ...) // </smpl> Signed-off-by: Jan Engelhardt <jengelh@medozas.de>	2010-03-25 16:03:07 +01:00
Joe Perches	1da05f50f6	netfilter: net/netfilter/ipvs/ip_vs_ftp.c: Remove use of NIPQUAD NIPQUAD has very few uses left. Remove this use and make the code have the identical form of the only other use of "%u,%u,%u,%u,%u,%u" in net/ipv4/netfilter/nf_nat_ftp.c Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-03-15 18:03:05 +01:00
David S. Miller	38bdbd8efc	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6	2010-02-26 09:31:09 -08:00
Simon Horman	51f0bc7868	IPVS: ip_vs_lblcr: use list headA Use list_head rather than a custom list implementation. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-02-26 17:45:14 +01:00
Alexey Dobriyan	3ffe533c87	ipv6: drop unused "dev" arg of icmpv6_send() Dunno, what was the idea, it wasn't used for a long time. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-18 14:30:17 -08:00
Venkata Mohan Reddy	2906f66a56	ipvs: SCTP Trasport Loadbalancing Support Enhance IPVS to load balance SCTP transport protocol packets. This is done based on the SCTP rfc 4960. All possible control chunks have been taken care. The state machine used in this code looks some what lengthy. I tried to make the state machine easy to understand. Signed-off-by: Venkata Mohan Reddy Koppula <mohanreddykv@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-02-18 12:31:05 +01:00
Patrick McHardy	9ab99d5a43	Merge branch 'master' of /repos/git/net-next-2.6 Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-02-10 14:17:10 +01:00
Joe Perches	a79e7ac4ad	ipvs: use standardized format in sprintf Use the same format string as net/ipv4/netfilter/nf_nat_ftp.c to encode an ipv4 address and port. Both uses should be a single common function. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-01-11 11:53:31 +01:00
Catalin(ux) M. BOIE	6f7edb4881	IPVS: Allow boot time change of hash size I was very frustrated about the fact that I have to recompile the kernel to change the hash size. So, I created this patch. If IPVS is built-in you can append ip_vs.conn_tab_bits=?? to kernel command line, or, if you built IPVS as modules, you can add options ip_vs conn_tab_bits=??. To keep everything backward compatible, you still can select the size at compile time, and that will be used as default. It has been about a year since this patch was originally posted and subsequently dropped on the basis of insufficient test data. Mark Bergsma has provided the following test results which seem to strongly support the need for larger hash table sizes: We do however run into the same problem with the default setting (212 = 4096 entries), as most of our LVS balancers handle around a million connections/SLAB entries at any point in time (around 100-150 kpps load). With only 4096 hash table entries this implies that each entry consists of a linked list of 256 connections on average. To provide some statistics, I did an oprofile run on an 2.6.31 kernel, with both the default 4096 table size, and the same kernel recompiled with IP_VS_CONN_TAB_BITS set to 18 (218 = 262144 entries). I built a quick test setup with a part of Wikimedia/Wikipedia's live traffic mirrored by the switch to the test host. With the default setting, at ~ 120 kpps packet load we saw a typical %si CPU usage of around 30-35%, and oprofile reported a hot spot in ip_vs_conn_in_get: samples % image name app name symbol name 1719761 42.3741 ip_vs.ko ip_vs.ko ip_vs_conn_in_get 302577 7.4554 bnx2 bnx2 /bnx2 181984 4.4840 vmlinux vmlinux __ticket_spin_lock 128636 3.1695 vmlinux vmlinux ip_route_input 74345 1.8318 ip_vs.ko ip_vs.ko ip_vs_conn_out_get 68482 1.6874 vmlinux vmlinux mwait_idle After loading the recompiled kernel with 218 entries, %si CPU usage dropped in half to around 12-18%, and oprofile looks much healthier, with only 7% spent in ip_vs_conn_in_get: samples % image name app name symbol name 265641 14.4616 bnx2 bnx2 /bnx2 143251 7.7986 vmlinux vmlinux __ticket_spin_lock 140661 7.6576 ip_vs.ko ip_vs.ko ip_vs_conn_in_get 94364 5.1372 vmlinux vmlinux mwait_idle 86267 4.6964 vmlinux vmlinux ip_route_input [ horms@verge.net.au: trivial up-port and minor style fixes ] Signed-off-by: Catalin(ux) M. BOIE <catab@embedromix.ro> Cc: Mark Bergsma <mark@wikimedia.org> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-01-05 05:50:24 +01:00
Arjan van de Ven	04bcef2a83	ipvs: Add boundary check on ioctl arguments The ipvs code has a nifty system for doing the size of ioctl command copies; it defines an array with values into which it indexes the cmd to find the right length. Unfortunately, the ipvs code forgot to check if the cmd was in the range that the array provides, allowing for an index outside of the array, which then gives a "garbage" result into the length, which then gets used for copying into a stack buffer. Fix this by adding sanity checks on these as well as the copy size. [ horms@verge.net.au: adjusted limit to IP_VS_SO_GET_MAX ] Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>	2010-01-04 16:37:12 +01:00
Florian Fainelli	ae24e578de	ipvs: ip_vs_wrr.c: use lib/gcd.c Remove the private version of the greatest common divider to use lib/gcd.c, the latter also implementing the a < b case. [akpm@linux-foundation.org: repair neighboring whitespace because the diff looked odd] Signed-off-by: Florian Fainelli <florian@openwrt.org> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Takashi Iwai <tiwai@suse.de> Acked-by: Simon Horman <horms@verge.net.au> Cc: Julius Volz <juliusv@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-12-22 09:42:06 +01:00
Simon Horman	258c889362	ipvs: zero usvc and udest Make sure that any otherwise uninitialised fields of usvc are zero. This has been obvserved to cause a problem whereby the port of fwmark services may end up as a non-zero value which causes scheduling of a destination server to fail for persisitent services. As observed by Deon van der Merwe <dvdm@truteq.co.za>. This fix suggested by Julian Anastasov <ja@ssi.bg>. For good measure also zero udest. Cc: Deon van der Merwe <dvdm@truteq.co.za> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Cc: stable@kernel.org Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-12-15 17:01:25 +01:00
Xiaotian Feng	9abfe315de	ipvs: fix synchronization on connection close commit `9d3a0de` makes slaves expire as they would do on the master with much shorter timeouts. But it introduces another problem: When we close a connection, on master server the connection became CLOSE_WAIT/TIME_WAIT, it was synced to slaves, but if master is finished within it's timeouts (CLOSE), it will not be synced to slaves. Then slaves will be kept on CLOSE_WAIT/TIME_WAIT until timeout reaches. Thus we should also sync with CLOSE. Cc: Wensong Zhang <wensong@linux-vs.org> Cc: Simon Horman <horms@verge.net.au> Cc: Julian Anastasov <ja@ssi.bg> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-12-14 16:38:21 +01:00
Eric W. Biederman	f8572d8f2a	sysctl net: Remove unused binary sysctl code Now that sys_sysctl is a compatiblity wrapper around /proc/sys all sysctl strategy routines, and all ctl_name and strategy entries in the sysctl tables are unused, and can be revmoed. In addition neigh_sysctl_register has been modified to no longer take a strategy argument and it's callers have been modified not to pass one. Cc: "David Miller" <davem@davemloft.net> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: netdev@vger.kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-12 02:05:06 -08:00
Alexey Dobriyan	8d65af789f	sysctl: remove "struct file *" argument of ->proc_handler It's unused. It isn't needed -- read or write flag is already passed and sysctl shouldn't care about the rest. It _was_ used in two places at arch/frv for some reason. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "David S. Miller" <davem@davemloft.net> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-24 07:21:04 -07:00
Julius Volz	94b265514a	IPVS: Add handling of incoming ICMPV6 messages Add handling of incoming ICMPv6 messages. This follows the handling of IPv4 ICMP messages. Amongst ther things this problem allows IPVS to behave sensibly when an ICMPV6_PKT_TOOBIG message is received: This message is received when a realserver sends a packet >PMTU to the client. The hop on this path with insufficient MTU will generate an ICMPv6 Packet Too Big message back to the VIP. The LVS server receives this message, but the call to the function handling this has been missing. Thus, IPVS fails to forward the message to the real server, which then does not adjust the path MTU. This patch adds the missing call to ip_vs_in_icmp_v6() in ip_vs_in() to handle this situation. Thanks to Rob Gallagher from HEAnet for reporting this issue and for testing this patch in production (with direct routing mode). [horms@verge.net.au: tweaked changelog] Signed-off-by: Julius Volz <julius.volz@gmail.com> Tested-by: Rob Gallagher <robert.gallagher@heanet.ie> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-08-31 16:22:23 +02:00
Simon Horman	1e66dafc75	ipvs: Use atomic operations atomicly A pointed out by Shin Hong, IPVS doesn't always use atomic operations in an atomic manner. While this seems unlikely to be manifest in strange behaviour, it seems appropriate to clean this up. Cc: shin hong <hongshin@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-08-31 14:18:48 +02:00
Jan Engelhardt	36cbd3dcc1	net: mark read-only arrays as const String literals are constant, and usually, we can also tag the array of pointers const too, moving it to the .rodata section. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-08-05 10:42:58 -07:00
Hannes Eder	1e3e238e9c	IPVS: use pr_err and friends instead of IP_VS_ERR and friends Since pr_err and friends are used instead of printk there is no point in keeping IP_VS_ERR and friends. Furthermore make use of '__func__' instead of hard coded function names. Signed-off-by: Hannes Eder <heder@google.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-08-02 18:29:30 -07:00
Hannes Eder	9aada7ac04	IPVS: use pr_fmt While being at it cleanup whitespace. Signed-off-by: Hannes Eder <heder@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-30 14:29:44 -07:00
Johannes Berg	134e63756d	genetlink: make netns aware This makes generic netlink network namespace aware. No generic netlink families except for the controller family are made namespace aware, they need to be checked one by one and then set the family->netnsok member to true. A new function genlmsg_multicast_netns() is introduced to allow sending a multicast message in a given namespace, for example when it applies to an object that lives in that namespace, a new function genlmsg_multicast_allns() to send a message to all network namespaces (for objects that do not have an associated netns). The function genlmsg_multicast() is changed to multicast the message in just init_net, which is currently correct for all generic netlink families since they only work in init_net right now. Some will later want to work in all net namespaces because they do not care about the netns at all -- those will have to be converted to use one of the new functions genlmsg_multicast_allns() or genlmsg_multicast_netns() whenever they are made netns aware in some way. After this patch families can easily decide whether or not they should be available in all net namespaces. Many genl families us it for objects not related to networking and should therefore be available in all namespaces, but that will have to be done on a per family basis. Note that this doesn't touch on the checkpoint/restart problem where network namespaces could be used, genl families and multicast groups are numbered globally and I see no easy way of changing that, especially since it must be possible to multicast to all network namespaces for those families that do not care about netns. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-07-12 14:03:27 -07:00
Eric Dumazet	adf30907d6	net: skb->dst accessors Define three accessors to get/set dst attached to a skb struct dst_entry skb_dst(const struct sk_buff skb) void skb_dst_set(struct sk_buff skb, struct dst_entry dst) void skb_dst_drop(struct sk_buff *skb) This one should replace occurrences of : dst_release(skb->dst) skb->dst = NULL; Delete skb->dst field Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-03 02:51:04 -07:00
Michał Mirosław	8f698d5453	ipvs: Use genl_register_family_with_ops() Use genl_register_family_with_ops() instead of a copy. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-21 16:50:24 -07:00
Simon Horman	be8be9eccb	ipvs: Fix IPv4 FWMARK virtual services This fixes the use of fwmarks to denote IPv4 virtual services which was unfortunately broken as a result of the integration of IPv6 support into IPVS, which was included in 2.6.28. The problem arises because fwmarks are stored in the 4th octet of a union nf_inet_addr .all, however in the case of IPv4 only the first octet, corresponding to .ip, is assigned and compared. In other words, using .all = { 0, 0, 0, htonl(svc->fwmark) always results in a value of 0 (32bits) being stored for IPv4. This means that one fwmark can be used, as it ends up being mapped to 0, but things break down when multiple fwmarks are used, as they all end up being mapped to 0. As fwmarks are 32bits a reasonable fix seems to be to just store the fwmark in .ip, and comparing and storing .ip when fwmarks are used. This patch makes the assumption that in calls to ip_vs_ct_in_get() and ip_vs_sched_persist() if the proto parameter is IPPROTO_IP then we are dealing with an fwmark. I believe this is valid as ip_vs_in() does fairly strict filtering on the protocol and IPPROTO_IP should not be used in these calls unless explicitly passed when making these calls for fwmarks in ip_vs_sched_persist(). Tested-by: Fabien Duchêne <fabien.duchene@student.uclouvain.be> Cc: Joseph Mack NA3T <jmack@wm7d.net> Cc: Julius Volz <julius.volz@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-08 14:54:47 -07:00
Harvey Harrison	09640e6365	net: replace uses of __constant_{endian} Base versions handle constant folding now. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-01 00:45:17 -08:00
Simon Horman	68888d1053	IPVS: Make "no destination available" message more consistent between schedulers Acked-by: Graeme Fowler <graeme@graemef.net> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-29 18:37:36 -08:00
David S. Miller	7e452baf6b	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/message/fusion/mptlan.c drivers/net/sfc/ethtool.c net/mac80211/debugfs_sta.c	2008-11-11 15:43:02 -08:00
Harvey Harrison	b7b45f47d6	netfilter: payload_len is be16, add size of struct rather than size of pointer payload_len is a be16 value, not cpu_endian, also the size of a ponter to a struct ipv6hdr was being added, not the size of the struct itself. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-10 16:46:06 -08:00
Harvey Harrison	ca62059b7e	ipvs: oldlen, newlen should be be16, not be32 Noticed by sparse: net/netfilter/ipvs/ip_vs_proto_tcp.c:195:6: warning: incorrect type in argument 5 (different base types) net/netfilter/ipvs/ip_vs_proto_tcp.c:195:6: expected restricted __be16 [usertype] oldlen net/netfilter/ipvs/ip_vs_proto_tcp.c:195:6: got restricted __be32 [usertype] <noident> net/netfilter/ipvs/ip_vs_proto_tcp.c:196:6: warning: incorrect type in argument 6 (different base types) net/netfilter/ipvs/ip_vs_proto_tcp.c:196:6: expected restricted __be16 [usertype] newlen net/netfilter/ipvs/ip_vs_proto_tcp.c:196:6: got restricted __be32 [usertype] <noident> net/netfilter/ipvs/ip_vs_proto_tcp.c:270:6: warning: incorrect type in argument 5 (different base types) net/netfilter/ipvs/ip_vs_proto_tcp.c:270:6: expected restricted __be16 [usertype] oldlen net/netfilter/ipvs/ip_vs_proto_tcp.c:270:6: got restricted __be32 [usertype] <noident> net/netfilter/ipvs/ip_vs_proto_tcp.c:271:6: warning: incorrect type in argument 6 (different base types) net/netfilter/ipvs/ip_vs_proto_tcp.c:271:6: expected restricted __be16 [usertype] newlen net/netfilter/ipvs/ip_vs_proto_tcp.c:271:6: got restricted __be32 [usertype] <noident> net/netfilter/ipvs/ip_vs_proto_udp.c:206:6: warning: incorrect type in argument 5 (different base types) net/netfilter/ipvs/ip_vs_proto_udp.c:206:6: expected restricted __be16 [usertype] oldlen net/netfilter/ipvs/ip_vs_proto_udp.c:206:6: got restricted __be32 [usertype] <noident> net/netfilter/ipvs/ip_vs_proto_udp.c:207:6: warning: incorrect type in argument 6 (different base types) net/netfilter/ipvs/ip_vs_proto_udp.c:207:6: expected restricted __be16 [usertype] newlen net/netfilter/ipvs/ip_vs_proto_udp.c:207:6: got restricted __be32 [usertype] <noident> net/netfilter/ipvs/ip_vs_proto_udp.c:282:6: warning: incorrect type in argument 5 (different base types) net/netfilter/ipvs/ip_vs_proto_udp.c:282:6: expected restricted __be16 [usertype] oldlen net/netfilter/ipvs/ip_vs_proto_udp.c:282:6: got restricted __be32 [usertype] <noident> net/netfilter/ipvs/ip_vs_proto_udp.c:283:6: warning: incorrect type in argument 6 (different base types) net/netfilter/ipvs/ip_vs_proto_udp.c:283:6: expected restricted __be16 [usertype] newlen net/netfilter/ipvs/ip_vs_proto_udp.c:283:6: got restricted __be32 [usertype] <noident> Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-06 23:09:56 -08:00
Alexey Dobriyan	6d9f239a1e	net: '&' redux I want to compile out proc_* and sysctl_* handlers totally and stub them to NULL depending on config options, however usage of & will prevent this, since taking adress of NULL pointer will break compilation. So, drop & in front of every ->proc_handler and every ->strategy handler, it was never needed in fact. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-03 18:21:05 -08:00
Julius Volz	48148938b4	IPVS: Remove supports_ipv6 scheduler flag Remove the 'supports_ipv6' scheduler flag since all schedulers now support IPv6. Signed-off-by: Julius Volz <julius.volz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-03 17:08:56 -08:00
Julius Volz	445483758e	IPVS: Add IPv6 support to LBLC/LBLCR schedulers Add IPv6 support to LBLC and LBLCR schedulers. These were the last schedulers without IPv6 support, but we might want to keep the supports_ipv6 flag in the case of future schedulers without IPv6 support. Signed-off-by: Julius Volz <julius.volz@gmail.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-03 17:08:28 -08:00
Julius Volz	20971a0afb	IPVS: Add IPv6 support to SH and DH schedulers Add IPv6 support to SH and DH schedulers. I hope this simple IPv6 address hashing is good enough. The 128 bit are just XORed into 32 before hashing them like an IPv4 address. Signed-off-by: Julius Volz <julius.volz@gmail.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-02 23:52:01 -08:00
Harvey Harrison	14d5e834f6	net: replace NIPQUAD() in net/netfilter/ Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u can be replaced with %pI4 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-31 00:54:29 -07:00
Harvey Harrison	5b095d9892	net: replace %p6 with %pI6 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-29 12:52:50 -07:00
Harvey Harrison	38ff4fa49b	netfilter: replace uses of NIP6_FMT with %p6 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-28 16:08:13 -07:00
Julius Volz	0537ae6a3d	ipvs: Update CONFIG_IP_VS_IPV6 description and help text This adds a URL to further info to the CONFIG_IP_VS_IPV6 Kconfig help text. Also, I think it should be ok to remove the "DANGEROUS" label in the description line at this point to get people to try it out and find all the bugs ;) It's still marked as experimental, of course. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-19 23:29:56 -07:00
David S. Miller	f901b64472	ipvs: Add proper dependencies on IP_VS, and fix description header line. Linus noted a build failure case: net/netfilter/ipvs/ip_vs_xmit.c: In function 'ip_vs_tunnel_xmit': net/netfilter/ipvs/ip_vs_xmit.c:616: error: implicit declaration of function 'ip_select_ident' The proper include file (net/ip.h) is being included in ip_vs_xmit.c to get that declaration. So the only possible case where this can happen is if CONFIG_INET is not enabled. This seems to be purely a missing dependency in the ipvs/Kconfig file IP_VS entry. Also, while we're here, remove the out of date "EXPERIMENTAL" string in the IP_VS config help header line. IP_VS no longer depends upon CONFIG_EXPERIMENTAL Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-11 12:18:04 -07:00
Julius Volz	cb7f6a7b71	IPVS: Move IPVS to net/netfilter/ipvs Since IPVS now has partial IPv6 support, this patch moves IPVS from net/ipv4/ipvs to net/netfilter/ipvs. It's a result of: $ git mv net/ipv4/ipvs net/netfilter and adapting the relevant Kconfigs/Makefiles to the new path. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>	2008-10-07 08:38:24 +11:00

50 Commits