Commit Graph

89134 Commits

Author SHA1 Message Date
Pavel Emelyanov
d9ed0f0e2d [VLAN]: Introduce the vlan_net structure and init/exit net ops.
Unlike TUN, it is empty from the very beginning, and will 
be eventually populated later.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 00:49:09 -07:00
Pavel Emelyanov
a9fde26078 [VLAN]: Tag vlan_group_device with net device, not ifindex.
Currently vlan group is searched using one key - the ifindex.
We'll have to lookup the vlan_group by two keys - ifindex and
net. Turning the vlan_group lookup key to struct net_device
pointer will make this process easier.

Besides, this will eliminate one more place in the networking,
that assumes that indexes are unique in the kernel.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 00:48:04 -07:00
Pavel Emelyanov
669f87baab [RTNL]: Introduce the rtnl_kill_links helper.
This one is responsible for calling ->dellink on each net
device found in net to help with vlan net_exit hook in the
nearest future.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 00:46:52 -07:00
Pavel Emelyanov
3a931a80cb [RTNL]: Relax for_each_netdev_safe in __rtnl_link_unregister.
Each potential list_del (happening from inside a ->dellink call)
is followed by goto restart, so there's no need in _safe iteration.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 00:45:56 -07:00
Pavel Emelyanov
fc54c65853 [TUN]: Allow to register tun devices in namespace.
This is basically means that a net is set for a new device, but
actually also involves two more steps:

1. mark the tun device as "local", i.e. do not allow for it to
   move across namespaces.

This is done so, since tun device is most often associated to some
file (and thus to some process) and moving the device alone is not
valid while keeping the file and the process outside. The need in 
ability to move a detached persistent device is to be investigated 
later.

2. get the tun device's net when tun becomes attached and put one
   when it becomes detached.

This is needed to handle the case when a task owning the tun dies,
but a files lives for some more time - in this case we must not
allow for net to be freed, since its exit hook will spoil that file's
private data by unregistering the tun from under tun_chr_close.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 00:41:53 -07:00
Pavel Emelyanov
d647a591da [TUN]: Make the tun_dev_list per-net.
Remove the static tun_dev_list and replace its occurrences in
driver with per-net one.

It is used in two places - in tun_set_iff and tun_cleanup. In 
the first case it's legal to use current net_ns. In the cleanup
call - move the loop, that unregisters all devices in net exit
hook.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 00:41:16 -07:00
Pavel Emelyanov
79d1760491 [TUN]: Introduce the tun_net structure and init/exit net ops.
This is the first step in making tuntap devices work in net 
namespaces. The structure mentioned is pointed by generic
net pointer with tun_net_id id, and tun driver fills one on 
its load. It will contain only the tun devices list.

So declare this structure and introduce net init and exit hooks.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-16 00:40:46 -07:00
Ilpo Järvinen
17515408a1 [TCP]: Remove superflushious skb == write_queue_tail() check
Needed can only be more strict than what was checked by the
earlier common case check for non-tail skbs, thus
cwnd_len <= needed will never match in that case anyway.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 20:36:55 -07:00
Mandeep Singh Baines
b131dd5d65 [ETHTOOL]: Add support for large eeproms
Currently, it is not possible to read/write to an eeprom larger than
128k in size because the buffer used for temporarily storing the
eeprom contents is allocated using kmalloc. kmalloc can only allocate
a maximum of 128k depending on architecture.

Modified ethtool_get/set_eeprom to only allocate a page of memory and
then copy the eeprom a page at a time.

Updated original patch as per suggestions from Joe Perches.

Signed-off-by: Mandeep Singh Baines <msb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 19:29:17 -07:00
Oliver Hartkopp
73e87e02ec CAN: use hrtimers in can-bcm protocol
Make use of hrtimers to support high resolution capabilities, when
provided by the system clocksource.

The conversion to hrtimers additionally discovered and solved an
unlikely race condition that has been reproduced under (unrealistic)
massive receive load, which can only be produced on vcan software devices.

[ Fix printf format warnings on 64-bit -DaveM ]

Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 19:29:14 -07:00
Allan Stephens
85035568a9 [TIPC]: Enhance validation of format on incoming messages
This patch ensures that TIPC properly handles incoming messages
that have incorrect or unexpected formats.  Most significantly,
it now ensures that each sl_buff has at least as much data as
the message header indicates it should, and that the entire
message header is stored contiguously; this prevents TIPC from
accidentally accessing memory that is not part of the sk_buff.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 19:04:54 -07:00
Allan Stephens
fe13dda2d2 [TIPC]: Force linearization of non-linear sk_buffs
This patch allows TIPC to process incoming messages that are
stored in a fragmented sk_buff, by forcing the linearization
of any such messages it receives.

Note: This is an interim solution to allow TIPC to operate with
Ethernet devices that generate non-linear buffers (such as the
gianfar driver), until such time as the rest of TIPC is enhanced
to handle sk_buffs with multiple data areas.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 19:03:23 -07:00
Allan Stephens
bdc82bee43 [TIPC]: Use fast buffer cloning to improve performance
This patch causes TIPC to allocate fast clonable sk_buffs,
rather than standard ones.  This speeds up the cloning
operation done by the link code each time a message is sent
off-node.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 19:02:30 -07:00
Allan Stephens
11ecede787 [TIPC]: Remove redundant NULL check when discarding buffers
This patch eliminates a null pointer check when discarding a
TIPC message buffer, since kfree_skb() already handles this
situation.

Acknowledgements to Florian Westphal (fw@strlen.de> for
suggesting this enhancement.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 19:01:43 -07:00
Pavel Emelyanov
dec827d174 [NETNS]: The generic per-net pointers.
Add the elastic array of void * pointer to the struct net.
The access rules are simple:

 1. register the ops with register_pernet_gen_device to get
    the id of your private pointer
 2. call net_assign_generic() to put the private data on the
    struct net (most preferably this should be done in the
    ->init callback of the ops registered)
 3. do not store any private reference on the net_generic array;
 4. do not change this pointer while the net is alive;
 5. use the net_generic() to get the pointer.

When adding a new pointer, I copy the old array, replace it
with a new one and schedule the old for kfree after an RCU
grace period.

Since the net_generic explores the net->gen array inside rcu
read section and once set the net->gen->ptr[x] pointer never 
changes, this grants us a safe access to generic pointers.

Quoting Paul: "... RCU is protecting -only- the net_generic 
structure that net_generic() is traversing, and the [pointer]
returned by net_generic() is protected by a reference counter 
in the upper-level struct net."

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:36:08 -07:00
Pavel Emelyanov
c93cf61fd1 [NETNS]: The net-subsys IDs generator.
To make some per-net generic pointers, we need some way to address
them, i.e. - IDs. This is simple IDA-based IDs generator for pernet
subsystems.

Addressing questions about potential checkpoint/restart problems: 
these IDs are "lite-offsets" within the net structure and are by no 
means supposed to be exported to the userspace.

Since it will be used in the nearest future by devices only (tun,
vlan, tunnels, bridge, etc), I make it resemble the functionality
of register_pernet_device().

The new ids is stored in the *id pointer _before_ calling the init
callback to make this id available in this callback.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:35:23 -07:00
Adrian Bunk
31efdf0530 [ISDN] include/linux/isdn.h: remove dead code
This patch remove the usage of a nonexisting kconfig variable.

Reported-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:30:16 -07:00
Adrian Bunk
7ef3abd210 [IRDA]: Remove irlan_eth_send_gratuitous_arp()
Even kernel 2.2.26 (sic) already contains the
  #undef CONFIG_IRLAN_SEND_GRATUITOUS_ARP
with the comment "but for some reason the machine crashes if you use DHCP".

Either someone finally looks into this or it's simply time to remove 
this dead code.

Reported-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:29:24 -07:00
Adrian Bunk
99971e70fd [WANPIPE]: Forgotten bits of Sangoma drivers removal.
Robert P. J. Day spotted that my removal of the Sangoma drivers missed
a few bits.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:27:58 -07:00
Allan Stephens
0c3141e910 [TIPC]: Overhaul of socket locking logic
This patch modifies TIPC's socket code to follow the same approach
used by other protocols.  This change eliminates the need for a
mutex in the TIPC-specific portion of the socket protocol data
structure -- in its place, the standard Linux socket backlog queue
and associated locking routines are utilized.  These changes fix
a long-standing receive queue bug on SMP systems, and also enable
individual read and write threads to utilize a socket without
unnecessarily interfering with each other.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:22:02 -07:00
Allan Stephens
b89741a0cc [TIPC]: Cosmetic changes to TIPC connect() code
This patch fixes TIPC's connect routine to conform to Linux
kernel style norms of indentation, line length, etc.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:20:37 -07:00
Allan Stephens
4934c69a38 [TIPC]: Add error check to detect non-blocking form of connect()
This patch causes TIPC to return an error indication if the non-
blocking form of connect() is requested (which TIPC does not yet
support).

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:16:19 -07:00
Allan Stephens
1819b83718 [TIPC]: Correct "off by 1" error in socket queue limit enforcement
This patch fixes a bug that allowed TIPC to queue 1 more message
than allowed by the socket receive queue threshold limits.  The
patch also improves the threshold code's logic and naming to help
prevent this sort of error from recurring in the future.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:15:50 -07:00
Allan Stephens
7a8036c2b9 [TIPC]: Ignore message padding when receiving stream data
This patch ensures that padding bytes appearing at the end of
an incoming TIPC message are not returned as valid stream data.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:15:15 -07:00
Allan Stephens
a198d3a200 [TIPC]: Allow stream receive to read from multiple TIPC messages
This patch allows a stream socket to receive data from multiple
TIPC messages in its receive queue, without requiring the use of
the MSG_WAITALL flag.

Acknowledgements to Florian Westphal <fw-tipc@strlen.de> for
identifying this issue and suggesting how to correct it.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:07:15 -07:00
Allan Stephens
990098068f [TIPC]: Skip connection flow control in connectionless sockets
This patch optimizes the receive path for SOCK_DGRAM and SOCK_RDM
messages by skipping over code that handles connection-based flow
control.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-15 00:06:12 -07:00
Denis V. Lunev
2c8dd11636 [XFRM]: Compilation warnings in xfrm_user.c.
When CONFIG_SECURITY_NETWORK_XFRM is undefined the following warnings appears:
net/xfrm/xfrm_user.c: In function 'xfrm_add_pol_expire':
net/xfrm/xfrm_user.c:1576: warning: 'ctx' may be used uninitialized in this function
net/xfrm/xfrm_user.c: In function 'xfrm_get_policy':
net/xfrm/xfrm_user.c:1340: warning: 'ctx' may be used uninitialized in this function
(security_xfrm_policy_alloc is noop for the case).

It seems that they are result of the commit
03e1ad7b5d ("LSM: Make the Labeled IPsec
hooks more stack friendly")

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 14:47:48 -07:00
YOSHIFUJI Hideaki
569508c964 [TCP]: Format addresses appropriately in debug messages.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 04:09:36 -07:00
YOSHIFUJI Hideaki
a7d632b6b4 [IPV4]: Use NIPQUAD_FMT to format ipv4 addresses.
And use %u to format port.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 04:09:00 -07:00
David S. Miller
334f8b2afd Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6.26 2008-04-14 03:50:43 -07:00
Pavel Emelyanov
7477fd2e6b [SOCK]: Add some notes about per-bind-bucket sock lookup.
I was asked about "why don't we perform a sk_net filtering in
bind_conflict calls, like we do in other sock lookup places"
for a couple of times.

Can we please add a comment about why we do not need one?

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 02:42:27 -07:00
Pavel Emelyanov
13f51d82ac [DCCP]: Fix comment about control sockets.
These sockets now have a bit other names and are no longer global.

Shame on me, I haven't provided a good comment for this when
sending DCCP netnsization patches.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-14 02:38:45 -07:00
David S. Miller
df39e8ba56 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/ehea/ehea_main.c
	drivers/net/wireless/iwlwifi/Kconfig
	drivers/net/wireless/rt2x00/rt61pci.c
	net/ipv4/inet_timewait_sock.c
	net/ipv6/raw.c
	net/mac80211/ieee80211_sta.c
2008-04-14 02:30:23 -07:00
Patrick McHardy
ef1a5a50bb [NETFILTER]: nf_conntrack: fix incorrect check for expectations
The expectation classes changed help->expectations to an array,
fix use as scalar value.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:21:01 +02:00
Peter Warasin
e7bfd0a1a6 [NETFILTER]: bridge: add ebt_nflog watcher
This patch adds the ebtables nflog watcher to the kernel in order to
allow ebtables log through the nfnetlink_log backend.

Signed-off-by: Peter Warasin <peter@endian.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:54 +02:00
Jan Engelhardt
3c9fba656a [NETFILTER]: nf_conntrack: replace NF_CT_DUMP_TUPLE macro indrection by function call
Directly call IPv4 and IPv6 variants where the address family is
easily known.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:54 +02:00
Jan Engelhardt
12c33aa20e [NETFILTER]: nf_conntrack: const annotations in nf_conntrack_sctp, nf_nat_proto_gre
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:54 +02:00
Jan Engelhardt
f2ea825f48 [NETFILTER]: nf_nat: use bool type in nf_nat_proto
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:53 +02:00
Jan Engelhardt
5f2b4c9006 [NETFILTER]: nf_conntrack: use bool type in struct nf_conntrack_tuple.h
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:53 +02:00
Jan Engelhardt
09f263cd39 [NETFILTER]: nf_conntrack: use bool type in struct nf_conntrack_l4proto
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:53 +02:00
Jan Engelhardt
8ce8439a31 [NETFILTER]: nf_conntrack: use bool type in struct nf_conntrack_l3proto
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:52 +02:00
Jan Engelhardt
9dbae79178 [NETFILTER]: Remove unused callbacks in nf_conntrack_l3proto
These functions are never called.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:52 +02:00
Patrick McHardy
5e8fbe2ac8 [NETFILTER]: nf_conntrack: add tuplehash l3num/protonum accessors
Add accessors for l3num and protonum and get rid of some overly long
expressions.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:52 +02:00
Patrick McHardy
5f7da4d26d [NETFILTER]: nf_conntrack_tcp: catch invalid state updates over ctnetlink
Invalid states can cause out-of-bound memory accesses of the state table.
Also don't insist on having a new state contained in the netlink message.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:52 +02:00
Patrick McHardy
dd13b01036 [NETFILTER]: nf_nat: kill helper and seq_adjust hooks
Connection tracking helpers (specifically FTP) need to be called
before NAT sequence numbers adjustments are performed to be able
to compare them against previously seen ones. We've introduced
two new hooks around 2.6.11 to maintain this ordering when NAT
modules were changed to get called from conntrack helpers directly.

The cost of netfilter hooks is quite high and sequence number
adjustments are only rarely needed however. Add a RCU-protected
sequence number adjustment function pointer and call it from
IPv4 conntrack after calling the helper.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:52 +02:00
Patrick McHardy
55871d0479 [NETFILTER]: nf_conntrack_extend: warn on confirmed conntracks
New extensions may only be added to unconfirmed conntracks to avoid races
when reallocating the storage.

Also change NF_CT_ASSERT to use WARN_ON to get backtraces.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:51 +02:00
Patrick McHardy
8c87238b72 [NETFILTER]: nf_nat: don't add NAT extension for confirmed conntracks
Adding extensions to confirmed conntracks is not allowed to avoid races
on reallocation. Don't setup NAT for confirmed conntracks in case NAT
module is loaded late.

The has one side-effect, the connections existing before the NAT module
was loaded won't enter the bysource hash. The only case where this actually
makes a difference is in case of SNAT to a multirange where the IP before
NAT is also part of the range. Since old connections don't enter the
bysource hash the first new connection from the IP will have a new address
selected. This shouldn't matter at all.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:51 +02:00
Patrick McHardy
42cf800c24 [NETFILTER]: nf_nat: remove obsolete check for ICMP redirects
Locally generated ICMP packets have a reference to the conntrack entry
of the original packet manually attached by icmp_send(). Therefore the
check for locally originated untracked ICMP redirects can never be
true.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:50 +02:00
Patrick McHardy
9d908a69a3 [NETFILTER]: nf_nat: add SCTP protocol support
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:50 +02:00
Patrick McHardy
4910a08799 [NETFILTER]: nf_nat: add DCCP protocol support
Signed-off-by: Patrick McHardy <kaber@trash.net>
2008-04-14 11:15:50 +02:00