linux

Author	SHA1	Message	Date
Michal Miroslaw	aace57e054	[NETFILTER]: nfnetlink_log: fix instance_create() failure path Fix memory leak on instance_create() while module is being unloaded. Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:39 -07:00
Michal Miroslaw	c6a8f64836	[NETFILTER]: nfnetlink_log: fix style Fix function definition style to match other functions in nfnetlink_log.c. Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:39 -07:00
Michal Miroslaw	d63b043d95	[NETFILTER]: nfnetlink_log: flush queue early If queue is filled to its threshold, then flush it right away instead of waiting for timer or next packet. Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:38 -07:00
Michal Miroslaw	e35670614d	[NETFILTER]: nfnetlink_log: kill duplicate code Kill some cut'n'paste effect. Just after __nfulnl_send() returning, inst->skb is always NULL. Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:38 -07:00
Pablo Neira Ayuso	5faa1f4cb5	[NETFILTER]: nf_conntrack_netlink: add support to related connections This patch adds support to relate a connection to an existing master connection. This patch is used by conntrackd to correctly replicate related connections. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:37 -07:00
Patrick McHardy	3583240249	[NETFILTER]: nf_conntrack_expect: kill unique ID Similar to the conntrack ID, the per-expectation ID is not needed anymore, kill it. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:36 -07:00
Patrick McHardy	7f85f91472	[NETFILTER]: nf_conntrack: kill unique ID Remove the per-conntrack ID, its not necessary anymore for dumping. For compatiblity reasons we send the address of the conntrack to userspace as ID. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:36 -07:00
Patrick McHardy	f73e924cdd	[NETFILTER]: ctnetlink: use netlink policy Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:35 -07:00
Patrick McHardy	5bf7585393	[NETFILTER]: nfnetlink_queue: use netlink policy Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:34 -07:00
Patrick McHardy	fd8281adac	[NETFILTER]: nfnetlink_log: use netlink policy Also remove unused nfula_min array. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:34 -07:00
Patrick McHardy	e373057828	[NETFILTER]: nfnetlink: support attribute policies Add support for automatic checking of per-callback attribute policies. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:33 -07:00
Patrick McHardy	dd82185f2c	[NETFILTER]: nfnetlink: use nlmsg_notify() Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:32 -07:00
Patrick McHardy	fdf708322d	[NETFILTER]: nfnetlink: rename functions containing 'nfattr' There is no struct nfattr anymore, rename functions to 'nlattr'. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:32 -07:00
Patrick McHardy	df6fb868d6	[NETFILTER]: nfnetlink: convert to generic netlink attribute functions Get rid of the duplicated rtnetlink macros and use the generic netlink attribute functions. The old duplicated stuff is moved to a new header file that exists just for userspace. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:31 -07:00
Patrick McHardy	7c8d4cb419	[NETFILTER]: nfnetlink: make subsystem and callbacks const Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:30 -07:00
Ivo van Doorn	fe242cfd33	[RFKILL]: Move rfkill_switch_all out of global header rfkill_switch_all shouldn't be called by drivers directly, instead they should send a signal over the input device. To prevent confusion for driver developers, move the function into a rfkill private header. Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:29 -07:00
Michael Buesch	30ccb08847	[PATCH] mac80211: bss_tim_clear must use ~ instead of ! We need to use bitwise NOT. This also cleans up the code a little bit to make it more readable. Signed-off-by: Michael Buesch <mb@bu3sch.de> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-10 16:53:18 -07:00
Johannes Berg	b4010e0890	[PATCH] mac80211: remove generic IE for AP interfaces This is not useful since we do not support probe response offload to hardware at this time and beacons are set in another way. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-10 16:53:17 -07:00
Johannes Berg	51617f0b76	[PATCH] mac80211: remove all prism2 ioctls This patch removes all prism2 ioctls. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-10 16:53:17 -07:00
Johannes Berg	53918994b7	[PATCH] mac80211: fix iff_promiscs, iff_allmultis race When we update the counters iff_promiscs and iff_allmultis in struct ieee80211_local we have no common lock held to protect them. The problem is that the update to each counter may not be atomic, so we could end up with iff_promiscs == -1 in unfortunate conditions. To fix it, use atomic_t values. It doesn't matter whether the two counters are updated together atomically or not, if there are two invocations of set_multicast_list we will end up with multiple configure_filter() invocations of which the latter will always be correct. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-10 16:53:16 -07:00
Johannes Berg	50741ae05a	[PATCH] mac80211: fix TKIP IV update The TKIP IV should be updated only after MMIC verification, this patch changes it to be at that spot. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-10 16:53:16 -07:00
Johannes Berg	fb1c1cd6c5	[PATCH] mac80211: fix vlan bug VLAN interfaces have yet another bug: they aren't accounted for properly in the receive path in prepare_for_handlers(). I noticed this by code inspection, but it would be easy for the compiler to catch such things if we'd just use the proper enum where appropriate. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-10 16:53:15 -07:00
Johannes Berg	af1a90da39	[PATCH] mac80211: remove ieee80211_wep_get_keyidx This function is not used any more. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-10 16:53:14 -07:00
Johannes Berg	6a22a59d48	[PATCH] mac80211: consolidate encryption Currently we run through all crypto handlers for each transmitted frame although we already know which one will be used. This changes the code to invoke only the needed handler. It also moves the wep code into wep.c. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-10 16:53:14 -07:00
Johannes Berg	4f0d18e26f	[PATCH] mac80211: consolidate decryption Currently, we run through all three crypto algorithms for each received frame even though we have previously determined which key we have and as such already know which algorithm will be used. Change it to invoke only the needed function. Also move the WEP decrypt handler to wep.c so that fewer functions need to be non-static. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-10 16:53:13 -07:00
Johannes Berg	b2e7771e55	[PATCH] mac80211: pass frames to monitor interfaces early This makes mac80211 pass all frames to monitor interfaces early before all receive processing with the benefit that only a single copy needs to be made, all monitors can receive clones of the skb and if the frame will be discarded we don't even need to make a single copy. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-10 16:53:12 -07:00
Johannes Berg	5b2812e925	[PATCH] mac80211: fix interface initialisation and deinitialisation When an interface is registered it is still uninitialised so ieee80211_if_reinit() can't be called on it (it will oops.) Hence, we need to move the uninit method assignment. Also, this patch fixes the bug that the master device is never initialised nor deinitialised at all. Oddly, the deinit code had an if statement to not run some code when running for the master interface (which never happened), but that if statement is also wrong. Fix that too. Now that the uninit code is run for the master device, another bug surfaced: it tries to remove all dependent interfaces and that oopses or BUGs at some point, either because it unregisters already unregistered interfaces (missing list_del bug) or due to trying to iterate a list that has had other things removed. Fix this too by handling the master interface specially. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-10 16:53:11 -07:00
Herbert Xu	b421995235	[PKT_SCHED]: Add stateless NAT Stateless NAT is useful in controlled environments where restrictions are placed on through traffic such that we don't need connection tracking to correctly NAT protocol-specific data. In particular, this is of interest when the number of flows or the number of addresses being NATed is large, or if connection tracking information has to be replicated and where it is not practical to do so. Previously we had stateless NAT functionality which was integrated into the IPv4 routing subsystem. This was a great solution as long as the NAT worked on a subnet to subnet basis such that the number of NAT rules was relatively small. The reason is that for SNAT the routing based system had to perform a linear scan through the rules. If the number of rules is large then major renovations would have take place in the routing subsystem to make this practical. For the time being, the least intrusive way of achieving this is to use the u32 classifier written by Alexey Kuznetsov along with the actions infrastructure implemented by Jamal Hadi Salim. The following patch is an attempt at this problem by creating a new nat action that can be invoked from u32 hash tables which would allow large number of stateless NAT rules that can be used/updated in constant time. The actual NAT code is mostly based on the previous stateless NAT code written by Alexey. In future we might be able to utilise the protocol NAT code from netfilter to improve support for other protocols. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:53:11 -07:00
Johannes Berg	79010420cc	[PATCH] mac80211: fix virtual interface locking Florian Lohoff noticed a bug in mac80211: when bringing the master interface down while other virtual interfaces are up we call dev_close() under a spinlock which is not allowed. This patch removes the sub_if_lock used by mac80211 in favour of using an RCU list. All list manipulations are already done under rtnl so are well protected against each other, and the read-side locks we took in the RX and TX code are already in RCU read-side critical sections. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: Florian Lohoff <flo@rfc822.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Michal Piotrowski <michal.k.k.piotrowski@gmail.com> Cc: Satyam Sharma <satyam@infradead.org> Signed-off-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-10 16:53:00 -07:00
Johannes Berg	ea49c359f3	[PATCH] mac80211: remove crypto algorithm typedef The typedef is not required, we can just use "enum ieee80211_key_alg" instead of "ieee80211_key_alg" Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-10 16:53:00 -07:00
Johannes Berg	0ec3ca4459	[PATCH] mac80211: validate VLAN interfaces better This patch changes mac80211 to verify that VLAN interfaces are valid and not bother drivers about them any more. VLAN interfaces are now only valid when an AP interface is up with the same MAC address, and are automatically turned off when the AP interface is set down. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: Jouni Malinen <j@w1.fi> Signed-off-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-10 16:52:57 -07:00
Johannes Berg	4150c57212	[PATCH] mac80211: revamp interface and filter configuration Drivers are currently supposed to keep track of monitor interfaces if they allow so-called "hard" monitor, and they are also supposed to keep track of multicast etc. This patch changes that, replaces the set_multicast_list() callback with a new configure_filter() callback that takes filter flags (FIF_) instead of interface flags (IFF_). For a driver, this means it should open the filter as much as necessary to get all frames requested by the filter flags. Accordingly, the filter flags are named "positively", e.g. FIF_ALLMULTI. Multicast filtering is a bit special in that drivers that have no multicast address filters need to allow multicast frames through when either the FIF_ALLMULTI flag is set or when the mc_count value is positive. At the same time, drivers are no longer notified about monitor interfaces at all, this means they now need to implement the start() and stop() callbacks and the new change_filter_flags() callback. Also, the start()/stop() ordering changed, start() is now called before any add_interface() as it really should be, and stop() after any remove_interface(). The patch also changes the behaviour of setting the bssid to multicast for scanning when IEEE80211_HW_NO_PROBE_FILTERING is set; the IEEE80211_HW_NO_PROBE_FILTERING flag is removed and the filter flag FIF_BCN_PRBRESP_PROMISC introduced. This is a lot more efficient for hardware like b43 that supports it and other hardware can still set the BSSID to all-ones. Driver modifications by Johannes Berg (b43 & iwlwifi), Michael Wu (rtl8187, adm8211, and p54), Larry Finger (b43legacy), and Ivo van Doorn (rt2x00). Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-10-10 16:52:57 -07:00
Eric W. Biederman	f4618d39a3	[NETNS]: Simplify the network namespace list locking rules. Denis V. Lunev <den@sw.ru> noticed that the locking rules for the network namespace list are over complicated and broken. In particular the current register_netdev_notifier currently does not take any lock making the for_each_net iteration racy with network namespace creation and destruction. Oops. The fact that we need to use for_each_net in rtnl_unlock() when the rtnetlink support becomes per network namespace makes designing the proper locking tricky. In addition we need to be able to call rtnl_lock() and rtnl_unlock() when we have the net_mutex held. After thinking about it and looking at the alternatives carefully it looks like the simplest and most maintainable solution is to remove net_list_mutex altogether, and to use the rtnl_mutex instead. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:55 -07:00
Andrew Morton	58c14a8fe6	[ATM] net/atm/lec.c: printk warning fix net/atm/lec.c: In function 'lec_start_xmit': net/atm/lec.c:371: warning: format '%x' expects type 'unsigned int', but argument 4 has type 'long unsigned int' Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:54 -07:00
Andrew Morton	0aa4f3331b	[WIRELESS]: Fix Kconfig. Seems that a bare "depends" is no longer allowed in Sam's kbuild tree. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:52 -07:00
Stephen Hemminger	3b04ddde02	[NET]: Move hardware header operations out of netdevice. Since hardware header operations are part of the protocol class not the device instance, make them into a separate object and save memory. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:52 -07:00
Stephen Hemminger	b95cce3576	[NET]: Wrap hard_header_parse Wrap the hard_header_parse function to simplify next step of header_ops conversion. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:51 -07:00
Stephen Hemminger	0c4e85813d	[NET]: Wrap netdevice hardware header creation. Add inline for common usage of hardware header creation, and fix bug in IPV6 mcast where the assumption about negative return is an errno. Negative return from hard_header means not enough space was available,(ie -N bytes). Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:50 -07:00
Eric W. Biederman	2774c7aba6	[NET]: Make the loopback device per network namespace. This patch makes loopback_dev per network namespace. Adding code to create a different loopback device for each network namespace and adding the code to free a loopback device when a network namespace exits. This patch modifies all users the loopback_dev so they access it as init_net.loopback_dev, keeping all of the code compiling and working. A later pass will be needed to update the users to use something other than the initial network namespace. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:49 -07:00
Eric W. Biederman	0cc217e16c	[IPV4]: When possible test for IFF_LOOPBACK and not dev == loopback_dev Now that multiple loopback devices are becoming possible it makes the code a little cleaner and more maintainable to test if a deivice is th a loopback device by testing dev->flags & IFF_LOOPBACK instead of dev == loopback_dev. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:48 -07:00
Eric W. Biederman	5967789dbc	[IPV4]: Remove unnecessary test for the loopback device from inetdev_destroy Currently we never call unregister_netdev for the loopback device so it is impossible for us to reach inetdev_destroy with the loopback device. So the test in inetdev_destroy is unnecessary. Further when testing with my network namespace patches removing unregistering the loopback device and calling inetdev_destroy works fine so there appears to be no reason for avoiding unregistering the loopback device. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:48 -07:00
Eric W. Biederman	9dd776b6d7	[NET]: Add network namespace clone & unshare support. This patch allows you to create a new network namespace using sys_clone, or sys_unshare. As the network namespace is still experimental and under development clone and unshare support is only made available when CONFIG_NET_NS is selected at compile time. As this patch introduces network namespace support into code paths that exist when the CONFIG_NET is not selected there are a few additions made to net_namespace.h to allow a few more functions to be used when the networking stack is not compiled in. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:46 -07:00
Eric W. Biederman	8b41d1887d	[NET]: Fix running without sysfs When sysfs support is compiled out the kernel still keeps and maintains the kobject tree. So it is not safe to skip our kobject reference counting or to avoid becoming members of the kobject tree. It is safe to not add the networking specific sysfs attributes. This patch removes the sysfs special cases from net/core/dev.c renames functions from netdev_sysfs_xxxx to netdev_kobject_xxxx and always compiles in net-sysfs.c net-sysfs.c is modified with a CONFIG_SYSFS guard around the parts that are actually sysfs specific. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:46 -07:00
Arnaldo Carvalho de Melo	bb293e6a24	[CCID3]: Remove ifdef surrounding BUG_ON As suggested by DaveM. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:45 -07:00
Gerrit Renker	cecd8d0ec4	[DCCP]: Reduce the number of writable states Since DCCP requires to close both ends of a connection simultaneously, permission to write in state DCCP_CLOSING is removed in dccp_sendmsg(): * if the sending end closed, it would encounter a write error anyhow; * if the other end has closed the connection, it accepts no more data. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:45 -07:00
Gerrit Renker	e356d37a09	[DCCP]: Factor out common code for generating Resets This factors code common to dccp_v{4,6}_ctl_send_reset into a separate function, and adds support for filling in the Data 1 ... Data 3 fields from RFC 4340, 5.6. It is useful to have this separate, since the following Reset codes will always be generated from the control socket rather than via dccp_send_reset: * Code 3, "No Connection", cf. 8.3.1; * Code 4, "Packet Error" (identification for Data 1 added); * Code 5, "Option Error" (identification for Data 1..3 added, will be used later); * Code 6, "Mandatory Error" (same as Option Error); * Code 7, "Connection Refused" (what on Earth is the difference to "No Connection"?); * Code 8, "Bad Service Code"; * Code 9, "Too Busy"; * Code 10, "Bad Init Cookie" (not used). Code 0 is not recommended by the RFC, the following codes would be used in dccp_send_reset() instead, since they all relate to an established DCCP connection: * Code 1, "Closed"; * Code 2, "Aborted"; * Code 11, "Aggression Penalty" (12.3). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:44 -07:00
Gerrit Renker	9bf55cda9b	[DCCP]: Sequence number wrap-around when sending reset This replaces normal addition with mod-48 addition so that sequence number wraparound is respected. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:43 -07:00
Gerrit Renker	a94f0f9705	[DCCP]: Rate-limit DCCP-Syncs This implements a SHOULD from RFC 4340, 7.5.4: "To protect against denial-of-service attacks, DCCP implementations SHOULD impose a rate limit on DCCP-Syncs sent in response to sequence-invalid packets, such as not more than eight DCCP-Syncs per second." The rate-limit is maintained on a per-socket basis. This is a more stringent policy than enforcing the rate-limit on a per-source-address basis and protects against attacks with forged source addresses. Moreover, the mechanism is deliberately kept simple. In contrast to xrlim_allow(), bursts of Sync packets in reply to sequence-invalid packets are not supported. This foils such attacks where the receipt of a Sync triggers further sequence-invalid packets. (I have tested this mechanism against xrlim_allow algorithm for Syncs, permitting bursts just increases the problems.) In order to keep flexibility, the timeout parameter can be set via sysctl; and the whole mechanism can even be disabled (which is however not recommended). The algorithm in this patch has been improved with regard to wrapping issues thanks to a suggestion by Arnaldo. Commiter note: Rate limited the step 6 DCCP_WARN too, as it says we're sending a sync. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:43 -07:00
Gerrit Renker	ee1a15922d	[DCCP]: Remove duplicate code for Reset from connected socket In this patch, duplicated code is removed for the case when a Reset packet is sent from a connected socket. This code duplication is between dccp_make_reset and dccp_transmit_skb, which already contained an (up to now entirely unused) switch statement to fill in the reset code from the DCCP_SKB_CB. The only thing that has been removed is the call to dst_clone(dst), since the queue_xmit functions use sk_dst_cache anyway. I wasn't sure which purpose inet_sk_rebuild_header served, so I left it in. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:42 -07:00
Gerrit Renker	0430ee3451	[DCCP]: Add Support for Data 1 .. 3 fields of Reset packets This adds fields to support the informational Data 1..3 fields of the DCCP-Reset packets (RFC 4340, 5.6), and makes minor cosmetic changes to documentation. Code which fills in these fields follows in subsequent patches, it is primarily used for reporting option-processing and feature-negotiation errors. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:42 -07:00
Gerrit Renker	727ecc5faa	[DCCP]: Add FIXME for send_delayed_ack This adds a FIXME to signal that the function dccp_send_delayed_ack is nowhere used in the entire DCCP/CCID code. Using a delayed Ack timer is suggested in 11.3 of RFC 4340, but it has also rather subtle implications for the Ack-Ratio-accounting. CCID2 does not use this (maybe it should). I think leaving the function in is good, in case someone wants to implement this. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:41 -07:00
Gerrit Renker	2e86908f7d	[CCID3]: Move NULL-protection into function This moves several instances of testing against NULL into the function which is used to de-reference the CCID-private data. Committer note: Made the BUG_ON depend on having CONFIG_IP_DCCP_CCID3_DEBUG, as it is too much to have this on production code. Also made sure that the macro is used only after checking if sk_state is not LISTEN, to make it equivalent to what we had before. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:41 -07:00
Gerrit Renker	08831700cc	[DCCP]: Send Reset upon Sync in state REQUEST This fixes the code to correspond to RFC 4340, 7.5.4, which states the exception that a Sync received in state REQUEST generates a Reset (not a SyncAck). To achieve this, only a small change is required. Since dccp_rcv_request_sent_state_process() already uses the correct Reset Code number 4 ("Packet Error"), we only need to shift the if-statement a few lines further down. (To test this case: replace DCCP_PKT_RESPONSE with DCCP_PKT_SYNC in dccp_make_response.) Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-10-10 16:52:40 -07:00
WANG Cong	53465eb4ab	[BLUETOOTH]: Make hidp_setup_input() return int This patch: - makes hidp_setup_input() return int to indicate errors; - checks its return value to handle errors. And this time it is against -rc7-mm1 tree. Thanks to roel and Marcel Holtmann for comments. Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:39 -07:00
Ilpo Järvinen	912d8f0b1f	[TCP] MIB: Count FRTO's successfully detected spurious RTOs Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:39 -07:00
Ilpo Järvinen	93e6802029	[TCP]: Reordered ACK's (old) SACKs not included to discarded MIB In case of ACK reordering, the SACK block might be valid in it's time but is already obsoleted since we've received another kind of confirmation about arrival of the segments through snd_una advancement of an earlier packet. I didn't bother to build distinguishing of valid and invalid SACK blocks but simply made reordered SACK blocks that are too old always not counted regardless of their "real" validity which could be determined by using the ack field of the reordered packet (won't be significant IMHO). DSACKs can very well be considered useful even in this situation, so won't do any of this for them. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:38 -07:00
Ilpo Järvinen	a6963a6b3d	[TCP]: Re-place highest_sack check to a more robust position I previously added checking to position that is rather poor as state has already been adjusted quite a bit. Re-placing it above all state changes should be more robust though the return should never ever get executed regardless of its place :-). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:38 -07:00
Gerrit Renker	b0d045ca45	[DCCP]: Parameter renaming The parameter `seq' of dccp_send_sync() is in fact an acknowledgement number and not a sequence number - thus renamed by this patch into `ackno'. Secondly, a `critical' warning is added when a Sync/SyncAck could not be sent. Sanity: I have checked all other functions that are called in dccp_transmit_skb, there are no clashes with the use of dccpd_ack_seq; no other function is using this slot at the same time. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:37 -07:00
Gerrit Renker	e155d76922	[DCCP]: Fix Reset/Sync-Flood Bug This updates sequence number checking with regard to RFC 4340, 7.5.4. Missing in the code was an exception for sequence-invalid Reset packets, which get a Sync acknowledging GSR, instead of (as usual) P.seqno. This can lead to an oscillating ping-pong flood of Reset packets. In fact, it has been observed on the wire as follows: 1. client establishes connection to server; 2. before server can write to client, client crashes without notifying the server (NB: now no longer possible due to ABORT function); 3. server sends DCCP-Data packet (has no ackno); 4. client generates Reset "No Connection", seqno=0, increments seqno; 5. server replies with Sync, using ackno = P.seqno; 6. client generates Reset "No Connection" with seqno = ackno + 1; 7. goto (5). The difference is that now in (5) the server uses GSR. This causes the Reset sent by the client in (6) to become sequence-valid, so that in (7) the vicious circle is broken; the Reset is then enqueued and causes the socket to enter TIMEWAIT state. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:37 -07:00
Gerrit Renker	cbe1f5f88a	[DCCP]: Shorten variable names in dccp_check_seqno This patch is in part required by the next patch; it * replaces 6 instances of `DCCP_SKB_CB(skb)->dccpd_seq' with `seqno'; * replaces 7 instances of `DCCP_SKB_CB(skb)->dccpd_ack_seq' with `ackno'; * replaces 1 use of dccp_inc_seqno() by unfolding `ADD48' macro in place. No changes in algorithm, all changes are text replacement/substitution. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:36 -07:00
Gerrit Renker	3393da8241	[DCCP]: Simplify interface of dccp_sample_rtt The third parameter of dccp_sample_rtt now becomes useless and is removed. Also combined the subtraction of the timestamp echo and the elapsed time. This is safe, since (a) presence of timestamp echo is tested first and (b) elapsed time is either present and non-zero or it is not set and equals 0 due to the memset in dccp_parse_options. To avoid measuring option-processing time, the timestamp for measuring the initial Request/Response RTT sample is taken directly when the function is called (the Linux implementation always adds a timestamp on the Request, so there is no loss in doing this). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:35 -07:00
Gerrit Renker	4c70f383e0	[DCCP]: Provide 10s of microsecond timesource This provides a timesource, conveniently used for DCCP timestamps, which returns the elapsed time in 10s of microseconds since initialisation. This makes for a wrap-around time of about 11.9 hours, which should be sufficient for most applications. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:35 -07:00
Gerrit Renker	aa97efd97a	[DCCP]: Reuse ktime_get_real() calls again This patch reduces the number of timestamps taken in the receive path for each packet. The ccid3_hc_tx_update_x() routine is called in * the receive path for each CCID3-controlled packet * for the nofeedback timer (if no feedback arrives during 4 RTT) Currently, when there is no loss, each packet gets timestamped twice. The patch resolves this by recycling the first timestamp taken on packet reception for RTT sampling. When the no_feedback_timer() is called, then the timestamp argument is simply set to NULL - so that ccid3_hc_tx_update_x() takes care of the logic. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:34 -07:00
Michael Wu	e0eb685962	[MAC80211]: rename ieee80211_cfg.h to cfg.h Might as well rename ieee80211_cfg.h to cfg.h to keep things consistent. Signed-off-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:34 -07:00
Johannes Berg	d86ec781ef	[MAC80211]: kill vlan_id Each station has a vlan_id that is useless. Remove it. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:33 -07:00
Johannes Berg	c095df531f	[MAC80211]: kill IE parse typedef The parse result typedef isn't needed. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:33 -07:00
Johannes Berg	fa5fea711f	[MAC80211]: rename ieee80211_cfg.c to cfg.c It's just painful to have the extra ieee80211_ prefix. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:32 -07:00
Johannes Berg	dd1cd4c620	[MAC80211]: print out wiphy name instead of master device This makes mac80211 print out the wiphy name instead of the master device name where appropriate. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:32 -07:00
Johannes Berg	693d454dff	[MAC80211]: fix warnings introduced by the doc patches This fixes a warning about NUM_IEEE80211_MODES missing in a switch statement. Intentionally do not add a default case so we get warnings at these places if we need to add new modes. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:30 -07:00
Johannes Berg	011bfcc4f3	[MAC80211]: remove key threshold stuff This patch removes the key threshold stuff from mac80211. I have patches for later that add it as a per-key setting to nl/cfg80211. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:29 -07:00
Johannes Berg	72abd81b98	[MAC80211]: allow drivers to indicate failed FCS/PLCP checksum This patch allows drivers to indicate bad FCS/PLCP CRC to the stack and have the stack drop packets like that except for monitor interfaces. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:28 -07:00
Michael Buesch	61609bc0e4	[MAC80211]: Add support for setting TX power and radio status This adds support for disabling the radio and setting the TXpower through wext. This also fixes the prism TXpower ioctl (It always overwrote the TXpower value in ieee80211_hw_config()) Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:23 -07:00
Johannes Berg	501d857ec9	[IEEE80211]: Fix softmac lockdep reports. It seems I was actually able to hit this deadlock, on my quad G5 softmac locks up more often than not. This fixes it by using an own workqueue that can safely be flushed under RTNL. Not sure if the patch is correct with the workqueue naming. And don't think with the patch it doesn't continually lock up. It still does, just doesn't invoke lockdep warnings all the time. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:22 -07:00
Jamal Hadi Salim	8236632fb3	[NET_SCHED]: explict hold dev tx lock For N cpus, with full throttle traffic on all N CPUs, funneling traffic to the same ethernet device, the devices queue lock is contended by all N CPUs constantly. The TX lock is only contended by a max of 2 CPUS. In the current mode of operation, after all the work of entering the dequeue region, we may endup aborting the path if we are unable to get the tx lock and go back to contend for the queue lock. As N goes up, this gets worse. The changes in this patch result in a small increase in performance with a 4CPU (2xdual-core) with no irq binding. Both e1000 and tg3 showed similar behavior; Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:15 -07:00
Daniel Lezcano	de3cb747ff	[NET]: Dynamically allocate the loopback device, part 1. This patch replaces all occurences to the static variable loopback_dev to a pointer loopback_dev. That provides the mindless, trivial, uninteressting change part for the dynamic allocation for the loopback. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Acked-By: Kirill Korotaev <dev@sw.ru> Acked-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:14 -07:00
Johannes Berg	5568296573	[NL80211]: add netlink interface to cfg80211 Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:14 -07:00
Ilpo Järvinen	b76892051c	[TCP]: Avoid clearing sacktag hint in trivial situations There's no reason to clear the sacktag skb hint when small part of the rexmit queue changes. Account changes (if any) instead when fragmenting/collapsing. RTO/FRTO do not touch SACKED_ACKED bits so no need to discard SACK tag hint at all. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:12 -07:00
Ilpo Järvinen	c96fd3d461	[TCP]: Enable SACK enhanced FRTO (RFC4138) by default Most of the description that follows comes from my mail to netdev (some editing done): Main obstacle to FRTO use is its deployment as it has to be on the sender side where as wireless link is often the receiver's access link. Take initiative on behalf of unlucky receivers and enable it by default in future Linux TCP senders. Also IETF seems to interested in advancing FRTO from experimental [1]. How does FRTO help? =================== FRTO detects spurious RTOs and avoids a number of unnecessary retransmissions and a couple of other problems that can arise due to incorrect guess made at RTO (i.e., that segments were lost when they actually got delayed which is likely to occur e.g. in wireless environments with link-layer retransmission). Though FRTO cannot prevent the first (potentially unnecessary) retransmission at RTO, I suspect that it won't cost that much even if you have to pay for each bit (won't be that high percentage out of all packets after all :-)). However, usually when you have a spurious RTO, not only the first segment unnecessarily retransmitted but the whole window. It goes like this: all cumulative ACKs got delayed due to in-order delivery, then TCP will actually send 1.5*original cwnd worth of data in the RTO's slow-start when the delayed ACKs arrive (basically the original cwnd worth of it unnecessarily). In case one is interested in minimizing unnecessary retransmissions e.g. due to cost, those rexmissions must never see daylight. Besides, in the worst case the generated burst overloads the bottleneck buffers which is likely to significantly delay the further progress of the flow. In case of ll rexmissions, ACK compression often occurs at the same time making the burst very "sharp edged" (in that case TCP often loses most of the segments above high_seq => very bad performance too). When FRTO is enabled, those unnecessary retransmissions are fully avoided except for the first segment and the cwnd behavior after detected spurious RTO is determined by the response (one can tune that by sysctl). Basic version (non-SACK enhanced one), FRTO can fail to detect spurious RTO as spurious and falls back to conservative behavior. ACK lossage is much less significant than reordering, usually the FRTO can detect spurious RTO if at least 2 cumulative ACKs from original window are preserved (excluding the ACK that advances to high_seq). With SACK-enhanced version, the detection is quite robust. FRTO should remove the need to set a high lower bound for the RTO estimator due to delay spikes that occur relatively common in some environments (esp. in wireless/cellular ones). [1] http://www1.ietf.org/mail-archive/web/tcpm/current/msg02862.html Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:12 -07:00
Ilpo Järvinen	009a2e3e4e	[TCP] FRTO: Improve interoperability with other undo_marker users Basically this change enables it, previously other undo_marker users were left with nothing. Reverse undo_marker logic completely to get it set right in CA_Loss. On the other hand, when spurious RTO is detected, clear it. Clearing might be too heavy for some scenarios but seems safe enough starting point for now and shouldn't have much effect except in majority of cases (if in any). By adding a new FLAG_ we avoid looping through write_queue when RTO occurs. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:11 -07:00
Ilpo Järvinen	7c46a03e67	[TCP]: Cleanup tcp_tso_acked and tcp_clean_rtx_queue Implements following cleanups: - Comment re-placement (CodingStyle) - tcp_tso_acked() local (wrapper-like) variable removal (readability) - __-types removed (IMHO they make local variables jumpy looking and just was space) - acked -> flag (naming conventions elsewhere in TCP code) - linebreak adjustments (readability) - nested if()s combined (reduced indentation) - clarifying newlines added Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:10 -07:00
Ilpo Järvinen	13fcf850cc	[TCP]: Move accounting from tso_acked to clean_rtx_queue The accounting code is pretty much the same, so it's a shame we do it in two places. I'm not too sure if added fully_acked check in MTU probing is really what we want perhaps the added end_seq could be used in the after() comparison. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:09 -07:00
Ilpo Järvinen	5af4ec236f	[TCP]: clear_all_retrans_hints prefixed by tcp_ In addition, fix its function comment spacing. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>	2007-10-10 16:52:09 -07:00
Ilpo Järvinen	91fed7a15c	[TCP]: Make fackets_out accurate Substraction for fackets_out is unconditional when snd_una advances, thus there's no need to do it inside the loop. Just make sure correct bounds are honored. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:08 -07:00
Ilpo Järvinen	0dde7b5404	[TCP]: Maintain highest_sack accurately to the highest skb In general, it should not be necessary to call tcp_fragment for already SACKed skbs, but it's better to be safe than sorry. And indeed, it can be called from sacktag when a DSACK arrives or some ACK (with SACK) reordering occurs (sacktag could be made to avoid the call in the latter case though I'm not sure if it's worth of the trouble and added complexity to cover such marginal case). The collapse case has return for SACKED_ACKED case earlier, so just WARN_ON if internal inconsistency is detected for some reason. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:52:08 -07:00
Joe Perches	0795af5729	[NET]: Introduce and use print_mac() and DECLARE_MAC_BUF() This is nicer than the MAC_FMT stuff. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:42 -07:00
Rick Jones	5ee3afba88	[TCP]: Return useful listenq info in tcp_info and INET_DIAG_INFO. Return some useful information such as the maximum listen backlog and the current listen backlog in the tcp_info structure and INET_DIAG_INFO. Signed-off-by: Rick Jones <rick.jones2@hp.com> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:35 -07:00
Pavel Emelyanov	768f3591e2	[NETNS]: Cleanup list walking in setup_net and cleanup_net I proposed introducing a list_for_each_entry_continue_reverse macro to be used in setup_net() when unrolling the failed ->init callback. Here is the macro and some more cleanup in the setup_net() itself to remove one variable from the stack :) The same thing is for the cleanup_net() - the existing list_for_each_entry_reverse() is used. Minor, but the code looks nicer. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:35 -07:00
Vlad Yasevich	6b2f9cb64d	[SCTP]: Tie ADD-IP and AUTH functionality as required by spec. ADD-IP spec requires AUTH. It is, in fact, dangerous without AUTH. So, disable ADD-IP functionality if the peer claims to support ADD-IP, but not AUTH. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:33 -07:00
Vlad Yasevich	65b07e5d0d	[SCTP]: API updates to suport SCTP-AUTH extensions. Add SCTP-AUTH API. The API implemented here was agreed to between implementors at the 9th SCTP Interop. It will be documented in the next revision of the SCTP socket API spec. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:32 -07:00
Vlad Yasevich	bbd0d59809	[SCTP]: Implement the receive and verification of AUTH chunk This patch implements the receive path needed to process authenticated chunks. Add ability to process the AUTH chunk and handle edge cases for authenticated COOKIE-ECHO as well. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:31 -07:00
Vlad Yasevich	4cd57c8078	[SCTP]: Enable the sending of the AUTH chunk. SCTP-AUTH, Section 6.2: Endpoints MUST send all requested chunks authenticated where this has been requested by the peer. The other chunks MAY be sent authenticated or not. If endpoint pair shared keys are used, one of them MUST be selected for authentication. To send chunks in an authenticated way, the sender MUST include these chunks after an AUTH chunk. This means that a sender MUST bundle chunks in order to authenticate them. If the endpoint has no endpoint pair shared key for the peer, it MUST use Shared Key Identifier 0 with an empty endpoint pair shared key. If there are multiple endpoint shared keys the sender selects one and uses the corresponding Shared Key Identifier Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:31 -07:00
Vlad Yasevich	730fc3d05c	[SCTP]: Implete SCTP-AUTH parameter processing Implement processing for the CHUNKS, RANDOM, and HMAC parameters and deal with how this parameters are effected by association restarts. In particular, during unexpeted INIT processing, we need to reply with parameters from the original INIT chunk. Also, after restart, we need to update the old association with new peer parameters and change the association shared keys. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:30 -07:00
Vlad Yasevich	a29a5bd4f5	[SCTP]: Implement SCTP-AUTH initializations. The patch initializes AUTH related members of the generic SCTP structures and provides a way to enable/disable auth extension. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:30 -07:00
Vlad Yasevich	1f485649f5	[SCTP]: Implement SCTP-AUTH internals This patch implements the internals operations of the AUTH, such as key computation and storage. It also adds necessary variables to the SCTP data structures. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:29 -07:00
David L Stevens	96793b4825	[IPV4]: Add ICMPMsgStats MIB (RFC 4293) Background: RFC 4293 deprecates existing individual, named ICMP type counters to be replaced with the ICMPMsgStatsTable. This table includes entries for both IPv4 and IPv6, and requires counting of all ICMP types, whether or not the machine implements the type. These patches "remove" (but not really) the existing counters, and replace them with the ICMPMsgStats tables for v4 and v6. It includes the named counters in the /proc places they were, but gets the values for them from the new tables. It also counts packets generated from raw socket output (e.g., OutEchoes, MLD queries, RA's from radvd, etc). Changes: 1) create icmpmsg_statistics mib 2) create icmpv6msg_statistics mib 3) modify existing counters to use these 4) modify /proc/net/snmp to add "IcmpMsg" with all ICMP types listed by number for easy SNMP parsing 5) modify /proc/net/snmp printing for "Icmp" to get the named data from new counters. Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:28 -07:00
David L Stevens	14878f75ab	[IPV6]: Add ICMPMsgStats MIB (RFC 4293) [rev 2] Background: RFC 4293 deprecates existing individual, named ICMP type counters to be replaced with the ICMPMsgStatsTable. This table includes entries for both IPv4 and IPv6, and requires counting of all ICMP types, whether or not the machine implements the type. These patches "remove" (but not really) the existing counters, and replace them with the ICMPMsgStats tables for v4 and v6. It includes the named counters in the /proc places they were, but gets the values for them from the new tables. It also counts packets generated from raw socket output (e.g., OutEchoes, MLD queries, RA's from radvd, etc). Changes: 1) create icmpmsg_statistics mib 2) create icmpv6msg_statistics mib 3) modify existing counters to use these 4) modify /proc/net/snmp to add "IcmpMsg" with all ICMP types listed by number for easy SNMP parsing 5) modify /proc/net/snmp printing for "Icmp" to get the named data from new counters. [new to 2nd revision] 6) support per-interface ICMP stats 7) use common macro for per-device stat macros Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:27 -07:00
Denis Cheng	8b14a53670	[NET]: all net/ cleanup with ARRAY_SIZE Signed-off-by: Denis Cheng <crquan@gmail.com>	2007-10-10 16:51:26 -07:00
Denis Cheng	c40f6fff40	[IPV4] af_inet.c: use ARRAY_SIZE macro from kernel.h instead Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:26 -07:00
Denis Cheng	26ff5ddc5a	[NETLINK]: the temp variable name max is ambiguous with the macro max provided by <linux/kernel.h>, so changed its name to a more proper one: limit Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:25 -07:00
Denis Cheng	99406c885a	[NETLINK]: use the macro min(x,y) provided by <linux/kernel.h> instead Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:25 -07:00
Herbert Xu	52886051ff	[SKBUFF]: Fix up csum_start when head room changes Thanks for noticing the bug where csum_start is not updated when the head room changes. This patch fixes that. It also moves the csum/ip_summed copying into copy_skb_header so that skb_copy_expand gets it too. I've checked its callers and no one should be upset by this. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:24 -07:00
Herbert Xu	0cfad07555	[NETLINK]: Avoid pointer in netlink_run_queue I was looking at Patrick's fix to inet_diag and it occured to me that we're using a pointer argument to return values unnecessarily in netlink_run_queue. Changing it to return the value will allow the compiler to generate better code since the value won't have to be memory-backed. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:24 -07:00
Vlad Yasevich	007e3936bd	[SCTP]: Move sysctl_sctp_[rw]mem definitions to protocol.c The sctp_[rw]mem definitions should really be in protocol.c since that is where they are initialized. This also allows one to build a kernel without sysctl support. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:23 -07:00
Vlad Yasevich	131a47e31a	[SCTP]: Implement the Supported Extensions Parameter SCTP Supported Extenions parameter is specified in Section 4.2.7 of the ADD-IP draft (soon to be RFC). The parameter is encoded as: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ \| Parameter Type = 0x8008 \| Parameter Length \| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ \| CHUNK TYPE 1 \| CHUNK TYPE 2 \| CHUNK TYPE 3 \| CHUNK TYPE 4 \| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ \| .... \| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ \| CHUNK TYPE N \| PAD \| PAD \| PAD \| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ It contains a list of chunks that a particular SCTP extension uses. Current extensions supported are Partial Reliability (FWD-TSN) and ADD-IP (ASCONF and ASCONF-ACK). When implementing new extensions (AUTH, PKT-DROP, etc..), new chunks need to be added to this parameter. Parameter processing would be modified to negotiate support for these new features. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:23 -07:00
Denis V. Lunev	76c72d4f44	[IPV4/IPV6/DECNET]: Small cleanup for fib rules. This patch slightly cleanups FIB rules framework. rules_list as a pointer on struct fib_rules_ops is useless. It is always assigned with a static per/subsystem list in IPv4, IPv6 and DecNet. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:22 -07:00
Pavel Emelyanov	056925ab31	[NET]: Cleanup calling netdev notifiers. The call_netdev_notifiers routine can successfully be used in the net/core_dev.c itself. This will save 6 lines of code and 62 ;) bytes of .text section. 62 is rather small, but I have one more patch saving ~30 bytes from netns code (sent to Eric), so altogether they can save some more noticeable amount. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:21 -07:00
Pavel Emelyanov	30d97d3585	[NETNS]: Consolidate hashes creation in netdev_init() The dev_name_hash and the dev_index_hash are now booth kmalloc-ed (and each element is properly initialized as usually) so I think it's worth consolidating this code making it look nicer (and saving 28 bytes of .text section ;) ) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:21 -07:00
Eric W. Biederman	ad7379d494	[NET]: Fix the prototype of call_netdevice_notifiers. This replaces the void * parameter with a struct net_device * which is what is actually required. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:20 -07:00
Jamal Hadi Salim	22dd749501	[NET]: migrate HARD_TX_LOCK to header file HARD_TX_LOCK micro is a nice aggregation that could be used in other spots. move it to netdevice.h Also makes sure the previously superflous cpu arguement is used. Thanks to DaveM for the suggestions. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:20 -07:00
Milan Kocian	0b69d4bd26	[IPV6]: Remove redundant RTM_DELLINK message. Remove useless message. We get the right message from another subsystem. Signed-off-by: Milan Kocian <milon@wq.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:18 -07:00
Paul Moore	63d804eade	[CIPSO]: remove duplicated code in the cipso_v4_*_getattr() functions The bulk of the CIPSO option parsing/processing in the cipso_v4_sock_getattr() and cipso_v4_skb_getattr() functions are identical, the only real difference being where the functions obtain the CIPSO option itself. This patch creates a new function, cipso_v4_getattr(), which contains the common CIPSO option parsing/processing code and modifies the existing functions to call this new helper function. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:17 -07:00
Jeff Garzik	88d3aafdae	[ETHTOOL] Provide default behaviors for a few ethtool sub-ioctls For the operations get-tx-csum get-sg get-tso get-ufo the default ethtool_op_xxx behavior is fine for all drivers, so we permit op==NULL to imply the default behavior. This provides a more uniform behavior across all drivers, eliminating ethtool(8) "ioctl not supported" errors on older drivers that had not been updated for the latest sub-ioctls. The ethtool_op_xxx() functions are left exported, in case anyone wishes to call them directly from a driver-private implementation -- a not-uncommon case. Should an ethtool_op_xxx() helper remain unused for a while, except by net/core/ethtool.c, we can un-export it at a later date. [ Resolved conflicts with set/get value ethtool patch... -DaveM ] Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:17 -07:00
Ralf Baechle	10d024c1b2	[NET]: Nuke SET_MODULE_OWNER macro. It's been a useless no-op for long enough in 2.6 so I figured it's time to remove it. The number of people that could object because they're maintaining unified 2.4 and 2.6 drivers is probably rather small. [ Handled drivers added by netdev tree and some missed IRDA cases... -DaveM ] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:51:13 -07:00
Johannes Berg	aa0daf0e02	[MAC80211]: remove/change some comments about Michael MIC hardware offload There are a few TODO comments in the mac80211 sources regarding hardware offload for Michael MIC verification. Those items are, however, better handled in the driver instead of the stack, if any device requires such hand-holding. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:32 -07:00
Tomas Winkler	475fa49c12	[MAC80211]: PS mode fix tx.mode must be set also for buffered frames. It is used in the tx hanlders Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:32 -07:00
Stephen Hemminger	68aae11674	[MAC80211]: use internal network device stats Stats are now available for device usage inside network_device Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:31 -07:00
warmcat	24338793ee	[MAC80211]: get STA after tx radiotap snipped Johannes Berg noticed that in __ieee80211_tx_prepare() we try to get the STA from addr1 of the ieee80211 header when the radiotap header is actually still at the front of the packet. This patch defers doing that until the radiotap header is gone. Signed-off-by: Andy Green <andy@warmcat.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:31 -07:00
Volker Braun	139c3a0492	[MAC80211]: ignore key index on pairwise key (WEP only) Work-around for broken APs that use a non-zero key index for WEP pairwise keys. With this patch, WEP encryption only is exempt from providing a zero key index. Signed-off-by: Volker Braun <volker.braun@physik.hu-berlin.de> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:30 -07:00
Johannes Berg	c39e3a0d03	[MAC80211]: remove TKIP mixing for hw accel again The TKIP mixing code was added for the benefit of Intel's ipw3945 chipset but that code ended up not using it. We have previously identified many problems with this code and it crystallized that library functions for mixing are likely to handle this in much more generality and might allow b43 to take advantage of hardware acceleration for TKIP. Due to these reasons, remove the TKIP mixing for hardware accelerated crypto operations. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Michael Buesch <mb@bu3sch.de> Acked-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:30 -07:00
Johannes Berg	6a7664d451	[MAC80211]: remove HW_KEY_IDX_INVALID This patch makes the mac80211/driver interface rely only on the IEEE80211_TXCTL_DO_NOT_ENCRYPT flag to signal to the driver whether a frame should be encrypted or not, since mac80211 internally no longer relies on HW_KEY_IDX_INVALID either this removes it, changes the key index to be a u8 in all places and makes the full range of the value available to drivers. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:29 -07:00
Johannes Berg	c15a205070	[MAC80211]: remove set_key_idx callback No existing drivers use this callback, hence there's no telling how it might be used. In fact, it is unlikely to be of much use as-is because the default key index isn't something that the driver can do much with without knowing which interface it was for etc. And if it needs the key index for the transmitted frame, it can get it by keeping a reference to the key_conf structure and looking it up by hw_key_idx. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:28 -07:00
Johannes Berg	7848ba7d7a	[MAC80211]: rework hardware crypto flags This patch reworks the various hardware crypto related flags to make them more local, i.e. put them with each key or each packet instead of into the hw struct. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:27 -07:00
Johannes Berg	b708e61062	[MAC80211]: remove turbo modes This patch removes all mention of the atheros turbo modes that can't possibly work properly anyway since in some places we don't check for them when we should. I have no idea what the iwlwifi drivers were doing with these but it can't possibly have been correct. Cc: Zhu Yi <yi.zhu@intel.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:27 -07:00
Johannes Berg	d4e46a3d98	[MAC80211]: fix race conditions with keys During receive processing, we select the key long before using it and because there's no locking it is possible that we kfree() the key after having selected it but before using it for crypto operations. Obviously, this is bad. Secondly, during transmit processing, there are two possible races: We have a similar race between select_key() and using it for encryption, but we also have a race here between select_key() and hardware encryption (both when a key is removed.) This patch solves these issues by using RCU: when a key is to be freed, we first remove the pointer from the appropriate places (sdata->keys, sdata->default_key, sta->key) using rcu_assign_pointer() and then synchronize_rcu(). Then, we can safely kfree() the key and remove it from the hardware. There's a window here where the hardware may still be using it for decryption, but we can't work around that without having two hardware callbacks, one to disable the key for RX and one to disable it for TX; but the worst thing that will happen is that we receive a packet decrypted that we don't find a key for any more and then drop it. When we add a key, we first need to upload it to the hardware and then, using rcu_assign_pointer() again, link it into our structures. In the code using keys (TX/RX paths) we use rcu_dereference() to get the key and enclose the whole tx/rx section in a rcu_read_lock() ... rcu_read_unlock() block. Because we've uploaded the key to hardware before linking it into internal structures, we can guarantee that it is valid once get to into tx(). One possible race condition remains, however: when we have hardware acceleration enabled and the driver shuts down the queues, we end up queueing the frame. If now somebody removes the key, the key will be removed from hwaccel and then then driver will be asked to encrypt the frame with a key index that has been removed. Hence, drivers will need to be aware that the hw_key_index they are passed might not be under all circumstances. Most drivers will, however, simply ignore that condition and encrypt the frame with the selected key anyway, this only results in a frame being encrypted with a wrong key or dropped (rightfully) because the key was not valid. There isn't much we can do about it unless we want to walk the pending frame queue every time a key is removed and remove all frames that used it. This race condition, however, will most likely be solved once we add multiqueue support to mac80211 because then frames will be queued further up the stack instead of after being processed. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:26 -07:00
Johannes Berg	c29b9b9b02	[MAC80211]: don't send invalid QoS frames Kalle Valo noticed that QoS frames are sent with an invalid QoS control field; this is because we increase the header length but neither initialise the space nor actually have enough space in the header structure for the QoS control field. This patch fixes it by treating the QoS field specially and appending it explicitly, initialising it to zero. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:26 -07:00
Johannes Berg	5d4ecd9370	[MAC80211]: remove spy wext ioctls mac80211 never calls wireless_spy_update so these aren't useful. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:25 -07:00
Eric Dumazet	39c90ece75	[IPV4]: Convert rt_check_expire() from softirq processing to workqueue. On loaded/big hosts, rt_check_expire() if of litle use, because it generally breaks out of its main loop because of a jiffies change. It can take a long time (read : timer invocations) to actually scan the whole hash table, freeing unused entries. Converting it to use a workqueue instead of softirq is a nice move because we can allow rt_check_expire() to do the scan it is supposed to do, without hogging the CPU. This has an impact on the average number of entries in cache, reducing ram usage. Cache is more responsive to parameter changes (/proc/sys/net/ipv4/route/gc_timeout and /proc/sys/net/ipv4/route/gc_interval) Note: Maybe the default value of gc_interval (60 seconds) is too high, since this means we actually need 5 (300/60) invocations to scan the whole table. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:25 -07:00
Ivo van Doorn	e0665486b7	[RFKILL]: Add support for ultrawideband This patch will add support for UWB keys to rfkill, support for this has been requested by Inaky. Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:23 -07:00
Ivo van Doorn	234a0ca6f1	[RFKILL]: Remove IRDA As Dmitry pointed out earlier, rfkill-input.c doesn't support irda because there are no users and we shouldn't add unrequired KEY_ defines. However, RFKILL_TYPE_IRDA was defined in the rfkill.h header file and would confuse people about whether it is implemented or not. This patch removes IRDA support completely, so it can be added whenever a driver wants the feature. Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:23 -07:00
Eric W. Biederman	077130c0cf	[NET]: Fix race when opening a proc file while a network namespace is exiting. The problem: proc_net files remember which network namespace the are against but do not remember hold a reference count (as that would pin the network namespace). So we currently have a small window where the reference count on a network namespace may be incremented when opening a /proc file when it has already gone to zero. To fix this introduce maybe_get_net and get_proc_net. maybe_get_net increments the network namespace reference count only if it is greater then zero, ensuring we don't increment a reference count after it has gone to zero. get_proc_net handles all of the magic to go from a proc inode to the network namespace instance and call maybe_get_net on it. PROC_NET the old accessor is removed so that we don't get confused and use the wrong helper function. Then I fix up the callers to use get_proc_net and handle the case case where get_proc_net returns NULL. In that case I return -ENXIO because effectively the network namespace has already gone away so the files we are trying to access don't exist anymore. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:22 -07:00
Jesper Dangaard Brouer	e9bef55d3d	[NET_SCHED]: Cleanup L2T macros and handle oversized packets Change L2T (length to time) macros, in all rate based schedulers, to call a common function qdisc_l2t() that does the rate table lookup. This function handles if the packet size lookup is larger than the rate table, which often occurs with TSO enabled. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:20 -07:00
Adrian Bunk	b6fa1a4d74	[SCTP] net/sctp/socket.c: make 3 variables static This patch makes the following needlessly global variables static: - sctp_memory_pressure - sctp_memory_allocated - sctp_sockets_allocated Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:19 -07:00
Adrian Bunk	5c94bf86c8	[SCTP]: Make sctp_addto_param() static. sctp_addto_param() can become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:19 -07:00
Thomas Graf	8f4c1f9b04	[NETLINK]: Introduce nested and byteorder flag to netlink attribute This change allows the generic attribute interface to be used within the netfilter subsystem where this flag was initially introduced. The byte-order flag is yet unused, it's intended use is to allow automatic byte order convertions for all atomic types. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:16 -07:00
David S. Miller	9d5010db7e	[NET]: Add a might_sleep() to dev_close(). Requested by Johannes Berg. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:15 -07:00
Eric Dumazet	86bba269d0	[PATCH] NET : convert IP route cache garbage collection from softirq processing to a workqueue When the periodic IP route cache flush is done (every 600 seconds on default configuration), some hosts suffer a lot and eventually trigger the "soft lockup" message. dst_run_gc() is doing a scan of a possibly huge list of dst_entries, eventually freeing some (less than 1%) of them, while holding the dst_lock spinlock for the whole scan. Then it rearms a timer to redo the full thing 1/10 s later... The slowdown can last one minute or so, depending on how active are the tcp sessions. This second version of the patch converts the processing from a softirq based one to a workqueue. Even if the list of entries in garbage_list is huge, host is still responsive to softirqs and can make progress. Instead of resetting gc timer to 0.1 second if one entry was freed in a gc run, we do this if more than 10% of entries were freed. Before patch : Aug 16 06:21:37 SRV1 kernel: BUG: soft lockup detected on CPU#0! Aug 16 06:21:37 SRV1 kernel: Aug 16 06:21:37 SRV1 kernel: Call Trace: Aug 16 06:21:37 SRV1 kernel: <IRQ> [<ffffffff802286f0>] wake_up_process+0x10/0x20 Aug 16 06:21:37 SRV1 kernel: [<ffffffff80251e09>] softlockup_tick+0xe9/0x110 Aug 16 06:21:37 SRV1 kernel: [<ffffffff803cd380>] dst_run_gc+0x0/0x140 Aug 16 06:21:37 SRV1 kernel: [<ffffffff802376f3>] run_local_timers+0x13/0x20 Aug 16 06:21:37 SRV1 kernel: [<ffffffff802379c7>] update_process_times+0x57/0x90 Aug 16 06:21:37 SRV1 kernel: [<ffffffff80216034>] smp_local_timer_interrupt+0x34/0x60 Aug 16 06:21:37 SRV1 kernel: [<ffffffff802165cc>] smp_apic_timer_interrupt+0x5c/0x80 Aug 16 06:21:37 SRV1 kernel: [<ffffffff8020a816>] apic_timer_interrupt+0x66/0x70 Aug 16 06:21:37 SRV1 kernel: [<ffffffff803cd3d3>] dst_run_gc+0x53/0x140 Aug 16 06:21:37 SRV1 kernel: [<ffffffff803cd3c6>] dst_run_gc+0x46/0x140 Aug 16 06:21:37 SRV1 kernel: [<ffffffff80237148>] run_timer_softirq+0x148/0x1c0 Aug 16 06:21:37 SRV1 kernel: [<ffffffff8023340c>] __do_softirq+0x6c/0xe0 Aug 16 06:21:37 SRV1 kernel: [<ffffffff8020ad6c>] call_softirq+0x1c/0x30 Aug 16 06:21:37 SRV1 kernel: <EOI> [<ffffffff8020cb34>] do_softirq+0x34/0x90 Aug 16 06:21:37 SRV1 kernel: [<ffffffff802331cf>] local_bh_enable_ip+0x3f/0x60 Aug 16 06:21:37 SRV1 kernel: [<ffffffff80422913>] _spin_unlock_bh+0x13/0x20 Aug 16 06:21:37 SRV1 kernel: [<ffffffff803dfde8>] rt_garbage_collect+0x1d8/0x320 Aug 16 06:21:37 SRV1 kernel: [<ffffffff803cd4dd>] dst_alloc+0x1d/0xa0 Aug 16 06:21:37 SRV1 kernel: [<ffffffff803e1433>] __ip_route_output_key+0x573/0x800 Aug 16 06:21:37 SRV1 kernel: [<ffffffff803c02e2>] sock_common_recvmsg+0x32/0x50 Aug 16 06:21:37 SRV1 kernel: [<ffffffff803e16dc>] ip_route_output_flow+0x1c/0x60 Aug 16 06:21:37 SRV1 kernel: [<ffffffff80400160>] tcp_v4_connect+0x150/0x610 Aug 16 06:21:37 SRV1 kernel: [<ffffffff803ebf07>] inet_bind_bucket_create+0x17/0x60 Aug 16 06:21:37 SRV1 kernel: [<ffffffff8040cd16>] inet_stream_connect+0xa6/0x2c0 Aug 16 06:21:37 SRV1 kernel: [<ffffffff80422981>] _spin_lock_bh+0x11/0x30 Aug 16 06:21:37 SRV1 kernel: [<ffffffff803c0bdf>] lock_sock_nested+0xcf/0xe0 Aug 16 06:21:37 SRV1 kernel: [<ffffffff80422981>] _spin_lock_bh+0x11/0x30 Aug 16 06:21:37 SRV1 kernel: [<ffffffff803be551>] sys_connect+0x71/0xa0 Aug 16 06:21:37 SRV1 kernel: [<ffffffff803eee3f>] tcp_setsockopt+0x1f/0x30 Aug 16 06:21:37 SRV1 kernel: [<ffffffff803c030f>] sock_common_setsockopt+0xf/0x20 Aug 16 06:21:37 SRV1 kernel: [<ffffffff803be4bd>] sys_setsockopt+0x9d/0xc0 Aug 16 06:21:37 SRV1 kernel: [<ffffffff8028881e>] sys_ioctl+0x5e/0x80 Aug 16 06:21:37 SRV1 kernel: [<ffffffff80209c4e>] system_call+0x7e/0x83 After patch : (RT_CACHE_DEBUG set to 2 to get following traces) dst_total: 75469 delayed: 74109 work_perf: 141 expires: 150 elapsed: 8092 us dst_total: 78725 delayed: 73366 work_perf: 743 expires: 400 elapsed: 8542 us dst_total: 86126 delayed: 71844 work_perf: 1522 expires: 775 elapsed: 8849 us dst_total: 100173 delayed: 68791 work_perf: 3053 expires: 1256 elapsed: 9748 us dst_total: 121798 delayed: 64711 work_perf: 4080 expires: 1997 elapsed: 10146 us dst_total: 154522 delayed: 58316 work_perf: 6395 expires: 25 elapsed: 11402 us dst_total: 154957 delayed: 58252 work_perf: 64 expires: 150 elapsed: 6148 us dst_total: 157377 delayed: 57843 work_perf: 409 expires: 400 elapsed: 6350 us dst_total: 163745 delayed: 56679 work_perf: 1164 expires: 775 elapsed: 7051 us dst_total: 176577 delayed: 53965 work_perf: 2714 expires: 1389 elapsed: 8120 us dst_total: 198993 delayed: 49627 work_perf: 4338 expires: 1997 elapsed: 8909 us dst_total: 226638 delayed: 46865 work_perf: 2762 expires: 2748 elapsed: 7351 us I successfully reduced the IP route cache of many hosts by a four factor thanks to this patch. Previously, I had to disable "ip route flush cache" to avoid crashes. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:15 -07:00
David S. Miller	678aa8e4eb	[NET]: #if 0 out net_alloc() for now. We will undo this once it is actually used. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:14 -07:00
Eric W. Biederman	c48dad7ecd	[NET]: Disable netfilter sockopts when not in the initial network namespace Until we support multiple network namespaces with netfilter only allow netfilter configuration in the initial network namespace. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:13 -07:00
Eric W. Biederman	d8a5ec6727	[NET]: netlink support for moving devices between network namespaces. The simplest thing to implement is moving network devices between namespaces. However with the same attribute IFLA_NET_NS_PID we can easily implement creating devices in the destination network namespace as well. However that is a little bit trickier so this patch sticks to what is simple and easy. A pid is used to identify a process that happens to be a member of the network namespace we want to move the network device to. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:13 -07:00
Eric W. Biederman	ce286d3273	[NET]: Implement network device movement between namespaces This patch introduces NETIF_F_NETNS_LOCAL a flag to indicate a network device is local to a single network namespace and should never be moved. Useful for pseudo devices that we need an instance in each network namespace (like the loopback device) and for any device we find that cannot handle multiple network namespaces so we may trap them in the initial network namespace. This patch introduces the function dev_change_net_namespace a function used to move a network device from one network namespace to another. To the network device nothing special appears to happen, to the components of the network stack it appears as if the network device was unregistered in the network namespace it is in, and a new device was registered in the network namespace the device was moved to. This patch sets up a namespace device destructor that upon the exit of a network namespace moves all of the movable network devices to the initial network namespace so they are not lost. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:12 -07:00
Eric W. Biederman	b267b17964	[NET]: Factor out __dev_alloc_name from dev_alloc_name When forcibly changing the network namespace of a device I need something that can generate a name for the device in the new namespace without overwriting the old name. __dev_alloc_name provides me that functionality. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:11 -07:00
Eric W. Biederman	881d966b48	[NET]: Make the device list and device lookups per namespace. This patch makes most of the generic device layer network namespace safe. This patch makes dev_base_head a network namespace variable, and then it picks up a few associated variables. The functions: dev_getbyhwaddr dev_getfirsthwbytype dev_get_by_flags dev_get_by_name __dev_get_by_name dev_get_by_index __dev_get_by_index dev_ioctl dev_ethtool dev_load wireless_process_ioctl were modified to take a network namespace argument, and deal with it. vlan_ioctl_set and brioctl_set were modified so their hooks will receive a network namespace argument. So basically anthing in the core of the network stack that was affected to by the change of dev_base was modified to handle multiple network namespaces. The rest of the network stack was simply modified to explicitly use &init_net the initial network namespace. This can be fixed when those components of the network stack are modified to handle multiple network namespaces. For now the ifindex generator is left global. Fundametally ifindex numbers are per namespace, or else we will have corner case problems with migration when we get that far. At the same time there are assumptions in the network stack that the ifindex of a network device won't change. Making the ifindex number global seems a good compromise until the network stack can cope with ifindex changes when you change namespaces, and the like. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:10 -07:00
Eric W. Biederman	b4b510290b	[NET]: Support multiple network namespaces with netlink Each netlink socket will live in exactly one network namespace, this includes the controlling kernel sockets. This patch updates all of the existing netlink protocols to only support the initial network namespace. Request by clients in other namespaces will get -ECONREFUSED. As they would if the kernel did not have the support for that netlink protocol compiled in. As each netlink protocol is updated to be multiple network namespace safe it can register multiple kernel sockets to acquire a presence in the rest of the network namespaces. The implementation in af_netlink is a simple filter implementation at hash table insertion and hash table look up time. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:09 -07:00
Eric W. Biederman	e9dc865340	[NET]: Make device event notification network namespace safe Every user of the network device notifiers is either a protocol stack or a pseudo device. If a protocol stack that does not have support for multiple network namespaces receives an event for a device that is not in the initial network namespace it quite possibly can get confused and do the wrong thing. To avoid problems until all of the protocol stacks are converted this patch modifies all netdev event handlers to ignore events on devices that are not in the initial network namespace. As the rest of the code is made network namespace aware these checks can be removed. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:09 -07:00
Eric W. Biederman	e730c15519	[NET]: Make packet reception network namespace safe This patch modifies every packet receive function registered with dev_add_pack() to drop packets if they are not from the initial network namespace. This should ensure that the various network stacks do not receive packets in a anything but the initial network namespace until the code has been converted and is ready for them. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:08 -07:00
Eric W. Biederman	6d34b1c27a	[NET]: Initialize the network namespace of network devices. Except for carefully selected pseudo devices all network interfaces should start out in the initial network namespace. Ultimately it will be register_netdev that examines what dev->nd_net is set to and places a device in a network namespace. This patch modifies alloc_netdev to initialize the network namespace a device is in with the initial network namespace. This gets it right for the vast majority of devices so their drivers need not be modified and for those few pseudo devices that need something different they can change this parameter before calling register_netdevice. The network namespace parameter on a network device is not reference counted as the devices are inside of a network namespace and cannot remain in that namespace past the lifetime of the network namespace. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:07 -07:00
Eric W. Biederman	1b8d7ae42d	[NET]: Make socket creation namespace safe. This patch passes in the namespace a new socket should be created in and has the socket code do the appropriate reference counting. By virtue of this all socket create methods are touched. In addition the socket create methods are modified so that they will fail if you attempt to create a socket in a non-default network namespace. Failing if we attempt to create a socket outside of the default network namespace ensures that as we incrementally make the network stack network namespace aware we will not export functionality that someone has not audited and made certain is network namespace safe. Allowing us to partially enable network namespaces before all of the exotic protocols are supported. Any protocol layers I have missed will fail to compile because I now pass an extra parameter into the socket creation code. [ Integrated AF_IUCV build fixes from Andrew Morton... -DaveM ] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:07 -07:00
Eric W. Biederman	457c4cbc5a	[NET]: Make /proc/net per network namespace This patch makes /proc/net per network namespace. It modifies the global variables proc_net and proc_net_stat to be per network namespace. The proc_net file helpers are modified to take a network namespace argument, and all of their callers are fixed to pass &init_net for that argument. This ensures that all of the /proc/net files are only visible and usable in the initial network namespace until the code behind them has been updated to be handle multiple network namespaces. Making /proc/net per namespace is necessary as at least some files in /proc/net depend upon the set of network devices which is per network namespace, and even more files in /proc/net have contents that are relevant to a single network namespace. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:06 -07:00
Eric W. Biederman	5f256becd8	[NET]: Basic network namespace infrastructure. This is the basic infrastructure needed to support network namespaces. This infrastructure is: - Registration functions to support initializing per network namespace data when a network namespaces is created or destroyed. - struct net. The network namespace data structure. This structure will grow as variables are made per network namespace but this is the minimal starting point. - Functions to grab a reference to the network namespace. I provide both get/put functions that keep a network namespace from being freed. And hold/release functions serve as weak references and will warn if their count is not zero when the data structure is freed. Useful for dealing with more complicated data structures like the ipv4 route cache. - A list of all of the network namespaces so we can iterate over them. - A slab for the network namespace data structure allowing leaks to be spotted. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:03 -07:00
Eric W. Biederman	890d52d3f1	[ATALK]: In notifier handlers convert the void pointer to a netdevice This slightly improves code safety and clarity. Later network namespace patches touch this code so this is a preliminary cleanup. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:02 -07:00

1 2 3 4 5 ...

5936 Commits