linux

Author	SHA1	Message	Date
Jamal Hadi Salim	007a531b0a	[PKTGEN]: Introduce sequential flows By default all flows in pktgen are randomly selected. This patch introduces ability to have all defined flows to be sent sequentially. Robert defined randomness to be the default behavior. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:16:27 -07:00
Jamal Hadi Salim	16dab72f65	[PKTGEN]: Centralize packet overhead tracking Track the extra packet overhead for VLAN tags, MPLS, IPSEC etc Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:16:26 -07:00
Larry Finger	eef6caf8a9	[MAC80211]: Set low initial rate in rc80211_simple The initial rate for STA's using rc80211_simple is set to the last rate in the rate table. For situations for which the signal is weak, the rate may be too high for authentication and association. Although the rc80211_simple module will adjust the speed, the response may not be fast enough for a successful connection. This modification sets the initial rate to the lowest supported value. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:16:25 -07:00
Ilpo Järvinen	d041005116	[TCP]: SACK fastpath did override adjusted fackets_out Do same adjustment to SACK fastpath counters provided that they're valid. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:16:24 -07:00
Patrick McHardy	61cbc2fca6	[NET]: Fix secondary unicast/multicast address count maintenance When a reference to an existing address is increased or decreased without hitting zero, the address count is incorrectly adjusted. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:16:23 -07:00
Peter P Waskiewicz Jr	d62733c8e4	[SCHED]: Qdisc changes and sch_rr added for multiqueue Add the new sch_rr qdisc for multiqueue network device support. Allow sch_prio and sch_rr to be compiled with or without multiqueue hardware support. sch_rr is part of sch_prio, and is referenced from MODULE_ALIAS. This was done since sch_prio and sch_rr only differ in their dequeue routine. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:16:22 -07:00
Peter P Waskiewicz Jr	f25f4e4480	[CORE] Stack changes to add multiqueue hardware support API Add the multiqueue hardware device support API to the core network stack. Allow drivers to allocate multiple queues and manage them at the netdev level if they choose to do so. Added a new field to sk_buff, namely queue_mapping, for drivers to know which tx_ring to select based on OS classification of the flow. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:16:21 -07:00
Herbert Xu	a298830cd0	[NET]: Fix TX checksum feature check This patch fixes a boolean error in the new TX checksum check that causes bogus TSO packets to be generated. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:16:19 -07:00
James Chapman	342f0234c7	[UDP]: Introduce UDP encapsulation type for L2TP This patch adds a new UDP_ENCAP_L2TPINUDP encapsulation type for UDP sockets. When a UDP socket's encap_type is UDP_ENCAP_L2TPINUDP, the skb is delivered to a function pointed to by the udp_sock's encap_rcv funcptr. If the skb isn't wanted by L2TP, it returns >0, which causes it to be passed through to UDP. Include padding to put the new encap_rcv field on a 4-byte boundary. Previously, the only user of UDP encap sockets was ESP, so when CONFIG_XFRM was not defined, some of the encap code was compiled out. This patch changes that. As a result, udp_encap_rcv() will now do a little more work when CONFIG_XFRM is not defined. Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:57 -07:00
Patrick McHardy	4417da668c	[NET]: dev: secondary unicast address support Add support for configuring secondary unicast addresses on network devices. To support this, devices capable of filtering multiple unicast addresses need to change their set_multicast_list function to configure unicast filters as well and assign it to dev->set_rx_mode instead of dev->set_multicast_list. Other devices are put into promiscuous mode when secondary unicast addresses are present. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:56 -07:00
Patrick McHardy	3fba5a8b1e	[NET]: dev_mcast: switch to generic net_device address lists Use generic net_device address lists for multicast list handling. Some defines are used to keep drivers working. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:55 -07:00
Patrick McHardy	bf742482d7	[NET]: dev: introduce generic net_device address lists Introduce struct dev_addr_list and list maintenance functions based on dev_mc_list and the related functions. This will be used by follow-up patches for both multicast and secondary unicast addresses. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:54 -07:00
Patrick McHardy	75ebe8f736	[NET]: dev_mcast: unexport dev_mc_upload dev_mc_add/dev_mc_delete take care of uploading the list when necessary and that's the only interface other code should use. Also remove two incorrect calls in DECnet. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:53 -07:00
Stephen Hemminger	d212f87b06	[NET]: IPV6 checksum offloading in network devices The existing model for checksum offload does not correctly handle devices that can offload IPV4 and IPV6 only. The NETIF_F_HW_CSUM flag implies device can do any arbitrary protocol. This patch: * adds NETIF_F_IPV6_CSUM for those devices * fixes bnx2 and tg3 devices that need it * add NETIF_F_IPV6_CSUM to ipv6 output (incl GSO) * fixes assumptions about NETIF_F_ALL_CSUM in nat * adjusts bridge union of checksumming computation Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:52 -07:00
Masahide NAKAMURA	d3d6dd3ada	[XFRM]: Add module alias for transformation type. This is a clean-up for XFRM type modules that adds aliases for each protocol: ESP, AH, IPCOMP, IPIP and IPv6 for IPsec; ROUTING and DSTOPTS for MIPv6. It is almost the same as the XFRM mode alias, but it adds new XFRM_PROTO_XXX defines for preprocessing, since some protocols are defined as enums. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Acked-by: Ingo Oeser <netdev@axxeo.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:43 -07:00
Masahide NAKAMURA	59fbb3a61e	[IPV6] MIP6: Loadable module support for MIPv6. This patch makes MIPv6 a loadable module named "mip6". Here is a modprobe.conf(5) example to load it automatically when a user application uses XFRM state for MIPv6: alias xfrm-type-10-43 mip6 alias xfrm-type-10-60 mip6 Some MIPv6 features are not included in this module; however, this should not affect other features such as IPsec or IPv6, with or without the patch. We may discuss XFRM, MH (RAW socket) and ancillary data/sockopt separately for future work. Loadable features: * MH receiving check (to send ICMP error back) * RO header parsing and building (i.e. RH2 and HAO in DSTOPTS) * XFRM policy/state database handling for RO These are NOT covered as loadable: * Home Address flags and its rule on source address selection * XFRM sub policy (depends on its own kernel option) * XFRM functions to receive RO as IPv6 extension header * MH sending/receiving through raw socket if user application opens it (since raw socket allows to do so) * RH2 sending as ancillary data * RH2 operation with setsockopt(2) Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:42 -07:00
Masahide NAKAMURA	136ebf08b4	[IPV6] MIP6: Kill unnecessary ifdefs. Kill unnecessary CONFIG_IPV6_MIP6. o It is redundant for RAW socket to keep MH out with the config then it can handle any protocol. o Clean-up at AH. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:41 -07:00
Patrick McHardy	2371baa4bd	[RTNETLINK]: Fix rtnetlink compat attribute patch Sent the wrong patch previously. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:40 -07:00
Patrick McHardy	afdc3238ec	[RTNETLINK]: Add nested compat attribute Add a nested compat attribute type that can be used to convert attributes that contain a structure to nested attributes in a backwards compatible way. The attribute looks like this: struct { [ compat contents ] struct rtattr { .rta_len = total size, .rta_type = type, } rta; struct old_structure struct; [ nested top-level attribute ] struct rtattr { .rta_len = nest size, .rta_type = type, } nest_attr; [ optional 0 .. n nested attributes ] struct rtattr { .rta_len = private attribute len, .rta_type = private attribute typ, } nested_attr; struct nested_data data; }; Since both userspace and kernel deal correctly with attributes that are larger than expected old versions will just parse the compat part and ignore the rest. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:39 -07:00
Patrick McHardy	1092cb2197	[NETLINK]: attr: add nested compat attribute type Add a nested compat attribute type that can be used to convert attributes that contain a structure to nested attributes in a backwards compatible way. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:38 -07:00
Patrick McHardy	334a8132d9	[SKBUFF]: Keep track of writable header len of headerless clones Currently NAT (and others) that want to modify cloned skbs copy them, even if in the vast majority of cases it's not necessary because the skb is a clone made by TCP and the portion NAT wants to modify is actually writable because TCP releases the header reference before cloning. The problem is that there is no clean way for NAT to find out how long the writable header area is, so this patch introduces skb->hdr_len to hold this length. When a headerless skb is cloned skb->hdr_len is set to the current headroom, for regular clones it is copied from the original. A new function skb_clone_writable(skb, len) returns whether the skb is writable up to len bytes from skb->data. To avoid enlarging the skb the mac_len field is reduced to 16 bit and the new hdr_len field is put in the remaining 16 bit. I've done a few rough benchmarks of NAT (not with this exact patch, but a very similar one). As expected it saves huge amounts of system time in case of sendfile, bringing it down to basically the same amount as without NAT, with sendmsg it only helps on loopback, probably because of the large MTU. Transmit a 1GB file using sendfile/sendmsg over eth0/lo with and without NAT: - sendfile eth0, no NAT: sys 0m0.388s - sendfile eth0, NAT: sys 0m1.835s - sendfile eth0, NAT + patch: sys 0m0.370s (~ -80%) - sendfile lo, no NAT: sys 0m0.258s - sendfile lo, NAT: sys 0m2.609s - sendfile lo, NAT + patch: sys 0m0.260s (~ -90%) - sendmsg eth0, no NAT: sys 0m2.508s - sendmsg eth0, NAT: sys 0m2.539s - sendmsg eth0, NAT + patch: sys 0m2.445s (no change) - sendmsg lo, no NAT: sys 0m2.151s - sendmsg lo, NAT: sys 0m3.557s - sendmsg lo, NAT + patch: sys 0m2.159s (~ -40%) I expect other users can see a similar performance improvement, packet mangling iptables targets, ipip and ip_gre come to mind .. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:37 -07:00
Krishna Kumar	e50c41b53d	[NET]: qdisc_restart - couple of optimizations. Changes : - netif_queue_stopped need not be called inside qdisc_restart as it has been called already in qdisc_run() before the first skb is sent, and in __qdisc_run() after each intermediate skb is sent (note : we are the only sender, so the queue cannot get stopped while the tx lock was got in the ~LLTX case). - BUG_ON((int) q->q.qlen < 0) was a relic from old times when -1 meant more packets are available, and __qdisc_run used to loop when qdisc_restart() returned -1. During those days, it was necessary to make sure that qlen is never less than zero, since __qdisc_run would get into an infinite loop if no packets are on the queue and this bug in qdisc was there (and worse - no more skbs could ever get queue'd as we hold the queue lock too). With Herbert's recent change to return values, this check is not required. Hopefully Herbert can validate this change. If at all this is required, it should be added to skb_dequeue (in failure case), and not to qdisc_qlen. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:36 -07:00
Krishna Kumar	6c1361a6f2	[NET]: qdisc_restart - readability changes plus one bug fix. New changes : - Incorporated Peter Waskiewicz's comments. - Re-added back one warning message (on driver returning wrong value). Previous changes : - Converted to use switch/case code which looks neater. - "if (ret == NETDEV_TX_LOCKED && lockless)" is buggy, and the lockless check should be removed, since driver will return NETDEV_TX_LOCKED only if lockless is true and driver has to do the locking. In the original code as well as the latest code, this code can result in a bug where if LLTX is not set for a driver (lockless == 0) but the driver is written wrongly to do a trylock (despite LLTX being set), the driver returns LOCKED. But since lockless is zero, the packet is requeue'd instead of calling collision code which will issue warning and free up the skb. Instead this skb will be retried with this driver next time, and the same result will ensue. Removing this check will catch these driver bugs instead of hiding the problem. I am keeping this change to readability section since : a. it is confusing to check two things as it is; and b. it is difficult to keep this check in the changed 'switch' code. - Changed some names, like try_get_tx_pkt to dev_dequeue_skb (as that is the work being done and easier to understand) and do_dev_requeue to dev_requeue_skb, merged handle_dev_cpu_collision and tx_islocked to dev_handle_collision (handle_dev_cpu_collision is a small routine with only one caller, so there is no need to have two separate routines which also results in getting rid of two macros, etc. - Removed an XXX comment as it should never fail (I suspect this was related to batch skb WIP, Jamal ?). Converted some functions to original coding style of having the return values and the function name on same line, eg prio2list. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:35 -07:00
Gerrit Renker	49d66a70cf	[CCID3]: Fix a bug in the send time processing ccid3_hc_tx_send_packet currently returns 0 when the time difference between current time and t_nom is less than 1000 microseconds. In this case the packet is sent immediately; but, unlike other packets that can be emitted on first attempt, it will not have its window counter updated and its options set as required. This is a bug. Fix: Require the time difference to be at least 1000 microseconds. The algorithm then converges: time differences > 1000 microseconds trigger the timer in dccp_write_xmit; after timer expiry this function is tried again; when the time difference is less than 1000, the packet will have its options added and window counter updated as required. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-07-10 22:15:34 -07:00
Gerrit Renker	8132da4d41	[CCID3]: Sending time: update to ktime_t This updates the computation of t_nom and t_last_win_count to use the newer gettimeofday interface. Committer note: used ktime_to_timeval to set the 'now' variable to t_ld in ccid3hctx_no_feedback_timer Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-07-10 22:15:27 -07:00
Arnaldo Carvalho de Melo	dd36a9aba4	loss_interval: make struct dccp_li_hist_entry private net/dccp/ccids/lib/loss_interval.c is the only place where this struct is used. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-07-10 22:15:24 -07:00
Arnaldo Carvalho de Melo	cc4d6a3a34	loss_interval: Nuke dccp_li_hist It had just a slab cache, so, for the sake of simplicity just make dccp_trfc_lib module init routine create the slab cache, no need for users of the lib to create a private loss_interval object. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-07-10 22:15:23 -07:00
Arnaldo Carvalho de Melo	c70b729e66	loss_interval: Make dccp_li_hist_entry_{new,delete} private Not used outside the loss_interval code anymore. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-07-10 22:15:22 -07:00
Arnaldo Carvalho de Melo	8c281780c6	loss_interval: unexport dccp_li_hist_interval_new Now its only used inside the loss_interval code. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-07-10 22:15:21 -07:00
Arnaldo Carvalho de Melo	cc0a910b94	[DCCP] loss_interval: Move ccid3_hc_rx_update_li to loss_interval Renaming it to dccp_li_update_li. Also based on previous work by Ian McDonald. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-07-10 22:15:20 -07:00
Arnaldo Carvalho de Melo	878ac60023	[CCID3]: Pass ccid3_li_hist to ccid3_hc_rx_update_li Now ccid3_hc_rx_update_li is ready to be moved to net/dccp/ccids/lib/loss_interval, it uses the same interface as the other functions there. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-07-10 22:15:19 -07:00
Arnaldo Carvalho de Melo	d83258a3da	Remove accesses to ccid3_hc_rx_sock in ccid3_hc_rx_{update,calc_first}_li This is a preparatory patch for moving these loss interval functions from net/dccp/ccids/ccid3.c to net/dccp/ccids/lib/loss_interval.c. Based on a patch by Ian McDonald. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-07-10 22:15:18 -07:00
Ian McDonald	6bc7efe8ef	loss_interval: Fix timeval initialisation When compiling with EXTRA_CFLAGS=-W noticed that tstamp is not initialised correctly in dccp_li_calc_first_li. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>	2007-07-10 22:15:06 -07:00
Ian McDonald	e961811fcd	Fix dccp_sum_coverage When compiling with EXTRA_CFLAGS=-W notice that we have signed/unsigned issue in dccp.h. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>	2007-07-10 22:15:05 -07:00
Ian McDonald	b2f41ff413	ccid3: Update copyrights Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>	2007-07-10 22:15:04 -07:00
Patrick McHardy	07b5b17e15	[VLAN]: Use rtnl_link API Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:03 -07:00
Patrick McHardy	a4bf3af4ac	[VLAN]: Introduce symbolic constants for flag values Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:02 -07:00
Patrick McHardy	b020cb4885	[VLAN]: Keep track of number of QoS mappings Keep track of the number of configured ingress/egress QoS mappings to avoid iteration while calculating the netlink attribute size. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:01 -07:00
Patrick McHardy	734423cf38	[VLAN]: Use 32 bit value for skb->priority mapping skb->priority has only 32 bits and even VLAN uses 32 bit values in its API. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:15:00 -07:00
Patrick McHardy	2ae0bf69b7	[VLAN]: Return proper error codes in register_vlan_device The returned device is unused, return proper error codes instead and avoid having the ioctl handler guess the error. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:14:59 -07:00
Patrick McHardy	e89fe42cd0	[VLAN]: Move device registration to separate function Move device registration and configuration of the underlying device to a separate function. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:14:58 -07:00
Patrick McHardy	c1d3ee9925	[VLAN]: Split up device checks Move the checks of the underlying device to a separate function. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:14:57 -07:00
Patrick McHardy	42429aaee5	[VLAN]: Move vlan_group allocation to separate function Move group allocation to a separate function to clean up the code a bit and allocate groups before registering the device. Device registration is globally visible and causes netlink events, so we shouldn't fail afterwards. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:14:40 -07:00
Patrick McHardy	2f4284a406	[VLAN]: Move some device initialization code to dev->init callback Move some device initialization code to new dev->init callback to make it shareable with netlink. Additionally this fixes a minor bug, dev->iflink is set after registration, which causes an incorrect value in the initial netlink message. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:14:39 -07:00
Patrick McHardy	c17d8874f9	[VLAN]: Convert name-based configuration functions to struct netdevice * Move the device lookup and checks to the ioctl handler under the RTNL and change all name-based interfaces to take a struct net_device * instead. This allows to use them from a netlink interface, which identifies devices based on ifindex not name. It also avoids races between the ioctl interface and the (upcoming) netlink interface since now all changes happen under the RTNL. As a nice side effect this greatly simplifies error handling in the helper functions and fixes a number of incorrect error codes like -EINVAL for device not found. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:14:38 -07:00
Patrick McHardy	38f7b870d4	[RTNETLINK]: Link creation API Add rtnetlink API for creating, changing and deleting software devices. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:14:20 -07:00
Patrick McHardy	0157f60c0c	[RTNETLINK]: Split up rtnl_setlink Split up rtnl_setlink into a function performing validation and a function performing the actual changes. This allows to share the modification logic with rtnl_newlink, which is introduced by the next patch. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:14:16 -07:00
Larry Finger	b3d88ad49a	[MAC80211]: Add support for SIOCGIWRATE ioctl At present, transmission rate information for mac80211 is available only if verbose debugging is turned on, and then only in the logs. This patch implements the SIOCGIWRATE ioctl, which adds the current transmission rate to the output of iwconfig. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:14:07 -07:00
Ville Tervo	8de0a15483	[Bluetooth] Keep rfcomm_dev on the list until it is freed This patch changes the RFCOMM TTY release process so that the TTY is kept on the list until it is really freed. A new device flag is used to keep track of released TTYs. Signed-off-by: Ville Tervo <ville.tervo@nokia.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2007-07-11 07:06:51 +02:00
Herbert Xu	a7ab4b501f	[TCPv4]: Improve BH latency in /proc/net/tcp Currently the code for /proc/net/tcp disable BH while iterating over the entire established hash table. Even though we call cond_resched_softirq for each entry, we still won't process softirq's as regularly as we would otherwise do which results in poor performance when the system is loaded near capacity. This anomaly comes from the 2.4 code where this was all in a single function and the local_bh_disable might have made sense as a small optimisation. The cost of each local_bh_disable is so small when compared against the increased latency in keeping it disabled over a large but mostly empty TCP established hash table that we should just move it to the individual read_lock/read_unlock calls as we do in inet_diag. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:06:20 -07:00
Jamal Hadi Salim	c716a81ab9	[NET_SCHED]: Cleanup readability of qdisc restart Over the years this code has gotten hairier. Resulting in many long discussions over long summer days and patches that get it wrong. This patch helps tame that code so normal people will understand it. Thanks to Thomas Graf, Peter P. Waskiewicz Jr, and Patrick McHardy for their valuable reviews. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:06:16 -07:00
Allan Stephens	05646c9110	[TIPC]: Optimize stream send routine to avoid fragmentation This patch enhances TIPC's stream socket send routine so that it avoids transmitting data in chunks that require fragmentation and reassembly, thereby improving performance at both the sending and receiving ends of the connection. The "maximum packet size" hint that records MTU info allows the socket to decide how big a chunk it should send; in the event that the hint has become stale, fragmentation may still occur, but the data will be passed correctly and the hint will be updated in time for the following send. Note: The 66060 byte pseudo-MTU used for intra-node connections requires the send routine to perform an additional check to ensure it does not exceed TIPC's limit of 66000 bytes of user data per chunk. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Jon Paul Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:06:12 -07:00
Allan Stephens	5eee6a6dc9	[TIPC]: Use standard socket "not implemented" routines This patch modifies TIPC's socket API to utilize existing generic routines to indicate unsupported operations, rather than adding similar TIPC-specific routines. Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Jon Paul Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:06:09 -07:00
Allan Stephens	f3ec75f627	[TIPC]: Improved support for Ethernet traffic filtering This patch simplifies TIPC's Ethernet receive routine to take advantage of information already present in each incoming sk_buff indicating whether the packet was explicitly sent to the interface, has been broadcast to all interfaces, or was picked up because the interface is in promiscuous mode. This new approach also fixes the problem of TIPC accepting unwanted traffic through UML's multicast-based Ethernet interfaces (which deliver traffic in a promiscuous manner even if the interface is not configured to be promiscuous). Signed-off-by: Allan Stephens <allan.stephens@windriver.com> Signed-off-by: Jon Paul Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:06:02 -07:00
David S. Miller	e06e7c6158	[IPV4]: The scheduled removal of multipath cached routing support. With help from Chris Wedgwood. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-10 22:05:57 -07:00
Mikko Rapeli	84950cf0ba	[Bluetooth] Hangup TTY before releasing rfcomm_dev The core problem is that RFCOMM socket layer ioctl can release rfcomm_dev struct while RFCOMM TTY layer is still actively using it. Calling tty_vhangup() is needed for a synchronous hangup before rfcomm_dev is freed. Addresses the oops at http://bugzilla.kernel.org/show_bug.cgi?id=7509 Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2007-07-11 07:01:26 +02:00
Marcel Holtmann	ef222013fc	[Bluetooth] Add hci_recv_fragment() helper function Most drivers must handle fragmented HCI data packets and events. This patch adds a generic function for their reassembly to the Bluetooth core layer and thus allows to shrink the complexity of the drivers. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2007-07-11 06:42:04 +02:00
J. Bruce Fields	d8558f99fb	sunrpc: drop BKL around wrap and unwrap We don't need the BKL when wrapping and unwrapping; and experiments by Avishay Traeger have found that permitting multiple encryption and decryption operations to proceed in parallel can provide significant performance improvements. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Avishay Traeger <atraeger@cs.sunysb.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:50 -04:00
Frank van Maarseveen	d3bc9a1deb	SUNRPC client: add interface for binding to a local address In addition to binding to a local privileged port the NFS client should allow binding to a specific local address. This is used by the server for callbacks. The patch adds the necessary interface. Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:49 -04:00
Frank van Maarseveen	a97476926e	SUNRPC server: record the destination address of a request Save the destination address of an incoming request over TCP like is done already for UDP. It is necessary later for callbacks by the server. Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:49 -04:00
Frank van Maarseveen	96802a0951	SUNRPC: cleanup transport creation argument passing Cleanup argument passing to functions for creating an RPC transport. Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:49 -04:00
Chuck Lever	43780b87fa	SUNRPC: Add a convenient default for the hostname when calling rpc_create() A couple of callers just use a stringified IP address for the rpc client's hostname. Move the logic for constructing this into rpc_create(), so it can be shared. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:46 -04:00
Chuck Lever	45160d6275	SUNRPC: Rename rpcb_getport to be consistent with new rpcb_getport_sync name Clean up, for consistency. Rename rpcb_getport as rpcb_getport_async, to match the naming scheme of rpcb_getport_sync. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:46 -04:00
Chuck Lever	cce63cd637	SUNRPC: Rename rpcb_getport_external routine In preparation for handling NFS mount option parsing in the kernel, rename rpcb_getport_external as rpcb_getport_sync, and make it available always (instead of only when CONFIG_ROOT_NFS is enabled). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:46 -04:00
Chuck Lever	f7fb558e50	SUNRPC: Allow rpcbind requests to be interrupted by a signal. This allows NFS mount requests and RPC re-binding to be interruptible if the server isn't responding. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:45 -04:00
Trond Myklebust	8a702bbb7d	SUNRPC: Suppress some noisy and unnecessary printk() calls in call_verify() Convert them into dprintk() calls. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:38 -04:00
Trond Myklebust	0df7fb74fb	SUNRPC: Ensure RPCSEC_GSS destroys the security context when freeing a cred Do so by setting the gc_proc field to RPC_GSS_PROC_DESTROY, and then sending a NULL RPC call. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:37 -04:00
Trond Myklebust	0285ed1f12	SUNRPC: Ensure that the struct gss_auth lifetime exceeds the credential's Add a refcount in order to ensure that the gss_auth doesn't disappear from underneath us while we're freeing up GSS contexts. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:37 -04:00
Trond Myklebust	1be27f3660	SUNRPC: Remove the tk_auth macro... We should almost always be dereferencing the rpc_auth struct by means of the credential's cr_auth field instead of the rpc_clnt->cl_auth anyway. Fix up that historical mistake, and remove the macro that propagated it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:37 -04:00
Trond Myklebust	1dd17ec693	SUNRPC: Allow rpc_auth to run clean up before the rpc_client is destroyed RPCSEC_GSS needs to be able to send NULL RPC calls to the server in order to free up any remaining GSS contexts. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:36 -04:00
Trond Myklebust	5d28dc8207	SUNRPC: Convert gss_ctx_lock to an RCU lock Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:36 -04:00
Trond Myklebust	f5c2187cfe	SUNRPC: Convert the credential garbage collector into a shrinker callback Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:36 -04:00
Trond Myklebust	9499b4341b	SUNRPC: Give credential cache a local spinlock Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:36 -04:00
Trond Myklebust	31be5bf15f	SUNRPC: Convert the credcache lookup code to use RCU Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:36 -04:00
Trond Myklebust	e092bdcd93	SUNRPC: cleanup rpc credential cache garbage collection Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:35 -04:00
Trond Myklebust	fc432dd907	SUNRPC: Enforce atomic updates of rpc_cred->cr_flags Convert to the use of atomic bitops... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:35 -04:00
Trond Myklebust	696e38df9d	SUNRPC: replace casts in auth_unix.c with container_of() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:35 -04:00
Trond Myklebust	5fe4755e25	SUNRPC: Clean up rpc credential initialisation Add a helper rpc_cred_init() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:35 -04:00
Trond Myklebust	f1c0a86150	SUNRPC: Mark auth and cred operation tables as constant. Also do the same for gss_api operation tables. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:34 -04:00
Trond Myklebust	de7a8ce38a	SUNRPC: Rename rpcauth_destroy() to rpcauth_release() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:34 -04:00
Trond Myklebust	5e1550d6a2	SUNRPC: Add the helper function 'rpc_call_null()' Does a NULL RPC call and returns a pointer to the resulting rpc_task. The call may be either synchronous or asynchronous. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:34 -04:00
Trond Myklebust	64c91a1f1c	SUNRPC: Make rpc_ping() static Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:34 -04:00
Trond Myklebust	fc1b356f56	SUNRPC: Fix races in rpcauth_create See the FIXME: auth_flavors[] really needs a lock and module refcounting. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:34 -04:00
Trond Myklebust	07a2bf1da4	SUNRPC: Fix a memory leak in gss_create() Fix a memory leak in gss_create() whereby the rpc credcache was not being freed if the rpc_mkpipe() call failed. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:33 -04:00
Trond Myklebust	5c9cfc828a	SUNRPC: Fix a typo in unx_create() We want to set the unix_cred_cache.nextgc on the first call to unx_create(), which should be when unix_auth.au_count == 1 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:33 -04:00
Trond Myklebust	3ab9bb7243	SUNRPC: Fix a memory leak in the auth credcache code The leak only affects the RPCSEC_GSS caches, since they are the only ones that are dynamically allocated... Rename the existing rpcauth_free_credcache() to rpcauth_clear_credcache() in order to better describe its role, then add a new function rpcauth_destroy_credcache() that actually frees the cache in addition to clearing it out. Also move the call to destroy the credcache in gss_destroy() to come before the rpc upcall pipe is unlinked. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:33 -04:00
Trond Myklebust	03a1256f06	SUNRPC: Add a field to track the number of kernel users of an rpc_pipe This allows us to correctly deduce when we need to remove the pipe. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:33 -04:00
Trond Myklebust	62e1761cef	SUNRPC: Clean up rpc_pipefs. Add a dentry_ops with a d_delete() method in order to ensure that dentries are removed as soon as the last reference is gone. Clean up rpc_depopulate() so that it only removes files that were created via rpc_populate(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:33 -04:00
Trond Myklebust	34f3089608	SUNRPC: Enable non-exclusive create in rpc_mkpipe() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:32 -04:00
Trond Myklebust	6e84c7b66a	SUNRPC: Add a downcall queue to struct rpc_inode Currently, the downcall queue is tied to the struct gss_auth, which means that different RPCSEC_GSS pseudoflavours must use different upcall pipes. Add a list to struct rpc_inode that can be used instead. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:32 -04:00
Trond Myklebust	3b68aaeaf5	SUNRPC: Always match an upcall message in gss_pipe_downcall() It used to be possible for an rpc.gssd daemon to stuff the RPC credential cache for any rpc client simply by creating RPCSEC_GSS contexts and then doing downcalls. In practice, no daemons ever made use of this feature. Remove this feature now, since it will be impossible to figure out which mechanism a given context actually matches if we enable more than one gss mechanism to use the same upcall pipe. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:31 -04:00
Trond Myklebust	b185f835e2	SUNRPC: Remove the gss_auth spinlock We're just as well off using the inode spinlock instead. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:31 -04:00
Trond Myklebust	4a8c1344dc	SUNRPC: Add a backpointer from the struct rpc_cred to the rpc_auth Cleans up an issue whereby rpcsec_gss uses the rpc_clnt->cl_auth. If we want to be able to add several rpc_auths to a single rpc_clnt, then this abuse must go. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:31 -04:00
Trond Myklebust	c1384c9c4c	SUNRPC: fix hang due to eventd deadlock... Brian Behlendorf writes: The root cause of the NFS hang we were observing appears to be a rare deadlock between the kernel provided usermodehelper API and the linux NFS client. The deadlock can arise because both of these services use the generic linux work queues. The usermodehelper API runs the specified user application in the context of the work queue. And NFS submits both cleanup and reconnect work to the generic work queue for handling. Normally this is fine but a deadlock can result in the following situation. - NFS client is in a disconnected state - [events/0] runs a usermodehelper app with an NFS dependent operation, this triggers an NFS reconnect. - NFS reconnect happens to be submitted to [events/0] work queue. - Deadlock, the [events/0] work queue will never process the reconnect because it is blocked on the previous NFS dependent operation which will not complete. The solution is simply to run reconnect requests on rpciod. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:31 -04:00
Trond Myklebust	6e5b70e9d1	SUNRPC: clean up rpc_call_async/rpc_call_sync/rpc_run_task Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:30 -04:00
Trond Myklebust	188fef11db	SUNRPC: Move rpc_register_client and friends into net/sunrpc/clnt.c Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:30 -04:00
Trond Myklebust	f61534dfd3	SUNRPC: Remove redundant calls to rpciod_up()/rpciod_down() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:30 -04:00
Trond Myklebust	4ada539ed7	SUNRPC: Make create_client() take a reference to the rpciod workqueue Ensures that an rpc_client always has the possibility to send asynchronous RPC calls. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:30 -04:00
Trond Myklebust	ab418d70e1	SUNRPC: Optimise rpciod_up() Instead of taking the mutex every time we just need to increment/decrement rpciod_users, we can optimise by using atomic_inc_not_zero and atomic_dec_and_test. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:30 -04:00
Trond Myklebust	d431a555fc	SUNRPC: Don't create an rpc_pipefs directory before rpc_clone is initialised Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:29 -04:00
Trond Myklebust	4c402b4097	SUNRPC: Remove rpc_clnt->cl_count The kref now does most of what cl_count + cl_user used to do. The only remaining role for cl_count is to tell us if we are in a 'shutdown' phase. We can provide that information using a single bit field instead of a full atomic counter. Also rename rpc_destroy_client() to rpc_close_client(), which reflects better what its role is these days. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:29 -04:00
Trond Myklebust	8ad7c892e1	SUNRPC: Make rpc_clone take a reference instead of using cl_count Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:29 -04:00
Trond Myklebust	90c5755ff5	SUNRPC: Kill rpc_clnt->cl_oneshot Replace it with explicit calls to rpc_shutdown_client() or rpc_destroy_client() (for the case of asynchronous calls). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:29 -04:00
Trond Myklebust	848f1fe6be	SUNRPC: Kill rpc_clnt->cl_dead Its use is at best racy, and there is only one user (lockd), which has additional locking that makes the whole thing redundant. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:29 -04:00
Trond Myklebust	34f52e3591	SUNRPC: Convert rpc_clnt->cl_users to a kref Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:28 -04:00
Trond Myklebust	c44fe70553	SUNRPC: Clean up tk_pid allocation and make it lockless Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:28 -04:00
Trond Myklebust	4bef61ff75	SUNRPC: Add a per-rpc_clnt spinlock Use that to protect the rpc_clnt->cl_tasks list instead of using a global lock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:28 -04:00
Trond Myklebust	6529eba08f	SUNRPC: Move rpc_task->tk_task list into struct rpc_clnt Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:28 -04:00
Trond Myklebust	b39e625b6e	NFSv4: Clean up nfs4_call_async() Use rpc_run_task() instead of doing it ourselves. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:24 -04:00
Linus Torvalds	6ed911fb04	Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (40 commits) bonding/bond_main.c: make 2 functions static ps3: gigabit ethernet driver for PS3, take3 [netdrvr] Fix dependencies for ax88796 ne2k clone driver eHEA: Capability flag for DLPAR support Remove sk98lin ethernet driver. sunhme.c:quattro_pci_find() must be __devinit bonding / ipv6: no addrconf for slaves separately from master atl1: remove write-only var in tx handler macmace: use "unsigned long flags;" Cleanup usbnet_probe() return value handling netxen: deinline and sparse fix eeprom_93cx6: shorten pulse timing to match spec (bis) phylib: Add Marvell 88E1112 phy id phylib: cleanup marvell.c a bit AX88796 network driver IOC3: Switch to pci refcounting safe APIs e100: Fix Tyan motherboard e100 not receiving IPMI commands QE Ethernet driver writes to wrong register to mask interrupts rrunner.c:rr_init() must be __devinit tokenring/3c359.c:xl_init() must be __devinit ...	2007-07-10 14:56:22 -07:00
Jay Vosburgh	c2edacf80e	bonding / ipv6: no addrconf for slaves separately from master At present, when a device is enslaved to bonding, if ipv6 is active then addrconf will be initiated on the slave (because it is closed then opened during the enslavement processing). This causes DAD and RS packets to be sent from the slave. These packets in turn can confuse switches that perform ipv6 snooping, causing them to incorrectly update their forwarding tables (if, e.g., the slave being added is an inactive backup that won't be used right away) and direct traffic away from the active slave to a backup slave (where the incoming packets will be dropped). This patch alters the behavior so that addrconf will only run on the master device itself. I believe this is logically correct, as it prevents slaves from having an IPv6 identity independent from the master. This is consistent with the IPv4 behavior for bonding. This is accomplished by (a) having bonding set IFF_SLAVE sooner in the enslavement processing than currently occurs (before open, not after), and (b) having ipv6 addrconf ignore UP and CHANGE events on slave devices. The eql driver also uses the IFF_SLAVE flag. I inspected eql, and I believe this change is reasonable for its usage of IFF_SLAVE, but I did not test it. Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-07-10 12:41:19 -04:00
Jens Axboe	cf8208d0ea	sendfile: convert nfsd to splice_direct_to_actor() Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:14 +02:00
Akinobu Mita	67c4f7aa9e	[PATCH] softmac: use list_for_each_entry Cleanup using list_for_each_entry. Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Joe Jezak <josejx@gentoo.org> Cc: Daniel Drake <dsd@gentoo.org> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-07-08 22:16:37 -04:00
David Woodhouse	1c39858b5d	Fix use-after-free oops in Bluetooth HID. When cleaning up HIDP sessions, we currently close the ACL connection before deregistering the input device. Closing the ACL connection schedules a workqueue to remove the associated objects from sysfs, but the input device still refers to them -- and if the workqueue happens to run before the input device removal, the kernel will oops when trying to look up PHYSDEVPATH for the removed input device. Fix this by deregistering the input device before closing the connections. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-07 12:22:37 -07:00
Jarek Poplawski	25442cafb8	[NETPOLL]: Fixups for 'fix soft lockup when removing module' From my recent patch: > > #1 > > Until kernel ver. 2.6.21 (including) cancel_rearming_delayed_work() > > required a work function should always (unconditionally) rearm with > > delay > 0 - otherwise it would endlessly loop. This patch replaces > > this function with cancel_delayed_work(). Later kernel versions don't > > require this, so here it's only for uniformity. But Oleg Nesterov <oleg@tv-sign.ru> found: > But 2.6.22 doesn't need this change, why it was merged? > > In fact, I suspect this change adds a race, ... His description was right (thanks), so this patch reverts #1. Signed-off-by: Jarek Poplawski <jarkao2@o2.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-05 17:42:44 -07:00
Adrian Bunk	94b83419e5	[NET]: net/core/netevent.c should #include <net/netevent.h> Every file should include the headers containing the prototypes for its global functions. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-05 17:40:27 -07:00
Jing Min Zhao	25845b5155	[NETFILTER]: nf_conntrack_h323: add checking of out-of-range on choices' index values Choices' index values may be out of range while still encoded in the fixed length bit-field. This bug may cause access to undefined types (NULL pointers) and thus crashes (Reported by Zhongling Wen). This patch also adds checking of decode flag when decoding SEQUENCEs. Signed-off-by: Jing Min Zhao <zhaojingmin@vivecode.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-05 17:40:23 -07:00
Johannes Berg	2cd052e443	[NET] skbuff: remove export of static symbol skb_clone_fraglist is static so it shouldn't be exported. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-05 17:40:19 -07:00
Vlad Yasevich	1669d857a2	SCTP: Add scope_id validation for link-local binds SCTP currently permits users to bind to link-local addresses, but doesn't verify that the scope id specified at bind matches the interface that the address is configured on. It was reported that this can hang a system. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-05 17:40:15 -07:00
Vlad Yasevich	f50f95cab7	SCTP: Check to make sure file is valid before setting timeout In-kernel sockets created with sock_create_kern don't usually have a file and file descriptor allocated to them. As a result, when SCTP tries to check the non-blocking flag, we Oops when dereferencing a NULL file pointer. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-05 17:40:11 -07:00
Vlad Yasevich	3663c30660	SCTP: Fix thinko in sctp_copy_laddrs() Correctly dereference bytes_copied in sctp_copy_laddrs(). I totally must have spaced when doing this. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-07-05 17:40:08 -07:00
Jarek Poplawski	17200811cf	[NETPOLL] netconsole: fix soft lockup when removing module #1 Until kernel ver. 2.6.21 (including) cancel_rearming_delayed_work() required a work function should always (unconditionally) rearm with delay > 0 - otherwise it would endlessly loop. This patch replaces this function with cancel_delayed_work(). Later kernel versions don't require this, so here it's only for uniformity. #2 After deleting a timer in cancel_[rearming_]delayed_work() there could stay a last skb queued in npinfo->txq causing a memory leak after kfree(npinfo). Initial patch & testing by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Jarek Poplawski <jarkao2@o2.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-28 22:11:47 -07:00
David S. Miller	25243633c2	Merge master.kernel.org:/pub/scm/linux/kernel/git/vxy/lksctp-dev	2007-06-28 21:21:43 -07:00
Stephen Hemminger	0db3dc73f7	[NETPOLL]: tx lock deadlock fix If sky2 device poll routine is called from netpoll_send_skb, it would deadlock. The netpoll_send_skb held the netif_tx_lock, and the poll routine could acquire it to clean up skb's. Other drivers might use same locking model. The driver is correct, netpoll should not introduce more locking problems than it causes already. So change the code to drop lock before calling poll handler. Signed-off-by: Stephen Hemminger <shemminger@linux.foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-27 00:39:42 -07:00
Zach Brown	5131a184a3	SCTP: lock_sock_nested in sctp_sock_migrate sctp_sock_migrate() grabs the socket lock on a newly allocated socket while holding the socket lock on an old socket. lockdep worries that this might be a recursive lock attempt. task/3026 is trying to acquire lock: (sk_lock-AF_INET){--..}, at: [<ffffffff88105b8c>] sctp_sock_migrate+0x2e3/0x327 [sctp] but task is already holding lock: (sk_lock-AF_INET){--..}, at: [<ffffffff8810891f>] sctp_accept+0xdf/0x1e3 [sctp] This patch tells lockdep that this locking is safe by using lock_sock_nested(). Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>	2007-06-26 09:29:09 -04:00
Olaf Kirch	5b5a60da28	[NET]: Make skb_seq_read unmap the last fragment Having walked through the entire skbuff, skb_seq_read would leave the last fragment mapped. As a consequence, the unwary caller would leak kmaps, and proceed with preempt_count off by one. The only (kind of non-intuitive) workaround is to use skb_seq_read_abort. This patch makes sure skb_seq_read always unmaps frag_data after having cycled through the skb's paged part. Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-23 23:11:52 -07:00
Shannon Nelson	515e06c455	[NET]: Re-enable irqs before pushing pending DMA requests This moves the local_irq_enable() call in net_rx_action() to before calling the CONFIG_NET_DMA's dma_async_memcpy_issue_pending() rather than after. This shortens the irq disabled window and allows for DMA drivers that need to do their own irq hold. Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-23 23:09:23 -07:00
Jens Axboe	ddb61a57bb	[TCP] tcp_read_sock: Allow recv_actor() to return a negative error value. tcp_read_sock() currently assumes that the recv_actor() only returns number of bytes copied. For network splice receive, we may have to return an error in some cases. So allow the actor to return a negative error value. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-23 23:07:50 -07:00
Florian Westphal	64beb8f3eb	[TIPC]: Fix infinite loop in netlink handler The tipc netlink config handler uses the nlmsg_pid from the request header as destination for its reply. If the application initialized nlmsg_pid to 0, the reply is looped back to the kernel, causing hangup. Fix: use nlmsg_pid of the skb that triggered the request. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-23 22:59:40 -07:00
Patrick McHardy	dbbeb2f991	[SKBUFF]: Fix incorrect config #ifdef around skb_copy_secmark secmark doesn't depend on CONFIG_NET_SCHED. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-23 22:58:34 -07:00
YOSHIFUJI Hideaki	6d5b78cdd5	[IPV6] NDISC: Fix thinko to control Router Preference support. Bug reported by Haruhito Watanabe <haruhito@sfc.keio.ac.jp>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-22 16:07:04 -07:00
Yasuyuki Kozakai	e2d8e314ad	[NETFILTER]: nfctnetlink: Don't allow to change helper There is no realistic situation to change helper (Who wants IRC helper to track FTP traffic ?). Moreover, if we want to do that, we need to fix race issue by nfctnetlink and running helper. That will add overhead to packet processing. It wouldn't pay. So this rejects the request to change helper. The requests to add or remove helper are accepted as ever. Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-22 14:10:22 -07:00
Jerome Borsboom	d258131aae	[NETFILTER]: nf_conntrack_sip: add missing message types containing RTP info Signed-off-by: Jerome Borsboom <j.borsboom@erasmusmc.nl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-22 14:08:17 -07:00
Neil Horman	186e234358	SCTP: Fix sctp_getsockopt_get_peer_addrs This is the split out of the patch that we agreed I should split out from my last patch. It changes space_left to be computed in the same way the to variable is. I know we talked about changing space_left to an int, but I think size_t is more appropriate, since we should never have negative space in our buffer, and computing using offsetof means space_left should now never drop below zero. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>	2007-06-19 09:47:32 -04:00
Neil Horman	408f22e81e	SCTP: update sctp_getsockopt helpers to allow oversized buffers I noted the other day while looking at a bug that was ostensibly in some perl networking library, that we strictly avoid allowing getsockopt operations to complete if we pass in oversized buffers. This seems to make libraries like Perl::NET malfunction since it seems to allocate oversized buffers for use in several operations. It also seems to be out of line with the way udp, tcp and ip getsockopt routines handle buffer input (since the *optlen pointer is both an input and an output and gets set to the length of the data that we copy into the buffer). This patch brings our getsockopt helpers into line with other protocols, and allows us to accept oversized buffers for our getsockopt operations. Tested by me with good results. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>	2007-06-19 09:46:34 -04:00
David Howells	19e6454ca7	[AF_RXRPC]: Return the number of bytes buffered in rxrpc_send_data() Return the number of bytes buffered in rxrpc_send_data(). Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-18 23:30:41 -07:00
Neil Horman	cc0191aeef	[IPVS]: Fix state variable on failure to start ipvs threads ip_vs currently fails to reset its ip_vs_sync_state variable if the sync thread fails to start properly. The result is that the kernel will report a running daemon when there actually is none. If you issue the following commands: 1. ipvsadm --start-daemon master --mcast-interface bla 2. ipvsadm -L --daemon 3. ipvsadm --stop-daemon master Assuming that bla is not an actual interface, step 2 should return no data, but instead returns: $ ipvsadm -L --daemon master sync daemon (mcast=bla, syncid=0) Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-18 22:33:20 -07:00
Patrick McHardy	281216177a	[XFRM]: Fix MTU calculation for non-ESP SAs My IPsec MTU optimization patch introduced a regression in MTU calculation for non-ESP SAs, the SA's header_len needs to be subtracted from the MTU if the transform doesn't provide a ->get_mtu() function. Reported-and-tested-by: Marco Berizzi <pupilla@hotmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-18 22:30:15 -07:00
Adrian Bunk	16c61add51	[RXRPC] net/rxrpc/ar-connection.c: fix NULL dereference This patch fixes a NULL dereference spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-15 15:15:43 -07:00
Ilpo Järvinen	7769f4064c	[TCP]: Fix logic breakage due to DSACK separation Commit `6f74651ae6` is found guilty of breaking DSACK counting, which should be done only for the SACK block reported by the DSACK instead of every SACK block that is received along with DSACK information. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-15 15:14:04 -07:00
Ilpo Järvinen	b9ce204f0a	[TCP]: Congestion control API RTT sampling fix Commit `164891aadf` broke RTT sampling of congestion control modules. Inaccurate timestamps could be fed to them without providing any way for them to identify such cases. Previously RTT sampler was called only if FLAG_RETRANS_DATA_ACKED was not set filtering inaccurate timestamps nicely. In addition, the new behavior could give an invalid timestamp (zero) to RTT sampler if only skbs with TCPCB_RETRANS were ACKed. This solves both problems. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-15 15:08:43 -07:00
David S. Miller	559f0a2857	Merge master.kernel.org:/pub/scm/linux/kernel/git/vxy/lksctp-dev	2007-06-14 13:06:21 -07:00
Herbert Xu	74235a25c6	[IPV6] addrconf: Fix IPv6 on tuntap tunnels The recent patch that added ipv6_hwtype is broken on tuntap tunnels. Indeed, it's broken on any device that does not pass the ipv6_hwtype test. The reason is that the original test only applies to autoconfiguration, not IPv6 support. IPv6 support is allowed on any device. In fact, even with the ipv6_hwtype patch applied you can still add IPv6 addresses to any interface that doesn't pass the ipv6_hwtype test provided that they have a sufficiently large MTU. This is a serious problem because come deregistration time these devices won't be cleaned up properly. I've gone back and looked at the rationale for the patch. It appears that the real problem is that we were creating IPv6 devices even if the MTU was too small. So here's a patch which fixes that and reverts the ipv6_hwtype stuff. Thanks to Kanru Chen for reporting this issue. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-14 13:02:55 -07:00
Ilpo Järvinen	d7ea5b91fa	[TCP]: Add missing break to TCP option parsing code This flaw does not affect any behavior (currently). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-14 12:58:26 -07:00
Vlad Yasevich	06ad391919	[SCTP] Don't disable PMTU discovery when mtu is small Right now, when we receive an mtu estimate smaller than the minimum threshold in the ICMP message, we disable the path mtu discovery on the transport. This leads to the never increasing sctp fragmentation point even when the real path mtu has increased. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>	2007-06-13 20:44:42 +00:00
Vlad Yasevich	8a4794914f	[SCTP] Flag a pmtu change request Currently, if the socket is owned by the user, we drop the ICMP message. As a result, SCTP forgets that the path MTU changed and never adjusts its estimate. This causes all subsequent packets to be fragmented. With this patch, we'll flag the association that it needs to update its estimate based on the already updated routing information. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Acked-by: Sridhar Samudrala <sri@us.ibm.com>	2007-06-13 20:44:42 +00:00
Vlad Yasevich	c910b47e18	[SCTP] Update pmtu handling to be similar to tcp Introduce new function sctp_transport_update_pmtu that updates the transports and destination caches view of the path mtu. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Acked-by: Sridhar Samudrala <sri@us.ibm.com>	2007-06-13 20:44:42 +00:00
Vlad Yasevich	fe979ac169	[SCTP] Fix leak in sctp_getsockopt_local_addrs when copy_to_user fails If the copy_to_user or copy_user calls fail in sctp_getsockopt_local_addrs(), the function should free locally allocated storage before returning error. Spotted by Coverity. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Acked-by: Sridhar Samudrala <sri@us.ibm.com>	2007-06-13 20:44:41 +00:00
Vlad Yasevich	8b35805693	[SCTP]: Allow unspecified port in sctp_bindx() Allow sctp_bindx() to accept multiple addresses with unspecified port. In this case, all addresses inherit the first bound port. We still catch full mis-matches. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Acked-by: Sridhar Samudrala <sri@us.ibm.com>	2007-06-13 20:44:41 +00:00
Vlad Yasevich	d570ee490f	[SCTP]: Correctly set daddr for IPv6 sockets during peeloff During peeloff of AF_INET6 socket, the inet6_sk(sk)->daddr wasn't set correctly since the code was assuming IPv4 only. Now we use a correct call to set the destination address. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Acked-by: Sridhar Samudrala <sri@us.ibm.com>	2007-06-13 20:44:41 +00:00
David S. Miller	66e1e3b20c	[TCP]: Set initial_ssthresh default to zero in Cubic and BIC. Because of the current default of 100, Cubic and BIC perform very poorly compared to standard Reno. In the worst case, this change makes Cubic and BIC as aggressive as Reno. So this change should be very safe. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-13 01:03:53 -07:00
Ilpo Järvinen	af15cc7b85	[TCP]: Fix left_out setting during FRTO Without FRTO, the tcp_try_to_open is never called with lost_out > 0 (see tcp_time_to_recover). However, when FRTO is enabled, the !tp->lost_out condition is not used until end of FRTO because that way TCP avoids premature entry to fast recovery during FRTO. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-12 16:16:44 -07:00
David S. Miller	3d7dbeac58	[TCP]: Disable TSO if MD5SIG is enabled. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-12 14:36:42 -07:00
David S. Miller	9cadcd28f0	Merge branch 'mac80211-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6	2007-06-12 14:12:49 -07:00
Mattias Nissler	14042cbefc	[PATCH] mac80211: Don't stop tx queue on master device while scanning. mac80211 stops the tx queues during scans. This is wrong with respect to the master device tx queue, since stopping it prevents any probes from being sent during the scan. Instead, they accumulate in the queue and are only sent after the scan is finished, which is obviously wrong. Signed-off-by: Mattias Nissler <mattias.nissler@gmx.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-06-11 20:29:11 -04:00
Johannes Berg	0107136c04	[PATCH] mac80211: fix debugfs tx power reduction output This patch fixes a typo in mac80211's debugfs.c. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-06-11 17:47:48 -04:00
David Lamparter	c9aca9da02	[PATCH] cfg80211: fix signed macaddress in sysfs Fix signedness mixup making mac addresses show up strangely (like 00:11:22:33:44:ffffffaa) in /sys/class/ieee80211/*/macaddress. Signed-off-by: David Lamparter <equinox@diac24.net> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2007-06-11 17:47:41 -04:00
G. Liakhovetski	b7e773b869	[IrDA]: f-timer reloading when sending rejected frames. Jean II was right: you have to re-charge the final timer when resending rejected frames. Otherwise it triggers at a wrong time and can break the currently running communication. Reproducible under rt-preempt. Signed-off-by: G. Liakhovetski <gl@dsa-ac.de> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-08 19:15:56 -07:00
G. Liakhovetski	c0cfe7faa1	[IrDA]: Fix Rx/Tx path race. From: G. Liakhovetski <gl@dsa-ac.de> We need to switch to NRM _before_ sending the final packet otherwise we might hit a race condition where we get the first packet from the peer while we're still in LAP_XMIT_P. Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-08 19:15:17 -07:00
Paul Moore	50e5d35ce2	[CIPSO]: Fix several unaligned kernel accesses in the CIPSO engine. IPv4 options are not very well aligned within the packet and the format of a CIPSO option is even worse. The result is that the CIPSO engine in the kernel does a few unaligned accesses when parsing and validating incoming packets with CIPSO options attached which generate error messages on certain alignment sensitive platforms. This patch fixes this by marking these unaligned accesses with the get_unaligned() macro. Signed-off-by: Paul Moore <paul.moore@hp.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-08 13:33:10 -07:00
Paul Moore	ba6ff9f2b5	[NetLabel]: consolidate the struct socket/sock handling to just struct sock The current NetLabel code has some redundant APIs which allow both "struct socket" and "struct sock" types to be used; this may have made sense at some point but it is wasteful now. Remove the functions that operate on sockets and convert the callers. Not only does this make the code smaller and more consistent but it pushes the locking burden up to the caller which can be more intelligent about the locks. Also, perform the same conversion (socket to sock) on the SELinux/NetLabel glue code where it makes sense. Signed-off-by: Paul Moore <paul.moore@hp.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-08 13:33:09 -07:00
Herbert Xu	6363097cc4	[IPV4]: Do not remove idev when addresses are cleared Now that we create idev before addresses are added, it no longer makes sense to remove them when addresses are all deleted. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-08 13:33:08 -07:00
Joy Latten	4aa2e62c45	xfrm: Add security check before flushing SAD/SPD Currently we check for permission before deleting entries from SAD and SPD, (see security_xfrm_policy_delete() security_xfrm_state_delete()) However we are not checking for authorization when flushing the SPD and the SAD completely. It was perhaps missed in the original security hooks patch. This patch adds a security check when flushing entries from the SAD and SPD. It runs the entire database and checks each entry for a denial. If the process attempting the flush is unable to remove all of the entries a denial is logged and the flush function returns an error without removing anything. This is particularly useful when a process may need to create or delete its own xfrm entries used for things like labeled networking but that same process should not be able to delete other entries or flush the entire database. Signed-off-by: Joy Latten <latten@austin.ibm.com> Signed-off-by: Eric Paris <eparis@parisplace.org> Signed-off-by: James Morris <jmorris@namei.org>	2007-06-07 13:42:46 -07:00
Patrick McHardy	b00b4bf94e	[NET_SCHED]: Fix filter double free cbq and atm destroy their filters twice when destroying inner classes during qdisc destruction. Reported-and-tested-by: Strobl Anton <a.strobl@aws-it.at> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:41:05 -07:00
Thomas Graf	7c355f532d	[NET]: Avoid duplicate netlink notification when changing link state When the link state is changed from userspace without affecting any other flags, two duplicate notifications are sent: once as an action in the NETDEV_UP/NETDEV_DOWN notification chain and a second time when comparing old and new device flags after the change has been completed. Although harmless, the duplicates should be avoided. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:40:56 -07:00
David S. Miller	df2bc459a3	[UDP]: Revert 2-pass hashing changes. This reverts changesets: `6aaf47fa48` `b7b5f487ab` `de34ed91c4` `fc038410b4` There are still some correctness issues recently discovered which do not have a known fix that doesn't involve doing a full hash table scan on port bind. So revert for now. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:40:50 -07:00
Miklos Szeredi	3c0d2f3780	[AF_UNIX]: Fix stream recvmsg() race. A recv() on an AF_UNIX, SOCK_STREAM socket can race with a send()+close() on the peer, causing recv() to return zero, even though the sent data should be received. This happens if the send() and the close() are performed between skb_dequeue() and checking sk->sk_shutdown in unix_stream_recvmsg(): process A: skb_dequeue() returns NULL, there's no data in the socket queue; process B: new data is inserted onto the queue by unix_stream_sendmsg(); process B: sk->sk_shutdown is set to SHUTDOWN_MASK by unix_release_sock(); process A: sk->sk_shutdown is checked, unix_stream_recvmsg() returns zero. I'm surprised nobody noticed this, it's not hard to trigger. Maybe it's just (un)luck with the timing. It's possible to work around this bug in userspace, by retrying the recv() once in case of a zero return value. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:40:44 -07:00
Akinobu Mita	c764c9ade6	[NETFILTER]: nf_conntrack_amanda: fix textsearch_prepare() error check The return value from textsearch_prepare() needs to be checked by IS_ERR(), because it returns the error code as a pointer. Cc: "Brian J. Murrell" <netfilter@interlinx.bc.ca> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:40:38 -07:00
Dmitry Mishin	4c1b52bc7a	[NETFILTER]: ip_tables: fix compat related crash check_compat_entry_size_and_hooks iterates over the matches and calls compat_check_calc_match, which loads the match and calculates the compat offsets, but unlike the non-compat version, doesn't call ->checkentry yet. On error however it calls cleanup_matches, which in turn calls ->destroy, which can result in crashes if the destroy function (validly) expects to only get called after the checkentry function. Add a compat_release_match function that only drops the module reference on error and rename compat_check_calc_match to compat_find_calc_match to reflect the fact that it doesn't call the checkentry function. Reported by Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Dmitry Mishin <dim@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:40:32 -07:00
Patrick McHardy	3c158f7f57	[NETFILTER]: nf_conntrack: fix helper module unload races When a helper module is unloaded all conntracks referring to it have their helper pointer NULLed out, leading to lots of races. In most places this can be fixed by proper use of RCU (they do already check for != NULL, but in a racy way), additionally nf_conntrack_expect_related needs to bail out when no helper is present. Also remove two paranoid BUG_ONs in nf_conntrack_proto_gre that are racy and not worth fixing. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:40:26 -07:00
Patrick McHardy	51055be81c	[RTNETLINK]: ifindex 0 does not exist ifindex == 0 does not exist and implies we should do a lookup by name if one was given. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:40:11 -07:00
Patrick McHardy	ef7c79ed64	[NETLINK]: Mark netlink policies const Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:40:10 -07:00
David S. Miller	14a49e1fd2	[TCP] tcp_probe: Attach printf attribute properly to printl(). GCC doesn't like the way Stephen initially did it: net/ipv4/tcp_probe.c:83: warning: empty declaration Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:40:09 -07:00
Eric Dumazet	274707cff9	[TCP]: Use LIMIT_NETDEBUG in tcp_retransmit_timer(). LIMIT_NETDEBUG allows the admin to disable some warning messages (echo 0 >/proc/sys/net/core/warnings). The "TCP: Treason uncloaked!" message can use this facility. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:40:08 -07:00
Denis Cheng	c4b1010f40	[NET]: Merge dst_discard_in and dst_discard_out. Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:39:46 -07:00
Herbert Xu	71e27da961	[IPV4]: Restore old behaviour of default config values Previously inet devices were only constructed when addresses are added (or rarely in ipmr). Therefore the default config values they get are the ones at the time of these operations. Now that we're creating inet devices earlier, this changes the behaviour of default config values in an incompatible way (see bug #8519). This patch creates a compromise by setting the default values at the same point as before but only for those that have not been explicitly set by the user since the inet device's creation. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:39:26 -07:00
Herbert Xu	31be308541	[IPV4]: Add default config support after inetdev_init Previously once inetdev_init has been called on a device any changes made to ipv4_devconf_dflt would have no effect on that device's configuration. This creates a problem since we have moved the point where inetdev_init is called from when an address is added to where the device is registered. This patch is the first half of a set that tries to mimic the old behaviour while still calling inetdev_init. It propagates any changes to ipv4_devconf_dflt to those devices that have not had the corresponding attribute set. The next patch will forcibly set all values at the point where inetdev_init was previously called. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:39:19 -07:00
Herbert Xu	42f811b8bc	[IPV4]: Convert IPv4 devconf to an array This patch converts the ipv4_devconf config members (everything except sysctl) to an array. This allows easier manipulation which will be needed later on to provide better management of default config values. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:39:13 -07:00
Herbert Xu	8d76527e72	[IPV4]: Only panic if inetdev_init fails for loopback When I made the inetdev_init call work on all devices I incorrectly left in the panic call as well. It is obviously undesirable to panic on an allocation failure for a normal network device. This patch moves the panic call under the loopback if clause. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:39:03 -07:00
Patrick McHardy	f0e48dbfc5	[TCP]: Honour sk_bound_dev_if in tcp_v4_send_ack A time_wait socket inherits sk_bound_dev_if from the original socket, but it is not used when sending ACK packets using ip_send_reply. Fix by passing the oif to ip_send_reply in struct ip_reply_arg and use it for output routing. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-07 13:38:51 -07:00
Patrick McHardy	6e1d91039b	[ICMP]: Fix icmp_errors_use_inbound_ifaddr sysctl Currently when icmp_errors_use_inbound_ifaddr is set and an ICMP error is sent after the packet passed through ip_output(), an address from the outgoing interface is chosen as ICMP source address since skb->dev doesn't point to the incoming interface anymore. Fix this by doing an interface lookup on rt->fl.iif and using that device. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-03 18:08:51 -07:00
Wei Dong	584bdf8cbd	[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP Signed-off-by: Wei Dong <weidong@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-03 18:08:50 -07:00
Herbert Xu	4fcd6b9916	[NET] gso: Fix GSO feature mask in sk_setup_caps This isn't a bug just yet as only TCP uses sk_setup_caps for GSO. However, if and when UDP or something else starts using it this is likely to cause a problem if we forget to add software emulation for it at the same time. The problem is that right now we translate GSO emulation to the bitmask NETIF_F_GSO_MASK, which includes every protocol, even ones that we cannot emulate. This patch makes it provide only the ones that we can emulate. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-03 18:08:49 -07:00
Ilpo Järvinen	6418204f91	[TCP]: Fix GSO ignorance of pkts_acked arg (cong.cntrl modules) The code used to ignore GSO completely, passing either way too small or zero pkts_acked when GSO skb or part of it got ACKed. In addition, there is no need to calculate the value in the loop but simple arithmetics after the loop is sufficient. There is no need to handle SYN case specially because congestion control modules are not yet initialized when FLAG_SYN_ACKED is set. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-03 18:08:48 -07:00
Bill Nottingham	75202e7689	[NET]: Fix comparisons of unsigned < 0. Recent gcc versions emit warnings when unsigned variables are compared < 0 or >= 0. Signed-off-by: Bill Nottingham <notting@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-03 18:08:47 -07:00
Venkatesh Pallipadi	60468d5b5b	[NET]: Make net watchdog timers 1 sec jiffy aligned. round_jiffies for net dev watchdog timer. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-03 18:08:46 -07:00
Mark Glines	3f196eb519	[TCP]: Use default 32768-61000 outgoing port range in all cases. This diff changes the default port range used for outgoing connections, from "use 32768-61000 in most cases, but use N-4999 on small boxes (where N is a multiple of 1024, depending on just how small the box is)" to just "use 32768-61000 in all cases". I don't believe there are any drawbacks to this change, and it keeps outgoing connection ports farther away from the mess of IANA-registered ports. Signed-off-by: Mark Glines <mark@glines.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-03 18:08:43 -07:00
David S. Miller	278a3de5ab	[AF_UNIX]: Fix datagram connect race causing an OOPS. Based upon an excellent bug report and initial patch by Frederik Deweerdt. The UNIX datagram connect code blindly dereferences other->sk_socket via the call down to the security_unix_may_send() function. Without locking 'other' that pointer can go NULL via unix_release_sock() which does sock_orphan() which also marks the socket SOCK_DEAD. So we have to lock both 'sk' and 'other' yet avoid all kinds of potential deadlocks (connect to self is OK for datagram sockets and it is possible for two datagram sockets to perform a simultaneous connect to each other). So what we do is have a "double lock" function similar to how we handle this situation in other areas of the kernel. We take the lock of the socket pointer with the smallest address first in order to avoid ABBA style deadlocks. Once we have them both locked, we check to see if SOCK_DEAD is set for 'other' and if so, drop everything and retry the lookup. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-03 18:08:42 -07:00
David S. Miller	1c92b4e50e	[AF_UNIX]: Make socket locking much less confusing. The unix_state_*() locking macros imply that there is some rwlock kind of thing going on, but the implementation is actually a spinlock which makes the code more confusing than it needs to be. So use plain unix_state_lock and unix_state_unlock. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-06-03 18:08:40 -07:00
Stephen Hemminger	d2d1acdb6a	VLAN: kill_vid is only useful for VLAN filtering devices The interface for network device VLAN extension was confusing. The kill_vid function is only really useful for devices that do hardware filtering. Devices that only do VLAN reception without filtering were being forced to provide the hook, and there were bugs in those devices. Many drivers had a kill_vid routine that called vlan_group_set_device with NULL, but that is done already. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-06-03 11:44:19 -04:00
David S. Miller	1acf6ba085	Merge branch 'mac80211' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6	2007-05-31 01:23:58 -07:00
Stephen Hemminger	9a834b87c5	[BRIDGE]: Round off STP periodic timers. Periodic STP timers don't have to be exact. The hold timer runs at 1HZ, and the hello timer normally runs at 2HZ; save power by aligning them to the next second. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-31 01:23:39 -07:00
Baruch Even	071f772268	[BRIDGE]: Reduce frequency of forwarding cleanup timer in bridge. The bridge cleanup timer is fired 10 times a second for timers that are at least 15 seconds ahead in time and that are not critical to be cleaned asap. This patch calculates the next time to run the timer as the minimum of all timers or a minimum based on the current state. Signed-off-by: Baruch Even <baruch@ev-en.org> Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-31 01:23:38 -07:00
Stephen Hemminger	67403754bc	[TCP] tcp_probe: use GCC printf attribute The function in tcp_probe is printf like, use GCC to check the args. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-31 01:23:37 -07:00
Sangtae Ha	63313494c4	[TCP] tcp_probe: a trivial fix for mismatched number of printl arguments. Just a fix to correct the number of printl arguments. Now, srtt is logging correctly. Signed-off-by: Sangtae Ha <sangtae.ha@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-31 01:23:36 -07:00
Pavel Emelianov	e4fd5da39f	[TCP]: Consolidate checking for tcp orphan count being too big. tcp_out_of_resources() and tcp_close() perform the same checking of number of orphan sockets. Move this code into common place. Signed-off-by: Pavel Emelianov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-31 01:23:34 -07:00
David S. Miller	be02097cf6	[AF_PACKET]: Kill CONFIG_PACKET_SOCKET. Always set, by af_packet.c, not by the Kconfig subsystem, so just get rid of it. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-31 01:23:32 -07:00
David S. Miller	8c7fc03e27	[IPV6]: Fix build warning. net/ipv6/ip6_fib.c: In function ‘fib6_add_rt2node’: net/ipv6/ip6_fib.c:661: warning: label ‘out’ defined but not used Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-31 01:23:31 -07:00
David S. Miller	a2efcfa048	[AF_PACKET]: Kill bogus CONFIG_PACKET_MULTICAST It is unconditionally set by af_packet.c, not by the Kconfig subsystem, so just kill it off. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-31 01:23:30 -07:00
David S. Miller	ddc31ce311	[IPV4]: Kill references to bogus non-existent CONFIG_IP_NOSIOCRT Signed-off-by: David S. Miller <davem@davemloft.net>	2007-05-31 01:23:29 -07:00
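
The "double lock" idea described in 278a3de5ab above (take the lock with the smallest address first so that two peers locking each other cannot deadlock ABBA-style) can be illustrated outside the kernel. The following is only a userspace sketch in Python, not the kernel code: `double_lock`, `double_unlock`, and `demo` are hypothetical names, `id()` stands in for the socket pointer address, and the SOCK_DEAD recheck-and-retry step from the commit message is left out.

```python
import threading


def double_lock(a: threading.Lock, b: threading.Lock) -> None:
    """Acquire both locks in one global order (smallest id() first)."""
    if a is b:
        # Connect-to-self case: there is only one lock, so take it once.
        a.acquire()
        return
    first, second = (a, b) if id(a) < id(b) else (b, a)
    first.acquire()
    second.acquire()


def double_unlock(a: threading.Lock, b: threading.Lock) -> None:
    """Release what double_lock() acquired; release order does not matter."""
    a.release()
    if a is not b:
        b.release()


def demo(rounds: int = 1000) -> int:
    """Two threads lock the same pair in opposite argument order."""
    x, y = threading.Lock(), threading.Lock()
    total = 0

    def worker(first: threading.Lock, second: threading.Lock) -> None:
        nonlocal total
        for _ in range(rounds):
            double_lock(first, second)
            try:
                total += 1  # protected by holding both locks
            finally:
                double_unlock(first, second)

    threads = [
        threading.Thread(target=worker, args=(x, y)),
        threading.Thread(target=worker, args=(y, x)),  # reversed order
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


if __name__ == "__main__":
    print(demo())
```

Because both workers end up acquiring the pair in the same underlying order regardless of how the arguments are passed, neither can hold one lock while waiting for the other in the opposite order.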

5273 Commits