linux

Author	SHA1	Message	Date
Johannes Berg	552bff0c2f	cfg80211: constify name parameter to add_virtual_intf The name can't be modified by the driver, make it const. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-19 09:32:59 +02:00
Johan Hedberg	92a25256f1	Bluetooth: mgmt: Implement support for passkey notification This patch adds support for Secure Simple Pairing with devices that have KeyboardOnly as their IO capability. Such devices will cause a passkey notification on our side and optionally also keypress notifications. Without this patch some keyboards cannot be paired using the mgmt interface. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Cc: stable@vger.kernel.org Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-18 22:27:29 -03:00
Nicolas Dichtel	6f3118b571	ipv6: use net->rt_genid to check dst validity IPv6 dst should take care of rt_genid too. When a xfrm policy is inserted or deleted, all dst should be invalidated. To force the validation, dst entries should be created with ->obsolete set to DST_OBSOLETE_FORCE_CHK. This was already the case for all functions calling ip6_dst_alloc(), except for ip6_rt_copy(). As a consequence, we can remove the specific code in inet6_connection_sock. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 15:57:03 -04:00
Nicolas Dichtel	b42664f898	netns: move net->ipv4.rt_genid to net->rt_genid This commit prepares the use of rt_genid by both IPv4 and IPv6. Initialization is left in IPv4 part. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 15:57:03 -04:00
Nicolas Dichtel	bafa6d9d89	ipv4/route: arg delay is useless in rt_cache_flush() Since route cache deletion (`89aef8921b`), delay is no more used. Remove it. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-18 15:44:34 -04:00
Pandiyarajan Pitchaimuthu	ed44a951c7	cfg80211/nl80211: Notify connection request failure in AP mode In AP mode, when a station requests connection to an AP and if the request is failed for particular reason, userspace is notified about the failure through NL80211_CMD_CONN_FAILED command. Reason for the failure is sent through the attribute NL80211_ATTR_CONN_FAILED_REASON. Signed-off-by: Pandiyarajan Pitchaimuthu <c_ppitch@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-18 19:54:06 +02:00
Arend van Spriel	30d08a46ea	cfg80211: remove obsolete comment for .sched_scan_stop() callback The kerneldoc comment for .sched_scan_stop() callback describes a driver_initiated flag, but the interface does not hold such a flag. Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com> Reviewed-by: Hante Meuleman <meuleman@broadcom.com> Signed-off-by: Arend van Spriel <arend@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-18 19:54:05 +02:00
Eric W. Biederman	e1760bd5ff	userns: Convert the audit loginuid to be a kuid Always store audit loginuids in type kuid_t. Print loginuids by converting them into uids in the appropriate user namespace, and then printing the resulting uid. Modify audit_get_loginuid to return a kuid_t. Modify audit_set_loginuid to take a kuid_t. Modify /proc/<pid>/loginuid on read to convert the loginuid into the user namespace of the opener of the file. Modify /proc/<pid>/loginud on write to convert the loginuid rom the user namespace of the opener of the file. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Cc: Paul Moore <paul@paul-moore.com> ? Cc: David Miller <davem@davemloft.net> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-17 18:08:54 -07:00
Chuck Lever	35c448a8a3	include/net/sock.h: squelch compiler warning in sk_rmem_schedule() This warning: In file included from linux/include/linux/tcp.h:227:0, from linux/include/linux/ipv6.h:221, from linux/include/net/ipv6.h:16, from linux/include/linux/sunrpc/clnt.h:26, from linux/net/sunrpc/stats.c:22: linux/include/net/sock.h: In function `sk_rmem_schedule': linux/nfs-2.6/include/net/sock.h:1339:13: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] is seen with gcc (GCC) 4.6.3 20120306 (Red Hat 4.6.3-2) using the -Wextra option. Commit `c76562b670` ("netvm: prevent a stream-specific deadlock") accidentally replaced the "size" parameter of sk_rmem_schedule() with an unsigned int. This changes the semantics of the comparison in the return statement. In sk_wmem_schedule we have syntactically the same comparison, but "size" is a signed integer. In addition, __sk_mem_schedule() takes a signed integer for its "size" parameter, so there is an implicit type conversion in sk_rmem_schedule() anyway. Revert the "size" parameter back to a signed integer so that the semantics of the expressions in both sk_[rw]mem_schedule() are exactly the same. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: David Miller <davem@davemloft.net> Cc: Joonsoo Kim <js1304@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-09-17 15:00:38 -07:00
David S. Miller	b4516a288e	llc: Remove stray reference to sysctl_llc_station_ack_timeout. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 13:13:24 -04:00
David S. Miller	ba01dfe182	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== This is another batch of updates intended for the 3.7 stream. There are not a lot of large items, but iwlwifi, mwifiex, rt2x00, ath9k, and brcmfmac all get some attention. Wei Yongjun also provides a series of small maintenance fixes. This also includes a pull of the wireless tree in order to satisfy some prerequisites for later patches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-17 00:57:32 -04:00
Greg Kroah-Hartman	7ac3c93e5d	Merge 3.6-rc6 into tty-next This pulls in the fixes in 3.6-rc6 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-09-16 17:31:36 -07:00
David S. Miller	b48b63a1f6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/netfilter/nfnetlink_log.c net/netfilter/xt_LOG.c Rather easy conflict resolution, the 'net' tree had bug fixes to make sure we checked if a socket is a time-wait one or not and elide the logging code if so. Whereas on the 'net-next' side we are calculating the UID and GID from the creds using different interfaces due to the user namespace changes from Eric Biederman. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-15 11:43:53 -04:00
John W. Linville	9316f0e3c6	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2012-09-14 13:53:49 -04:00
Daniel Wagner	8a8e04df47	cgroup: Assign subsystem IDs during compile time WARNING: With this change it is impossible to load external built controllers anymore. In case where CONFIG_NETPRIO_CGROUP=m and CONFIG_NET_CLS_CGROUP=m is set, corresponding subsys_id should also be a constant. Up to now, net_prio_subsys_id and net_cls_subsys_id would be of the type int and the value would be assigned during runtime. By switching the macro definition IS_SUBSYS_ENABLED from IS_BUILTIN to IS_ENABLED, all _subsys_id will have constant value. That means we need to remove all the code which assumes a value can be assigned to net_prio_subsys_id and net_cls_subsys_id. A close look is necessary on the RCU part which was introduces by following patch: commit `f845172531` Author: Herbert Xu <herbert@gondor.apana.org.au> Mon May 24 09:12:34 2010 Committer: David S. Miller <davem@davemloft.net> Mon May 24 09:12:34 2010 cls_cgroup: Store classid in struct sock Tis code was added to init_cgroup_cls() / We can't use rcu_assign_pointer because this is an int. */ smp_wmb(); net_cls_subsys_id = net_cls_subsys.subsys_id; respectively to exit_cgroup_cls() net_cls_subsys_id = -1; synchronize_rcu(); and in module version of task_cls_classid() rcu_read_lock(); id = rcu_dereference(net_cls_subsys_id); if (id >= 0) classid = container_of(task_subsys_state(p, id), struct cgroup_cls_state, css)->classid; rcu_read_unlock(); Without an explicit explaination why the RCU part is needed. (The rcu_deference was fixed by exchanging it to rcu_derefence_index_check() in a later commit, but that is a minor detail.) So here is my pondering why it was introduced and why it safe to remove it now. Note that this code was copied over to net_prio the reasoning holds for that subsystem too. The idea behind the RCU use for net_cls_subsys_id is to make sure we get a valid pointer back from task_subsys_state(). task_subsys_state() is just blindly accessing the subsys array and returning the pointer. Obviously, passing in -1 as id into task_subsys_state() returns an invalid value (out of lower bound). So this code makes sure that only after module is loaded and the subsystem registered, the id is assigned. Before unregistering the module all old readers must have left the critical section. This is done by assigning -1 to the id and issuing a synchronized_rcu(). Any new readers wont call task_subsys_state() anymore and therefore it is safe to unregister the subsystem. The new code relies on the same trick, but it looks at the subsys pointer return by task_subsys_state() (remember the id is constant and therefore we allways have a valid index into the subsys array). No precautions need to be taken during module loading module. Eventually, all CPUs will get a valid pointer back from task_subsys_state() because rebind_subsystem() which is called after the module init() function will assigned subsys[net_cls_subsys_id] the newly loaded module subsystem pointer. When the subsystem is about to be removed, rebind_subsystem() will called before the module exit() function. In this case, rebind_subsys() will assign subsys[net_cls_subsys_id] a NULL pointer and then it calls synchronize_rcu(). All old readers have left by then the critical section. Any new reader wont access the subsystem anymore. At this point we are safe to unregister the subsystem. No synchronize_rcu() call is needed. Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Cc: Glauber Costa <glommer@parallels.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: netdev@vger.kernel.org Cc: cgroups@vger.kernel.org	2012-09-14 09:57:43 -07:00
Daniel Wagner	51e4e7faba	cgroup: net_prio: Do not define task_netpioidx() when not selected task_netprioidx() should not be defined in case the configuration is CONFIG_NETPRIO_CGROUP=n. The reason is that in a following patch the net_prio_subsys_id will only be defined if CONFIG_NETPRIO_CGROUP!=n. When net_prio is not built at all any callee should only get an empty task_netprioidx() without any references to net_prio_subsys_id. Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.r.fastabend@intel.com> Cc: netdev@vger.kernel.org Cc: cgroups@vger.kernel.org	2012-09-14 09:57:28 -07:00
Daniel Wagner	8fb974c937	cgroup: net_cls: Do not define task_cls_classid() when not selected task_cls_classid() should not be defined in case the configuration is CONFIG_NET_CLS_CGROUP=n. The reason is that in a following patch the net_cls_subsys_id will only be defined if CONFIG_NET_CLS_CGROUP!=n. When net_cls is not built at all a callee should only get an empty task_cls_classid() without any references to net_cls_subsys_id. Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.r.fastabend@intel.com> Cc: netdev@vger.kernel.org Cc: cgroups@vger.kernel.org	2012-09-14 09:57:25 -07:00
Daniel Wagner	f341980771	cgroup: net_cls: Move sock_update_classid() declaration to cls_cgroup.h The only user of sock_update_classid() is net/socket.c which happens to include cls_cgroup.h directly. tj: Fix build breakage due to missing cls_cgroup.h inclusion in drivers/net/tun.c reported in linux-next by Stephen. Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Gao feng <gaofeng@cn.fujitsu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.r.fastabend@intel.com> Cc: netdev@vger.kernel.org Cc: cgroups@vger.kernel.org	2012-09-14 09:55:57 -07:00
Eric W. Biederman	15e473046c	netlink: Rename pid to portid to avoid confusion It is a frequent mistake to confuse the netlink port identifier with a process identifier. Try to reduce this confusion by renaming fields that hold port identifiers portid instead of pid. I have carefully avoided changing the structures exported to userspace to avoid changing the userspace API. I have successfully built an allyesconfig kernel with this change. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-10 15:30:41 -04:00
Johannes Berg	e548c49e6d	mac80211: add key flag for management keys Mark keys that might be used to receive management frames so drivers can fall back on software crypto for them if they don't support hardware offload. As the new flag is only set correctly for RX keys and the existing IEEE80211_KEY_FLAG_SW_MGMT flag can only affect TX, also rename the latter to IEEE80211_KEY_FLAG_SW_MGMT_TX. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-10 11:29:17 +02:00
Andrei Emeltchenko	376261ae36	Bluetooth: debug: Print refcnt for hci_dev Add debug output for HCI kref. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-08 18:06:10 -03:00
Andrei Emeltchenko	9472007c62	Bluetooth: trivial: Make hci_chan_del return void Return code is not needed in hci_chan_del Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-08 17:27:18 -03:00
Andrei Emeltchenko	6b536b5e5e	Bluetooth: Remove unneeded zero init hdev is allocated with kzalloc so zero initialization is not needed. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-09-08 16:53:48 -03:00
John W. Linville	fac805f8c1	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless	2012-09-07 15:07:55 -04:00
John W. Linville	4a3e12fd7a	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next	2012-09-07 14:49:46 -04:00
Nicolas Dichtel	4ccfe6d410	ipv4/route: arg delay is useless in rt_cache_flush() Since route cache deletion (`89aef8921b`), delay is no more used. Remove it. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-07 14:44:08 -04:00
Eric W. Biederman	dbe9a4173e	scm: Don't use struct ucred in NETLINK_CB and struct scm_cookie. Passing uids and gids on NETLINK_CB from a process in one user namespace to a process in another user namespace can result in the wrong uid or gid being presented to userspace. Avoid that problem by passing kuids and kgids instead. - define struct scm_creds for use in scm_cookie and netlink_skb_parms that holds uid and gid information in kuid_t and kgid_t. - Modify scm_set_cred to fill out scm_creds by heand instead of using cred_to_ucred to fill out struct ucred. This conversion ensures userspace does not get incorrect uid or gid values to look at. - Modify scm_recv to convert from struct scm_creds to struct ucred before copying credential values to userspace. - Modify __scm_send to populate struct scm_creds on in the scm_cookie, instead of just copying struct ucred from userspace. - Modify netlink_sendmsg to copy scm_creds instead of struct ucred into the NETLINK_CB. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-07 14:42:05 -04:00
John W. Linville	777bf135b7	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem John W. Linville says: ==================== Please pull these fixes intended for 3.6. There are more commits here than I would like -- I got a bit behind while I was stalking Steven Rostedt in San Diego last week... I'll slow it down after this! There are a couple of pulls here. One is from Johannes: "Please pull (according to the below information) to get a few fixes. * a fix to properly disconnect in the driver when authentication or association fails * a fix to prevent invalid information about mesh paths being reported to userspace * a memory leak fix in an nl80211 error path" The other comes via Gustavo: "A few updates for the 3.6 kernel. There are two btusb patches to add more supported devices through the new USB_VENDOR_AND_INTEFACE_INFO() macro and another one that add a new device id for a Sony Vaio laptop, one fix for a user-after-free and, finally, two patches from Vinicius to fix a issue in SMP pairing." Along with those... Arend van Spriel provides a fix for a use-after-free bug in brcmfmac. Daniel Drake avoids a hang by not trying to touch the libertas hardware duing suspend if it is already powered-down. Felix Fietkau provides a batch of ath9k fixes that adress some potential problems with power settings, as well as a fix to avoid a potential interrupt storm. Gertjan van Wingerde provides a register-width fix for rt2x00, and a rt2x00 fix to prevent incorrectly detecting the rfkill status. He also provides a device ID patch. Hante Meuleman gives us three brcmfmac fixes, one that properly initializes a command structure, one that fixes a race condition that could lose usb requests, and one that removes some log spam. Marc Kleine-Budde offers an rt2x00 fix for a voltage setting on some specific devices. Mohammed Shafi Shajakhan sent an ath9k fix to avoid a crash related to using timers that aren't allocated when 2 wire bluetooth coexistence hardware is in use. Sergei Poselenov changes rt2800usb to do some validity checking for received packets, avoiding crashes on an ARM Soc. Stone Piao gives us an mwifiex fix for an incorrectly set skb length value for a command buffer. All of these are localized to their specific drivers, and relatively small. The power-related patches from Felix are bigger than I would like, but I merged them in consideration of their isolation to ath9k and the sensitive nature of power settings in wireless devices. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-07 14:38:50 -04:00
Johannes Berg	00b14825ee	Merge remote-tracking branch 'wireless-next/master' into mac80211-next	2012-09-06 17:05:28 +02:00
Johannes Berg	944b9e375d	Merge remote-tracking branch 'mac80211/master' into mac80211-next Pull in mac80211.git to let the next patch apply without conflicts, also resolving a hwsim conflict. Conflicts: drivers/net/wireless/mac80211_hwsim.c Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-06 15:56:02 +02:00
Nicolas Dichtel	ef2c7d7b59	ipv6: fix handling of blackhole and prohibit routes When adding a blackhole or a prohibit route, they were handling like classic routes. Moreover, it was only possible to add this kind of routes by specifying an interface. Bug already reported here: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=498498 Before the patch: $ ip route add blackhole 2001::1/128 RTNETLINK answers: No such device $ ip route add blackhole 2001::1/128 dev eth0 $ ip -6 route \| grep 2001 2001::1 dev eth0 metric 1024 After: $ ip route add blackhole 2001::1/128 $ ip -6 route \| grep 2001 blackhole 2001::1 dev lo metric 1024 error -22 v2: wrong patch v3: add a field fc_type in struct fib6_config to store RTN_* type Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-05 17:49:28 -04:00
Robert P. J. Day	c9a0a30252	cfg80211: add kerneldoc entry for "vht_cap" Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-09-05 15:54:07 +02:00
Steffen Klassert	3b59df46a4	xfrm: Workaround incompatibility of ESN and async crypto ESN for esp is defined in RFC 4303. This RFC assumes that the sequence number counters are always up to date. However, this is not true if an async crypto algorithm is employed. If the sequence number counters are not up to date on sequence number check, we may incorrectly update the upper 32 bit of the sequence number. This leads to a DOS. We workaround this by comparing the upper sequence number, (used for authentication) with the upper sequence number computed after the async processing. We drop the packet if these numbers are different. To do this, we introduce a recheck function that does this check in the ESN case. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-04 14:09:45 -04:00
David S. Miller	1e9f0207d3	Merge branch 'master' of git://1984.lsi.us.es/nf-next	2012-09-03 20:26:45 -04:00
Yuchung Cheng	684bad1107	tcp: use PRR to reduce cwin in CWR state Use proportional rate reduction (PRR) algorithm to reduce cwnd in CWR state, in addition to Recovery state. Retire the current rate-halving in CWR. When losses are detected via ACKs in CWR state, the sender enters Recovery state but the cwnd reduction continues and does not restart. Rename and refactor cwnd reduction functions since both CWR and Recovery use the same algorithm: tcp_init_cwnd_reduction() is new and initiates reduction state variables. tcp_cwnd_reduction() is previously tcp_update_cwnd_in_recovery(). tcp_ends_cwnd_reduction() is previously tcp_complete_cwr(). The rate halving functions and logic such as tcp_cwnd_down(), tcp_min_cwnd(), and the cwnd moderation inside tcp_enter_cwr() are removed. The unused parameter, flag, in tcp_cwnd_reduction() is also removed. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-03 14:34:02 -04:00
Pablo Neira Ayuso	ace1fe1231	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next This merges (`3f509c6` netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectation) to Patrick McHardy's IPv6 NAT changes.	2012-09-03 15:34:51 +02:00
Pablo Neira Ayuso	84b5ee939e	netfilter: nf_conntrack: add nf_ct_timeout_lookup This patch adds the new nf_ct_timeout_lookup function to encapsulate the timeout policy attachment that is called in the nf_conntrack_in path. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-09-03 13:33:03 +02:00
Jerry Chu	8336886f78	tcp: TCP Fast Open Server - support TFO listeners This patch builds on top of the previous patch to add the support for TFO listeners. This includes - 1. allocating, properly initializing, and managing the per listener fastopen_queue structure when TFO is enabled 2. changes to the inet_csk_accept code to support TFO. E.g., the request_sock can no longer be freed upon accept(), not until 3WHS finishes 3. allowing a TCP_SYN_RECV socket to properly poll() and sendmsg() if it's a TFO socket 4. properly closing a TFO listener, and a TFO socket before 3WHS finishes 5. supporting TCP_FASTOPEN socket option 6. modifying tcp_check_req() to use to check a TFO socket as well as request_sock 7. supporting TCP's TFO cookie option 8. adding a new SYN-ACK retransmit handler to use the timer directly off the TFO socket rather than the listener socket. Note that TFO server side will not retransmit anything other than SYN-ACK until the 3WHS is completed. The patch also contains an important function "reqsk_fastopen_remove()" to manage the somewhat complex relation between a listener, its request_sock, and the corresponding child socket. See the comment above the function for the detail. Signed-off-by: H.K. Jerry Chu <hkchu@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-31 20:02:19 -04:00
Jerry Chu	1046716368	tcp: TCP Fast Open Server - header & support functions This patch adds all the necessary data structure and support functions to implement TFO server side. It also documents a number of flags for the sysctl_tcp_fastopen knob, and adds a few Linux extension MIBs. In addition, it includes the following: 1. a new TCP_FASTOPEN socket option an application must call to supply a max backlog allowed in order to enable TFO on its listener. 2. A number of key data structures: "fastopen_rsk" in tcp_sock - for a big socket to access its request_sock for retransmission and ack processing purpose. It is non-NULL iff 3WHS not completed. "fastopenq" in request_sock_queue - points to a per Fast Open listener data structure "fastopen_queue" to keep track of qlen (# of outstanding Fast Open requests) and max_qlen, among other things. "listener" in tcp_request_sock - to point to the original listener for book-keeping purpose, i.e., to maintain qlen against max_qlen as part of defense against IP spoofing attack. 3. various data structure and functions, many in tcp_fastopen.c, to support server side Fast Open cookie operations, including /proc/sys/net/ipv4/tcp_fastopen_key to allow manual rekeying. Signed-off-by: H.K. Jerry Chu <hkchu@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-31 20:02:18 -04:00
Alex Bergmann	6c9ff979d1	tcp: Increase timeout for SYN segments Commit `9ad7c049` ("tcp: RFC2988bis + taking RTT sample from 3WHS for the passive open side") changed the initRTO from 3secs to 1sec in accordance to RFC6298 (former RFC2988bis). This reduced the time till the last SYN retransmission packet gets sent from 93secs to 31secs. RFC1122 is stating that the retransmission should be done for at least 3 minutes, but this seems to be quite high. "However, the values of R1 and R2 may be different for SYN and data segments. In particular, R2 for a SYN segment MUST be set large enough to provide retransmission of the segment for at least 3 minutes. The application can close the connection (i.e., give up on the open attempt) sooner, of course." This patch increases the value of TCP_SYN_RETRIES to the value of 6, providing a retransmission window of 63secs. The comments for SYN and SYNACK retries have also been updated to describe the current settings. The same goes for the documentation file "Documentation/networking/ip-sysctl.txt". Signed-off-by: Alexander Bergmann <alex@linlab.net> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-31 15:42:10 -04:00
David S. Miller	c32f38619a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Merge the 'net' tree to get the recent set of netfilter bug fixes in order to assist with some merge hassles Pablo is going to have to deal with for upcoming changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-31 15:14:18 -04:00
David S. Miller	0dcd5052c8	Merge branch 'master' of git://1984.lsi.us.es/nf	2012-08-31 13:06:37 -04:00
Pablo Neira Ayuso	5b423f6a40	netfilter: nf_conntrack: fix racy timer handling with reliable events Existing code assumes that del_timer returns true for alive conntrack entries. However, this is not true if reliable events are enabled. In that case, del_timer may return true for entries that were just inserted in the dying list. Note that packets / ctnetlink may hold references to conntrack entries that were just inserted to such list. This patch fixes the issue by adding an independent timer for event delivery. This increases the size of the ecache extension. Still we can revisit this later and use variable size extensions to allocate this area on demand. Tested-by: Oliver Smith <olipro@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-08-31 15:50:28 +02:00
Patrick McHardy	b3f644fc82	netfilter: ip6tables: add MASQUERADE target Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:18 +02:00
Patrick McHardy	58a317f106	netfilter: ipv6: add IPv6 NAT support Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:17 +02:00
Patrick McHardy	2cf545e835	net: core: add function for incremental IPv6 pseudo header checksum updates Add inet_proto_csum_replace16 for incrementally updating IPv6 pseudo header checksums for IPv6 NAT. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: David S. Miller <davem@davemloft.net>	2012-08-30 03:00:16 +02:00
Patrick McHardy	c7232c9979	netfilter: add protocol independent NAT core Convert the IPv4 NAT implementation to a protocol independent core and address family specific modules. Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:14 +02:00
Patrick McHardy	051966c0c6	netfilter: nf_nat: add protoff argument to packet mangling functions For mangling IPv6 packets the protocol header offset needs to be known by the NAT packet mangling functions. Add a so far unused protoff argument and convert the conntrack and NAT helpers to use it in preparation of IPv6 NAT. Signed-off-by: Patrick McHardy <kaber@trash.net>	2012-08-30 03:00:13 +02:00
Vinicius Costa Gomes	cc110922da	Bluetooth: Change signature of smp_conn_security() To make it clear that it may be called from contexts that may not have any knowledge of L2CAP, we change the connection parameter, to receive a hci_conn. This also makes it clear that it is checking the security of the link. Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-27 08:07:18 -07:00
Greg Kroah-Hartman	e372dc6c62	Merge 3.6-rc3 into tty-next This picks up all of the different fixes in Linus's tree that we also need here. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-08-27 07:13:33 -07:00
Patrick McHardy	5f2d04f1f9	ipv4: fix path MTU discovery with connection tracking IPv4 conntrack defragments incoming packet at the PRE_ROUTING hook and (in case of forwarded packets) refragments them at POST_ROUTING independent of the IP_DF flag. Refragmentation uses the dst_mtu() of the local route without caring about the original fragment sizes, thereby breaking PMTUD. This patch fixes this by keeping track of the largest received fragment with IP_DF set and generates an ICMP fragmentation required error during refragmentation if that size exceeds the MTU. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: David S. Miller <davem@davemloft.net>	2012-08-26 19:13:55 +02:00
David S. Miller	e6acb38480	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace This is an initial merge in of Eric Biederman's work to start adding user namespace support to the networking. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-24 18:54:37 -04:00
John W. Linville	f20b6213f1	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2012-08-24 12:25:30 -04:00
Rami Rosen	f63c45e0e6	packet: fix broken build. This patch fixes a broken build due to a missing header: ... CC net/ipv4/proc.o In file included from include/net/net_namespace.h:15, from net/ipv4/proc.c:35: include/net/netns/packet.h:11: error: field 'sklist_lock' has incomplete type ... The lock of netns_packet has been replaced by a recent patch to be a mutex instead of a spinlock, but we need to replace the header file to be linux/mutex.h instead of linux/spinlock.h as well. See commit `0fa7fa98db`: packet: Protect packet sk list with mutex (v2) patch, Signed-off-by: Rami Rosen <rosenr@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-23 09:29:45 -07:00
Pavel Emelyanov	0fa7fa98db	packet: Protect packet sk list with mutex (v2) Change since v1: * Fixed inuse counters access spotted by Eric In patch `eea68e2f` (packet: Report socket mclist info via diag module) I've introduced a "scheduling in atomic" problem in packet diag module -- the socket list is traversed under rcu_read_lock() while performed under it sk mclist access requires rtnl lock (i.e. -- mutex) to be taken. [152363.820563] BUG: scheduling while atomic: crtools/12517/0x10000002 [152363.820573] 4 locks held by crtools/12517: [152363.820581] #0: (sock_diag_mutex){+.+.+.}, at: [<ffffffff81a2dcb5>] sock_diag_rcv+0x1f/0x3e [152363.820613] #1: (sock_diag_table_mutex){+.+.+.}, at: [<ffffffff81a2de70>] sock_diag_rcv_msg+0xdb/0x11a [152363.820644] #2: (nlk->cb_mutex){+.+.+.}, at: [<ffffffff81a67d01>] netlink_dump+0x23/0x1ab [152363.820693] #3: (rcu_read_lock){.+.+..}, at: [<ffffffff81b6a049>] packet_diag_dump+0x0/0x1af Similar thing was then re-introduced by further packet diag patches (fanount mutex and pgvec mutex for rings) :( Apart from being terribly sorry for the above, I propose to change the packet sk list protection from spinlock to mutex. This lock currently protects two modifications: * sklist * prot inuse counters The sklist modifications can be just reprotected with mutex since they already occur in a sleeping context. The inuse counters modifications are trickier -- the __this_cpu_-s are used inside, thus requiring the caller to handle the potential issues with contexts himself. Since packet sockets' counters are modified in two places only (packet_create and packet_release) we only need to protect the context from being preempted. BH disabling is not required in this case. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 22:58:27 -07:00
David S. Miller	bf277b0cce	Merge git://1984.lsi.us.es/nf-next Pablo Neira Ayuso says: ==================== This is the first batch of Netfilter and IPVS updates for your net-next tree. Mostly cleanups for the Netfilter side. They are: * Remove unnecessary RTNL locking now that we have support for namespace in nf_conntrack, from Patrick McHardy. * Cleanup to eliminate unnecessary goto in the initialization path of several Netfilter tables, from Jean Sacren. * Another cleanup from Wu Fengguang, this time to PTR_RET instead of if IS_ERR then return PTR_ERR. * Use list_for_each_entry_continue_rcu in nf_iterate, from Michael Wang. * Add pmtu_disc sysctl option to disable PMTU in their tunneling transmitter, from Julian Anastasov. * Generalize application protocol registration in IPVS and modify IPVS FTP helper to use it, from Julian Anastasov. * update Kconfig. The IPVS FTP helper depends on the Netfilter FTP helper for NAT support, from Julian Anastasov. * Add logic to update PMTU for IPIP packets in IPVS, again from Julian Anastasov. * A couple of sparse warning fixes for IPVS and Netfilter from Claudiu Ghioc and Patrick McHardy respectively. Patrick's IPv6 NAT changes will follow after this batch, I need to flush this batch first before refreshing my tree. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-22 18:48:52 -07:00
David S. Miller	1304a7343b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2012-08-22 14:21:38 -07:00
Eric Dumazet	e0e3cea46d	af_netlink: force credentials passing [CVE-2012-3520] Pablo Neira Ayuso discovered that avahi and potentially NetworkManager accept spoofed Netlink messages because of a kernel bug. The kernel passes all-zero SCM_CREDENTIALS ancillary data to the receiver if the sender did not provide such data, instead of not including any such data at all or including the correct data from the peer (as it is the case with AF_UNIX). This bug was introduced in commit `16e5726269` (af_unix: dont send SCM_CREDENTIALS by default) This patch forces passing credentials for netlink, as before the regression. Another fix would be to not add SCM_CREDENTIALS in netlink messages if not provided by the sender, but it might break some programs. With help from Florian Weimer & Petr Matousek This issue is designated as CVE-2012-3520 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-21 14:53:01 -07:00
John W. Linville	01e17dacd4	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Conflicts: drivers/net/wireless/mac80211_hwsim.c	2012-08-21 16:00:21 -04:00
Syam Sidhardhan	144ad33020	Bluetooth: Use kref for l2cap channel reference counting This patch changes the struct l2cap_chan and associated code to use kref api for object refcounting and freeing. Suggested-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Jaganath Kanakkassery <jaganath.k@samsung.com> Signed-off-by: Syam Sidhardhan <s.syam@samsung.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-21 14:54:41 -03:00
Mikel Astiz	f0d6a0ea33	Bluetooth: mgmt: Add device disconnect reason MGMT_EV_DEVICE_DISCONNECTED will now expose the disconnection reason to userland, distinguishing four possible values: 0x00 Reason not known or unspecified 0x01 Connection timeout 0x02 Connection terminated by local host 0x03 Connection terminated by remote host Note that the local/remote distinction just determines which side terminated the low-level connection, regardless of the disconnection of the higher-level profiles. This can sometimes be misleading and thus must be used with care. For example, some hardware combinations would report a locally initiated disconnection even if the user turned Bluetooth off in the remote side. Signed-off-by: Mikel Astiz <mikel.astiz@bmw-carit.de> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-21 14:54:40 -03:00
Mikel Astiz	cdcba7c650	Bluetooth: Add more HCI error codes Add more HCI error codes as defined in the specification. Signed-off-by: Mikel Astiz <mikel.astiz@bmw-carit.de> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-21 14:54:39 -03:00
Johannes Berg	6d71117a27	mac80211: add IEEE80211_HW_P2P_DEV_ADDR_FOR_INTF Some devices like the current iwlwifi implementation require that the P2P interface address match the P2P Device address (only one P2P interface is supported.) Add the HW flag IEEE80211_HW_P2P_DEV_ADDR_FOR_INTF that allows drivers to request that P2P Interfaces added while a P2P Device is active get the same MAC address by default. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:58:23 +02:00
Johannes Berg	98104fdeda	cfg80211: add P2P Device abstraction In order to support using a different MAC address for the P2P Device address we must first have a P2P Device abstraction that can be assigned a MAC address. This abstraction will also be useful to support offloading P2P operations to the device, e.g. periodic listen for discoverability. Currently, the driver is responsible for assigning a MAC address to the P2P Device, but this could be changed by allowing a MAC address to be given to the NEW_INTERFACE command. As it has no associated netdev, a P2P Device can only be identified by its wdev identifier but the previous patches allowed using the wdev identifier in various APIs, e.g. remain-on-channel. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:58:21 +02:00
Johannes Berg	4c29867790	mac80211: support A-MPDU status reporting Support getting A-MPDU status information from the drivers and reporting it to userspace via radiotap in the standard fields. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:53:22 +02:00
Johannes Berg	48613ece3d	wireless: add radiotap A-MPDU status field Define the A-MPDU status field in radiotap, also update the radiotap parser for it and the MCS field that was apparently missed last time. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:53:09 +02:00
Antonio Quartulli	e687f61eed	mac80211: add supported rates change notification in IBSS In IBSS it is possible that the supported rates set for a station changes over time (e.g. it gets first initialised as an empty set because of no available information about rates and updated later). In this case the driver has to be notified about the change in order to update its internal table accordingly (if needed). This behaviour is needed by all those drivers that handle rc internally but leave stations management to mac80211 Reported-by: Gui Iribarren <gui@altermundi.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org> [Johannes - add docs, validate IBSS mode only, fix compilation] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-20 13:31:43 +02:00
Patrick McHardy	9d7b0fc1ef	net: ipv6: fix oops in inet_putpeer() Commit `97bab73f` (inet: Hide route peer accesses behind helpers.) introduced a bug in xfrm6_policy_destroy(). The xfrm_dst's _rt6i_peer member is not initialized, causing a false positive result from inetpeer_ptr_is_peer(), which in turn causes a NULL pointer dereference in inet_putpeer(). Pid: 314, comm: kworker/0:1 Not tainted 3.6.0-rc1+ #17 To Be Filled By O.E.M. To Be Filled By O.E.M./P4S800D-X EIP: 0060:[<c03abf93>] EFLAGS: 00010246 CPU: 0 EIP is at inet_putpeer+0xe/0x16 EAX: 00000000 EBX: f3481700 ECX: 00000000 EDX: 000dd641 ESI: f3481700 EDI: c05e949c EBP: f551def4 ESP: f551def4 DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068 CR0: 8005003b CR2: 00000070 CR3: 3243d000 CR4: 00000750 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff0ff0 DR7: 00000400 f551df04 c0423de1 00000000 f3481700 f551df18 c038d5f7 f254b9f8 f551df28 f34f85d8 f551df20 c03ef48d f551df3c c0396870 f30697e8 f24e1738 c05e98f4 f5509540 c05cd2b4 f551df7c c0142d2b c043feb5 f5509540 00000000 c05cd2e8 [<c0423de1>] xfrm6_dst_destroy+0x42/0xdb [<c038d5f7>] dst_destroy+0x1d/0xa4 [<c03ef48d>] xfrm_bundle_flo_delete+0x2b/0x36 [<c0396870>] flow_cache_gc_task+0x85/0x9f [<c0142d2b>] process_one_work+0x122/0x441 [<c043feb5>] ? apic_timer_interrupt+0x31/0x38 [<c03967eb>] ? flow_cache_new_hashrnd+0x2b/0x2b [<c0143e2d>] worker_thread+0x113/0x3cc Fix by adding a init_dst() callback to struct xfrm_policy_afinfo to properly initialize the dst's peer pointer. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-20 02:56:56 -07:00
David S. Miller	2ea214929d	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== This is a batch of updates intended for 3.7. The ath9k, mwifiex, and b43 drivers get the bulk of the commits this time, with a handful of other driver bits thrown-in. It is mostly just minor fixes and cleanups, etc. Also included is a Bluetooth pull, with a lot of refactoring. Gustavo says: "These are the changes I queued for 3.7. There are a many small fixes/improvements by Andre Guedes. A l2cap channel refcounting refactor by Jaganath. Bluetooth sockets now appears in /proc/net, by Masatake Yamato and Sachin Kamat changes ours drivers to use devm_kzalloc()." ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 15:26:05 -07:00
Fan Du	65e0736bc2	xfrm: remove redundant parameter "int dir" in struct xfrm_mgr.acquire Sematically speaking, xfrm_mgr.acquire is called when kernel intends to ask user space IKE daemon to negotiate SAs with peers. IOW the direction will always be XFRM_POLICY_OUT, so remove int dir for clarity. Signed-off-by: Fan Du <fan.du@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 15:13:30 -07:00
John W. Linville	16698918cd	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2012-08-15 14:29:37 -04:00
Cong Wang	1f07b62f32	sctp: fix a compile error in sctp.h I got the following compile error: In file included from include/net/sctp/checksum.h:46:0, from net/ipv4/netfilter/nf_nat_proto_sctp.c:14: include/net/sctp/sctp.h: In function ‘sctp_dbg_objcnt_init’: include/net/sctp/sctp.h:370:88: error: parameter name omitted include/net/sctp/sctp.h: In function ‘sctp_dbg_objcnt_exit’: include/net/sctp/sctp.h:371:88: error: parameter name omitted which is caused by commit `13d782f6b4` Author: Eric W. Biederman <ebiederm@xmission.com> Date: Mon Aug 6 08:45:15 2012 +0000 sctp: Make the proc files per network namespace. This patch could fix it. Cc: David S. Miller <davem@davemloft.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-15 03:43:43 -07:00
Eric W. Biederman	e1fc3b14f9	sctp: Make sysctl tunables per net Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:32:16 -07:00
Eric W. Biederman	f53b5b097e	sctp: Push struct net down into sctp_verify_ext_param Add struct net as a parameter to sctp_verify_param so it can be passed to sctp_verify_ext_param where struct net will be needed when the sctp tunables become per net tunables. Add struct net as a parameter to sctp_verify_init so struct net can be passed to sctp_verify_param. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:30:37 -07:00
Eric W. Biederman	24cb81a6a9	sctp: Push struct net down into all of the state machine functions There are a handle of state machine functions primarily those dealing with processing INIT packets where there is neither a valid endpoint nor a valid assoication from which to derive a struct net. Therefore add struct net * to the parameter list of sctp_state_fn_t and update all of the state machine functions. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:30:37 -07:00
Eric W. Biederman	e7ff4a7037	sctp: Push struct net down into sctp_in_scope struct net will be needed shortly when the tunables are made per network namespace. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:30:37 -07:00
Eric W. Biederman	89bf3450cb	sctp: Push struct net down into sctp_transport_init Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:30:37 -07:00
Eric W. Biederman	55e26eb95a	sctp: Push struct net down to sctp_chunk_event_lookup This trickles up through sctp_sm_lookup_event up to sctp_do_sm and up further into sctp_primitiv_NAME before the code reaches places where struct net can be reliably found. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:30:37 -07:00
Eric W. Biederman	ebb7e95d93	sctp: Add infrastructure for per net sysctls Start with an empty sctp_net_table that will be populated as the various tunable sysctls are made per net. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:30:37 -07:00
Eric W. Biederman	b01a24078f	sctp: Make the mib per network namespace Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:30:36 -07:00
Eric W. Biederman	13d782f6b4	sctp: Make the proc files per network namespace. - Convert all of the files under /proc/net/sctp to be per network namespace. - Don't print anything for /proc/net/sctp/snmp except in the initial network namespaces as the snmp counters still have to be converted to be per network namespace. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:29:53 -07:00
Eric W. Biederman	2ce9550350	sctp: Make the ctl_sock per network namespace - Kill sctp_get_ctl_sock, it is useless now. - Pass struct net where needed so net->sctp.ctl_sock is accessible. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:17:26 -07:00
Eric W. Biederman	4db67e8086	sctp: Make the address lists per network namespace - Move the address lists into struct net - Add per network namespace initialization and cleanup - Pass around struct net so it is everywhere I need it. - Rename all of the global variable references into references to the variables moved into struct net Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 23:12:17 -07:00
Eric W. Biederman	4110cc255d	sctp: Make the association hashtable handle multiple network namespaces - Use struct net in the hash calculation - Use sock_net(association.base.sk) in the association lookups. - On receive calculate the network namespace from skb->dev. - Pass struct net from receive down to the functions that actually do the association lookup. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 22:44:12 -07:00
Eric W. Biederman	4cdadcbcb6	sctp: Make the endpoint hashtable handle multiple network namespaces - Use struct net in the hash calculation - Use sock_net(endpoint.base.sk) in the endpoint lookups. - On receive calculate the network namespace from skb->dev. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 22:44:12 -07:00
Eric W. Biederman	f1f4376307	sctp: Make the port hash table use struct net in it's key. - Add struct net into the port hash table hash calculation - Add struct net inot the struct sctp_bind_bucket so there is a memory of which network namespace a port is allocated in. No need for a ref count because sctp_bind_bucket only exists when there are sockets in the hash table and sockets can not change their network namspace, and sockets already ref count their network namespace. - Add struct net into the key comparison when we are testing to see if we have found the port hash table entry we are looking for. With these changes lookups in the port hash table becomes safe to use in multiple network namespaces. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 22:44:12 -07:00
Eric W. Biederman	af4c6641f5	net sched: Pass the skb into change so it can access NETLINK_CB cls_flow.c plays with uids and gids. Unless I misread that code it is possible for classifiers to depend on the specific uid and gid values. Therefore I need to know the user namespace of the netlink socket that is installing the packet classifiers. Pass in the rtnetlink skb so I can access the NETLINK_CB of the passed packet. In particular I want access to sk_user_ns(NETLINK_CB(in_skb).ssk). Pass in not the user namespace but the incomming rtnetlink skb into the the classifier change routines as that is generally the more useful parameter. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:28 -07:00
Eric W. Biederman	c336d148ad	userns: Implement sk_user_ns Add a helper sk_user_ns to make it easy to find the user namespace of the process that opened a socket. Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:49:56 -07:00
Eric W. Biederman	d13fda8564	userns: Convert net/ax25 to use kuid_t where appropriate Cc: Ralf Baechle <ralf@linux-mips.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:49:42 -07:00
Eric W. Biederman	4f82f45730	net ip6 flowlabel: Make owner a union of struct pid * and kuid_t Correct a long standing omission and use struct pid in the owner field of struct ip6_flowlabel when the share type is IPV6_FL_S_PROCESS. This guarantees we don't have issues when pid wraparound occurs. Use a kuid_t in the owner field of struct ip6_flowlabel when the share type is IPV6_FL_S_USER to add user namespace support. In /proc/net/ip6_flowlabel capture the current pid namespace when opening the file and release the pid namespace when the file is closed ensuring we print the pid owner value that is meaning to the reader of the file. Similarly use from_kuid_munged to print uid values that are meaningful to the reader of the file. This requires exporting pid_nr_ns so that ipv6 can continue to built as a module. Yoiks what silliness Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:49:25 -07:00
Eric W. Biederman	7064d16e16	userns: Use kgids for sysctl_ping_group_range - Store sysctl_ping_group_range as a paire of kgid_t values instead of a pair of gid_t values. - Move the kgid conversion work from ping_init_sock into ipv4_ping_group_range - For invalid cases reset to the default disabled state. With the kgid_t conversion made part of the original value sanitation from userspace understand how the code will react becomes clearer and it becomes possible to set the sysctl ping group range from something other than the initial user namespace. Cc: Vasiliy Kulikov <segoon@openwall.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:49:10 -07:00
Eric W. Biederman	a7cb5a49bf	userns: Print out socket uids in a user namespace aware fashion. Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Sridhar Samudrala <sri@us.ibm.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:48:06 -07:00
Eric W. Biederman	976d020150	userns: Convert sock_i_uid to return a kuid_t Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:47:34 -07:00
Vinicius Costa Gomes	57f5d0d1d9	Bluetooth: Remove some functions from being exported Some connection related functions are only used inside hci_conn.c so no need to have them exported. Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-15 00:53:11 -03:00
Ben Hutchings	6024935f5f	llc2: Fix silent failure of llc_station_init() llc_station_init() creates and processes an event skb with no effect other than to change the state from DOWN to UP. Allocation failure is reported, but then ignored by its caller, llc2_init(). Remove this possibility by simply initialising the state as UP. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 16:51:18 -07:00
xeb@mail.ru	c12b395a46	gre: Support GRE over IPv6 GRE over IPv6 implementation. Signed-off-by: Dmitry Kozlov <xeb@mail.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 14:28:32 -07:00
Eric Dumazet	2359a47671	codel: refine one condition to avoid a nul rec_inv_sqrt One condition before codel_Newton_step() was not good if we never left the dropping state for a flow. As a result rec_inv_sqrt was 0, instead of the ~0 initial value. codel control law was then set to a very aggressive mode, dropping many packets before reaching 'target' and recovering from this problem. To keep codel_vars_init() as efficient as possible, refine the condition to make sure rec_inv_sqrt initial value is correct Many thanks to Anton Mich for discovering the issue and suggesting a fix. Reported-by: Anton Mich <lp2s1h@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-10 16:52:54 -07:00
Eric Dumazet	b5ec8eeac4	ipv4: fix ip_send_skb() ip_send_skb() can send orphaned skb, so we must pass the net pointer to avoid possible NULL dereference in error path. Bug added by commit `3a7c384ffd` (ipv4: tcp: unicast_sock should not land outside of TCP stack) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-10 14:08:57 -07:00
John W. Linville	57f784fed3	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2012-08-10 15:13:12 -04:00
Eric Dumazet	63d02d157e	net: tcp: ipv6_mapped needs sk_rx_dst_set method commit `5d299f3d3c` (net: ipv6: fix TCP early demux) added a regression for ipv6_mapped case. [ 67.422369] SELinux: initialized (dev autofs, type autofs), uses genfs_contexts [ 67.449678] SELinux: initialized (dev autofs, type autofs), uses genfs_contexts [ 92.631060] BUG: unable to handle kernel NULL pointer dereference at (null) [ 92.631435] IP: [< (null)>] (null) [ 92.631645] PGD 0 [ 92.631846] Oops: 0010 [#1] SMP [ 92.632095] Modules linked in: autofs4 sunrpc ipv6 dm_mirror dm_region_hash dm_log dm_multipath dm_mod video sbs sbshc battery ac lp parport sg snd_hda_intel snd_hda_codec snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device pcspkr snd_pcm_oss snd_mixer_oss snd_pcm snd_timer serio_raw button floppy snd i2c_i801 i2c_core soundcore snd_page_alloc shpchp ide_cd_mod cdrom microcode ehci_hcd ohci_hcd uhci_hcd [ 92.634294] CPU 0 [ 92.634294] Pid: 4469, comm: sendmail Not tainted 3.6.0-rc1 #3 [ 92.634294] RIP: 0010:[<0000000000000000>] [< (null)>] (null) [ 92.634294] RSP: 0018:ffff880245fc7cb0 EFLAGS: 00010282 [ 92.634294] RAX: ffffffffa01985f0 RBX: ffff88024827ad00 RCX: 0000000000000000 [ 92.634294] RDX: 0000000000000218 RSI: ffff880254735380 RDI: ffff88024827ad00 [ 92.634294] RBP: ffff880245fc7cc8 R08: 0000000000000001 R09: 0000000000000000 [ 92.634294] R10: 0000000000000000 R11: ffff880245fc7bf8 R12: ffff880254735380 [ 92.634294] R13: ffff880254735380 R14: 0000000000000000 R15: 7fffffffffff0218 [ 92.634294] FS: 00007f4516ccd6f0(0000) GS:ffff880256600000(0000) knlGS:0000000000000000 [ 92.634294] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 92.634294] CR2: 0000000000000000 CR3: 0000000245ed1000 CR4: 00000000000007f0 [ 92.634294] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 92.634294] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 92.634294] Process sendmail (pid: 4469, threadinfo ffff880245fc6000, task ffff880254b8cac0) [ 92.634294] Stack: [ 92.634294] ffffffff813837a7 ffff88024827ad00 ffff880254b6b0e8 ffff880245fc7d68 [ 92.634294] ffffffff81385083 00000000001d2680 ffff8802547353a8 ffff880245fc7d18 [ 92.634294] ffffffff8105903a ffff88024827ad60 0000000000000002 00000000000000ff [ 92.634294] Call Trace: [ 92.634294] [<ffffffff813837a7>] ? tcp_finish_connect+0x2c/0xfa [ 92.634294] [<ffffffff81385083>] tcp_rcv_state_process+0x2b6/0x9c6 [ 92.634294] [<ffffffff8105903a>] ? sched_clock_cpu+0xc3/0xd1 [ 92.634294] [<ffffffff81059073>] ? local_clock+0x2b/0x3c [ 92.634294] [<ffffffff8138caf3>] tcp_v4_do_rcv+0x63a/0x670 [ 92.634294] [<ffffffff8133278e>] release_sock+0x128/0x1bd [ 92.634294] [<ffffffff8139f060>] __inet_stream_connect+0x1b1/0x352 [ 92.634294] [<ffffffff813325f5>] ? lock_sock_nested+0x74/0x7f [ 92.634294] [<ffffffff8104b333>] ? wake_up_bit+0x25/0x25 [ 92.634294] [<ffffffff813325f5>] ? lock_sock_nested+0x74/0x7f [ 92.634294] [<ffffffff8139f223>] ? inet_stream_connect+0x22/0x4b [ 92.634294] [<ffffffff8139f234>] inet_stream_connect+0x33/0x4b [ 92.634294] [<ffffffff8132e8cf>] sys_connect+0x78/0x9e [ 92.634294] [<ffffffff813fd407>] ? sysret_check+0x1b/0x56 [ 92.634294] [<ffffffff81088503>] ? __audit_syscall_entry+0x195/0x1c8 [ 92.634294] [<ffffffff811cc26e>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 92.634294] [<ffffffff813fd3e2>] system_call_fastpath+0x16/0x1b [ 92.634294] Code: Bad RIP value. [ 92.634294] RIP [< (null)>] (null) [ 92.634294] RSP <ffff880245fc7cb0> [ 92.634294] CR2: 0000000000000000 [ 92.648982] ---[ end trace 24e2bed94314c8d9 ]--- [ 92.649146] Kernel panic - not syncing: Fatal exception in interrupt Fix this using inet_sk_rx_dst_set(), and export this function in case IPv6 is modular. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-09 20:56:09 -07:00
Julian Anastasov	3654e61137	ipvs: add pmtu_disc option to disable IP DF for TUN packets Disabling PMTU discovery can increase the output packet rate but some users have enough resources and prefer to fragment than to drop traffic. By default, we copy the DF bit but if pmtu_disc is disabled we do not send FRAG_NEEDED messages anymore. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2012-08-10 10:35:07 +09:00
Julian Anastasov	be97fdb5fb	ipvs: generalize app registration in netns Get rid of the ftp_app pointer and allow applications to be registered without adding fields in the netns_ipvs structure. v2: fix coding style as suggested by Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2012-08-10 10:34:51 +09:00
Pavel Emelyanov	1fb9489bf1	net: Loopback ifindex is constant now As pointed out, there are places, that access net->loopback_dev->ifindex and after ifindex generation is made per-net this value becomes constant equals 1. So go ahead and introduce the LOOPBACK_IFINDEX constant and use it where appropriate. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-09 16:18:07 -07:00
Pavel Emelyanov	aa79e66eee	net: Make ifindex generation per-net namespace Strictly speaking this is only _really_ required for checkpoint-restore to make loopback device always have the same index. This change appears to be safe wrt "ifindex should be unique per-system" concept, as all the ifindex usage is either already made per net namespace of is explicitly limited with init_net only. There are two cool side effects of this. The first one -- ifindices of devices in container are always small, regardless of how many containers we've started (and re-started) so far. The second one is -- we can speed up the loopback ifidex access as shown in the next patch. v2: Place ifindex right after dev_base_seq : avoid two holes and use the same cache line, dirtied in list_netdevice()/unlist_netdevice() Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-09 16:18:07 -07:00
Pavel Emelyanov	b14f243a42	net: Dont use ifindices in hash fns Eric noticed, that when there will be devices with equal indices, some hash functions that use them will become less effective as they could. Fix this in advance by mixing the net_device address into the hash value instead of the device index. This is true for arp and ndisc hash fns. The netlabel, can and llc ones are also ifindex-based, but that three are init_net-only, thus will not be affected. Many thanks to David and Eric for the hash32_ptr implementation! Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-09 16:18:06 -07:00
Eric Dumazet	a37e6e3449	net: force dst_default_metrics to const section While investigating on network performance problems, I found this little gem : $ nm -v vmlinux \| grep -1 dst_default_metrics ffffffff82736540 b busy.46605 ffffffff82736560 B dst_default_metrics ffffffff82736598 b dst_busy_list Apparently, declaring a const array without initializer put it in (writeable) bss section, in middle of possibly often dirtied cache lines. Since we really want dst_default_metrics be const to avoid any possible false sharing and catch any buggy writes, I force a null initializer. ffffffff818a4c20 R dst_default_metrics Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-08 16:00:28 -07:00
Eric Dumazet	425f09ab7d	net: output path optimizations 1) Avoid dirtying neighbour's confirmed field. TCP workloads hits this cache line for each incoming ACK. Lets write n->confirmed only if there is a jiffie change. 2) Optimize neigh_hh_output() for the common Ethernet case, were hh_len is less than 16 bytes. Replace the memcpy() call by two inlined 64bit load/stores on x86_64. Bench results using udpflood test, with -C option (MSG_CONFIRM flag added to sendto(), to reproduce the n->confirmed dirtying on UDP) 24 threads doing 1.000.000 UDP sendto() on dummy device, 4 runs. before : 2.247s, 2.235s, 2.247s, 2.318s after : 1.884s, 1.905s, 1.891s, 1.895s Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-07 16:24:55 -07:00
Eric Dumazet	d25398df59	net: avoid reloads in SNMP_UPD_PO_STATS Avoid two instructions to reload dev->nd_net->mib.ip_statistics pointer, unsing a temp variable, in ip_rcv(), ip_output() paths for example. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-06 13:40:47 -07:00
Eric Dumazet	5d299f3d3c	net: ipv6: fix TCP early demux IPv6 needs a cookie in dst_check() call. We need to add rx_dst_cookie and provide a family independent sk_rx_dst_set(sk, skb) method to properly support IPv6 TCP early demux. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-06 13:33:21 -07:00
Andre Guedes	b9b343d254	Bluetooth: Fix hci_le_conn_complete_evt We need to check the 'Role' parameter from the LE Connection Complete Event in order to properly set 'out' and 'link_mode' fields from hci_conn structure. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-06 15:05:10 -03:00
Masatake YAMATO	256a06c8a8	Bluetooth: /proc/net/ entries for bluetooth protocols lsof command can tell the type of socket processes are using. Internal lsof uses inode numbers on socket fs to resolve the type of sockets. Files under /proc/net/, such as tcp, udp, unix, etc provides such inode information. Unfortunately bluetooth related protocols don't provide such inode information. This patch series introduces /proc/net files for the protocols. This patch against af_bluetooth.c provides facility to the implementation of protocols. This patch extends bt_sock_list and introduces two exported function bt_procfs_init, bt_procfs_cleanup. The type bt_sock_list is already used in some of implementation of protocols. bt_procfs_init prepare seq_operations which converts protocol own bt_sock_list data to protocol own proc entry when the entry is accessed. What I, lsof user, need is just inode number of bluetooth socket. However, people may want more information. The bt_procfs_init takes a function pointer for customizing the show handler of seq_operations. In v4 patch, __acquires and __releases attributes are added to suppress sparse warning. Suggested by Andrei Emeltchenko. In v5 patch, linux/proc_fs.h is included to use PDE. Build error is reported by Fengguang Wu. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-06 15:02:58 -03:00
Jaganath Kanakkassery	4af66c691f	Bluetooth: Free the l2cap channel list only when refcount is zero Move the l2cap channel list chan->global_l under the refcnt protection and free it based on the refcnt. Signed-off-by: Jaganath Kanakkassery <jaganath.k@samsung.com> Signed-off-by: Syam Sidhardhan <s.syam@samsung.com> Reviewed-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-06 15:02:58 -03:00
Jaganath Kanakkassery	3064837289	Bluetooth: Move l2cap_chan_hold/put to l2cap_core.c Refactor the code in order to use the l2cap_chan_destroy() from l2cap_chan_put() under the refcnt protection. Signed-off-by: Jaganath Kanakkassery <jaganath.k@samsung.com> Signed-off-by: Syam Sidhardhan <s.syam@samsung.com> Reviewed-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-06 15:02:58 -03:00
Andrei Emeltchenko	9e66463127	Bluetooth: Make connect / disconnect cfm functions return void Return values are never used because callers hci_proto_connect_cfm and hci_proto_disconn_cfm return void. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-06 15:02:58 -03:00
Andrei Emeltchenko	ab846ec4ea	Bluetooth: Define AMP controller statuses AMP status codes copied from Bluez patch sent by Peter Krystad <pkrystad@codeaurora.org>. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-06 15:02:55 -03:00
Andrei Emeltchenko	b93a68295f	Bluetooth: trivial: Fix mixing spaces and tabs in smp Change spaces to tabs in smp code Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-06 15:02:55 -03:00
Andrei Emeltchenko	71becf0cea	Bluetooth: debug: Fix printing refcnt for hci_conn Use the same style for refcnt printing through all Bluetooth code taking the reference the l2cap_chan refcnt printing. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-06 15:02:55 -03:00
Andrei Emeltchenko	bb4b2a9ae3	Bluetooth: mgmt: Managing only BR/EDR HCI controllers Add check that HCI controller is BR/EDR. AMP controller shall not be managed by mgmt interface and consequently user space. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-06 15:02:54 -03:00
Andre Guedes	3f1732462c	Bluetooth: Remove missing code This patch removes the struct adv_entry since it is not used anymore. This struct should have been removed in commit `479453d` (Bluetooth: Remove advertising cache). Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-08-06 15:02:54 -03:00
Greg Kroah-Hartman	c87985a3ce	Merge tty-next into 3.6-rc1 This handles the merge issue in: arch/um/drivers/line.c arch/um/drivers/line.h And resolves the duplicate patches that were in both trees do to the tty-next branch not getting merged into 3.6-rc1. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-08-06 09:48:31 -07:00
Jiri Pirko	4778e0be16	netlink: add signed types Signed types might be needed in NL communication from time to time (I need s32 in team driver), so add them. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-03 20:40:11 -07:00
John W. Linville	1a26904eb6	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2012-08-02 13:49:38 -04:00
Seth Forshee	03f6b0843a	cfg80211: add channel flag to prohibit OFDM operation Currently the only way for wireless drivers to tell whether or not OFDM is allowed on the current channel is to check the regulatory information. However, this requires hodling cfg80211_mutex, which is not visible to the drivers. Other regulatory restrictions are provided as flags in the channel definition, so let's do similarly with OFDM. This patch adds a new flag, IEEE80211_CHAN_NO_OFDM, to tell drivers that OFDM on a channel is not allowed. This flag is set on any channels for which regulatory indicates that OFDM is prohibited. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Tested-by: Arend van Spriel <arend@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-08-02 15:30:49 +02:00
Fan Du	e3c0d04750	Fix unexpected SA hard expiration after changing date After SA is setup, one timer is armed to detect soft/hard expiration, however the timer handler uses xtime to do the math. This makes hard expiration occurs first before soft expiration after setting new date with big interval. As a result new child SA is deleted before rekeying the new one. Signed-off-by: Fan Du <fdu@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-02 00:19:17 -07:00
Ben Hutchings	1485348d24	tcp: Apply device TSO segment limit earlier Cache the device gso_max_segs in sock::sk_gso_max_segs and use it to limit the size of TSO skbs. This avoids the need to fall back to software GSO for local TCP senders. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-02 00:19:17 -07:00
Linus Torvalds	ac694dbdbc	Merge branch 'akpm' (Andrew's patch-bomb) Merge Andrew's second set of patches: - MM - a few random fixes - a couple of RTC leftovers * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (120 commits) rtc/rtc-88pm80x: remove unneed devm_kfree rtc/rtc-88pm80x: assign ret only when rtc_register_driver fails mm: hugetlbfs: close race during teardown of hugetlbfs shared page tables tmpfs: distribute interleave better across nodes mm: remove redundant initialization mm: warn if pg_data_t isn't initialized with zero mips: zero out pg_data_t when it's allocated memcg: gix memory accounting scalability in shrink_page_list mm/sparse: remove index_init_lock mm/sparse: more checks on mem_section number mm/sparse: optimize sparse_index_alloc memcg: add mem_cgroup_from_css() helper memcg: further prevent OOM with too many dirty pages memcg: prevent OOM with too many dirty pages mm: mmu_notifier: fix freed page still mapped in secondary MMU mm: memcg: only check anon swapin page charges for swap cache mm: memcg: only check swap cache pages for repeated charging mm: memcg: split swapin charge function into private and public part mm: memcg: remove needless !mm fixup to init_mm when charging mm: memcg: remove unneeded shmem charge type ...	2012-07-31 19:25:39 -07:00
Mel Gorman	c76562b670	netvm: prevent a stream-specific deadlock This patch series is based on top of "Swap-over-NBD without deadlocking v15" as it depends on the same reservation of PF_MEMALLOC reserves logic. When a user or administrator requires swap for their application, they create a swap partition and file, format it with mkswap and activate it with swapon. In diskless systems this is not an option so if swap if required then swapping over the network is considered. The two likely scenarios are when blade servers are used as part of a cluster where the form factor or maintenance costs do not allow the use of disks and thin clients. The Linux Terminal Server Project recommends the use of the Network Block Device (NBD) for swap but this is not always an option. There is no guarantee that the network attached storage (NAS) device is running Linux or supports NBD. However, it is likely that it supports NFS so there are users that want support for swapping over NFS despite any performance concern. Some distributions currently carry patches that support swapping over NFS but it would be preferable to support it in the mainline kernel. Patch 1 avoids a stream-specific deadlock that potentially affects TCP. Patch 2 is a small modification to SELinux to avoid using PFMEMALLOC reserves. Patch 3 adds three helpers for filesystems to handle swap cache pages. For example, page_file_mapping() returns page->mapping for file-backed pages and the address_space of the underlying swap file for swap cache pages. Patch 4 adds two address_space_operations to allow a filesystem to pin all metadata relevant to a swapfile in memory. Upon successful activation, the swapfile is marked SWP_FILE and the address space operation ->direct_IO is used for writing and ->readpage for reading in swap pages. Patch 5 notes that patch 3 is bolting filesystem-specific-swapfile-support onto the side and that the default handlers have different information to what is available to the filesystem. This patch refactors the code so that there are generic handlers for each of the new address_space operations. Patch 6 adds an API to allow a vector of kernel addresses to be translated to struct pages and pinned for IO. Patch 7 adds support for using highmem pages for swap by kmapping the pages before calling the direct_IO handler. Patch 8 updates NFS to use the helpers from patch 3 where necessary. Patch 9 avoids setting PF_private on PG_swapcache pages within NFS. Patch 10 implements the new swapfile-related address_space operations for NFS and teaches the direct IO handler how to manage kernel addresses. Patch 11 prevents page allocator recursions in NFS by using GFP_NOIO where appropriate. Patch 12 fixes a NULL pointer dereference that occurs when using swap-over-NFS. With the patches applied, it is possible to mount a swapfile that is on an NFS filesystem. Swap performance is not great with a swap stress test taking roughly twice as long to complete than if the swap device was backed by NBD. This patch: netvm: prevent a stream-specific deadlock It could happen that all !SOCK_MEMALLOC sockets have buffered so much data that we're over the global rmem limit. This will prevent SOCK_MEMALLOC buffers from receiving data, which will prevent userspace from running, which is needed to reduce the buffered data. Fix this by exempting the SOCK_MEMALLOC sockets from the rmem limit. Once this change it applied, it is important that sockets that set SOCK_MEMALLOC do not clear the flag until the socket is being torn down. If this happens, a warning is generated and the tokens reclaimed to avoid accounting errors until the bug is fixed. [davem@davemloft.net: Warning about clearing SOCK_MEMALLOC] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Rik van Riel <riel@redhat.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Neil Brown <neilb@suse.de> Cc: Christoph Hellwig <hch@infradead.org> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: Eric B Munson <emunson@mgebm.net> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Cc: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-07-31 18:42:47 -07:00
Mel Gorman	b4b9e35585	netvm: set PF_MEMALLOC as appropriate during SKB processing In order to make sure pfmemalloc packets receive all memory needed to proceed, ensure processing of pfmemalloc SKBs happens under PF_MEMALLOC. This is limited to a subset of protocols that are expected to be used for writing to swap. Taps are not allowed to use PF_MEMALLOC as these are expected to communicate with userspace processes which could be paged out. [a.p.zijlstra@chello.nl: Ideas taken from various patches] [jslaby@suse.cz: Lock imbalance fix] Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: David S. Miller <davem@davemloft.net> Cc: Neil Brown <neilb@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: Eric B Munson <emunson@mgebm.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Cc: Mel Gorman <mgorman@suse.de> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-07-31 18:42:46 -07:00
Mel Gorman	c93bdd0e03	netvm: allow skb allocation to use PFMEMALLOC reserves Change the skb allocation API to indicate RX usage and use this to fall back to the PFMEMALLOC reserve when needed. SKBs allocated from the reserve are tagged in skb->pfmemalloc. If an SKB is allocated from the reserve and the socket is later found to be unrelated to page reclaim, the packet is dropped so that the memory remains available for page reclaim. Network protocols are expected to recover from this packet loss. [a.p.zijlstra@chello.nl: Ideas taken from various patches] [davem@davemloft.net: Use static branches, coding style corrections] [sebastian@breakpoint.cc: Avoid unnecessary cast, fix !CONFIG_NET build] Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: David S. Miller <davem@davemloft.net> Cc: Neil Brown <neilb@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: Eric B Munson <emunson@mgebm.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Cc: Mel Gorman <mgorman@suse.de> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-07-31 18:42:46 -07:00
Mel Gorman	7cb0240492	netvm: allow the use of __GFP_MEMALLOC by specific sockets Allow specific sockets to be tagged SOCK_MEMALLOC and use __GFP_MEMALLOC for their allocations. These sockets will be able to go below watermarks and allocate from the emergency reserve. Such sockets are to be used to service the VM (iow. to swap over). They must be handled kernel side, exposing such a socket to user-space is a bug. There is a risk that the reserves be depleted so for now, the administrator is responsible for increasing min_free_kbytes as necessary to prevent deadlock for their workloads. [a.p.zijlstra@chello.nl: Original patches] Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: David S. Miller <davem@davemloft.net> Cc: Neil Brown <neilb@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: Eric B Munson <emunson@mgebm.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Cc: Mel Gorman <mgorman@suse.de> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-07-31 18:42:46 -07:00
Mel Gorman	99a1dec70d	net: introduce sk_gfp_atomic() to allow addition of GFP flags depending on the individual socket Introduce sk_gfp_atomic(), this function allows to inject sock specific flags to each sock related allocation. It is only used on allocation paths that may be required for writing pages back to network storage. [davem@davemloft.net: Use sk_gfp_atomic only when necessary] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: David S. Miller <davem@davemloft.net> Cc: Neil Brown <neilb@suse.de> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: Eric B Munson <emunson@mgebm.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Cc: Mel Gorman <mgorman@suse.de> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-07-31 18:42:46 -07:00
Andrew Morton	c255a45805	memcg: rename config variables Sanity: CONFIG_CGROUP_MEM_RES_CTLR -> CONFIG_MEMCG CONFIG_CGROUP_MEM_RES_CTLR_SWAP -> CONFIG_MEMCG_SWAP CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED -> CONFIG_MEMCG_SWAP_ENABLED CONFIG_CGROUP_MEM_RES_CTLR_KMEM -> CONFIG_MEMCG_KMEM [mhocko@suse.cz: fix missed bits] Cc: Glauber Costa <glommer@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-07-31 18:42:43 -07:00
David S. Miller	caacf05e5a	ipv4: Properly purge netdev references on uncached routes. When a device is unregistered, we have to purge all of the references to it that may exist in the entire system. If a route is uncached, we currently have no way of accomplishing this. So create a global list that is scanned when a network device goes down. This mirrors the logic in net/core/dst.c's dst_ifdown(). Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-31 15:06:50 -07:00
David S. Miller	c5038a8327	ipv4: Cache routes in nexthop exception entries. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-31 15:02:02 -07:00
Eric Dumazet	d26b3a7c4b	ipv4: percpu nh_rth_output cache Input path is mostly run under RCU and doesnt touch dst refcnt But output path on forwarding or UDP workloads hits badly dst refcount, and we have lot of false sharing, for example in ipv4_mtu() when reading rt->rt_pmtu Using a percpu cache for nh_rth_output gives a nice performance increase at a small cost. 24 udpflood test on my 24 cpu machine (dummy0 output device) (each process sends 1.000.000 udp frames, 24 processes are started) before : 5.24 s after : 2.06 s For reference, time on linux-3.5 : 6.60 s Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-31 14:41:39 -07:00
Eric Dumazet	54764bb647	ipv4: Restore old dst_free() behavior. commit `404e0a8b6a` (net: ipv4: fix RCU races on dst refcounts) tried to solve a race but added a problem at device/fib dismantle time : We really want to call dst_free() as soon as possible, even if sockets still have dst in their cache. dst_release() calls in free_fib_info_rcu() are not welcomed. Root of the problem was that now we also cache output routes (in nh_rth_output), we must use call_rcu() instead of call_rcu_bh() in rt_free(), because output route lookups are done in process context. Based on feedback and initial patch from David Miller (adding another call_rcu_bh() call in fib, but it appears it was not the right fix) I left the inet_sk_rx_dst_set() helper and added __rcu attributes to nh_rth_output and nh_rth_input to better document what is going on in this code. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-31 14:41:38 -07:00
Thomas Huehn	36323f817a	mac80211: move TX station pointer and restructure TX Remove the control.sta pointer from ieee80211_tx_info to free up sufficient space in the TX skb control buffer for the upcoming Transmit Power Control (TPC). Instead, the pointer is now on the stack in a new control struct that is passed as a function parameter to the drivers' tx method. Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> Signed-off-by: Alina Friedrichsen <x-alina@gmx.net> Signed-off-by: Felix Fietkau <nbd@openwrt.org> [reworded commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-31 16:18:39 +02:00
Eliad Peller	ab09587740	mac80211: add PS flag to bss_conf Currently, ps mode is indicated per device (rather than per interface), which doesn't make a lot of sense. Moreover, there are subtle bugs caused by the inability to indicate ps change along with other changes (e.g. when the AP deauth us, we'd like to indicate CHANGED_PS \| CHANGED_ASSOC, as changing PS before notifying about disassociation will result in null-packets being sent (if IEEE80211_HW_SUPPORTS_DYNAMIC_PS) while the sta is already disconnected.) Keep the current per-device notifications, and add parallel per-vif notifications. In order to keep it simple, the per-device ps and the per-vif ps are orthogonal - the per-vif ps configuration is determined only by the user configuration (enable/disable) and the connection state, and is not affected by other vifs state and (temporary) dynamic_ps/offchannel operations (unlike per-device ps). Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-31 16:11:04 +02:00
Eric Dumazet	0c7462a235	ipv4: remove rt_cache_rebuild_count After IP route cache removal, rt_cache_rebuild_count is no longer used. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-30 14:53:22 -07:00
Eric Dumazet	404e0a8b6a	net: ipv4: fix RCU races on dst refcounts commit `c6cffba4ff` (ipv4: Fix input route performance regression.) added various fatal races with dst refcounts. crashes happen on tcp workloads if routes are added/deleted at the same time. The dst_free() calls from free_fib_info_rcu() are clearly racy. We need instead regular dst refcounting (dst_release()) and make sure dst_release() is aware of RCU grace periods : Add DST_RCU_FREE flag so that dst_release() respects an RCU grace period before dst destruction for cached dst Introduce a new inet_sk_rx_dst_set() helper, using atomic_inc_not_zero() to make sure we dont increase a zero refcount (On a dst currently waiting an rcu grace period before destruction) rt_cache_route() must take a reference on the new cached route, and release it if was not able to install it. With this patch, my machines survive various benchmarks. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-30 14:53:22 -07:00
Eric Dumazet	c7109986db	ipv6: Early TCP socket demux This is the IPv6 missing bits for infrastructure added in commit `41063e9dd1` (ipv4: Early TCP socket demux.) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-26 15:50:39 -07:00
David S. Miller	c6cffba4ff	ipv4: Fix input route performance regression. With the routing cache removal we lost the "noref" code paths on input, and this can kill some routing workloads. Reinstate the noref path when we hit a cached route in the FIB nexthops. With help from Eric Dumazet. Reported-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-26 15:50:39 -07:00
Linus Torvalds	3c4cfadef6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking changes from David S Miller: 1) Remove the ipv4 routing cache. Now lookups go directly into the FIB trie and use prebuilt routes cached there. No more garbage collection, no more rDOS attacks on the routing cache. Instead we now get predictable and consistent performance, no matter what the pattern of traffic we service. This has been almost 2 years in the making. Special thanks to Julian Anastasov, Eric Dumazet, Steffen Klassert, and others who have helped along the way. I'm sure that with a change of this magnitude there will be some kind of fallout, but such things ought the be simple to fix at this point. Luckily I'm not European so I'll be around all of August to fix things :-) The major stages of this work here are each fronted by a forced merge commit whose commit message contains a top-level description of the motivations and implementation issues. 2) Pre-demux of established ipv4 TCP sockets, saves a route demux on input. 3) TCP SYN/ACK performance tweaks from Eric Dumazet. 4) Add namespace support for netfilter L4 conntrack helpers, from Gao Feng. 5) Add config mechanism for Energy Efficient Ethernet to ethtool, from Yuval Mintz. 6) Remove quadratic behavior from /proc/net/unix, from Eric Dumazet. 7) Support for connection tracker helpers in userspace, from Pablo Neira Ayuso. 8) Allow userspace driven TX load balancing functions in TEAM driver, from Jiri Pirko. 9) Kill off NLMSG_PUT and RTA_PUT macros, more gross stuff with embedded gotos. 10) TCP Small Queues, essentially minimize the amount of TCP data queued up in the packet scheduler layer. Whereas the existing BQL (Byte Queue Limits) limits the pkt_sched --> netdevice queuing levels, this controls the TCP --> pkt_sched queueing levels. From Eric Dumazet. 11) Reduce the number of get_page/put_page ops done on SKB fragments, from Alexander Duyck. 12) Implement protection against blind resets in TCP (RFC 5961), from Eric Dumazet. 13) Support the client side of TCP Fast Open, basically the ability to send data in the SYN exchange, from Yuchung Cheng. Basically, the sender queues up data with a sendmsg() call using MSG_FASTOPEN, then they do the connect() which emits the queued up fastopen data. 14) Avoid all the problems we get into in TCP when timers or PMTU events hit a locked socket. The TCP Small Queues changes added a tcp_release_cb() that allows us to queue work up to the release_sock() caller, and that's what we use here too. From Eric Dumazet. 15) Zero copy on TX support for TUN driver, from Michael S. Tsirkin. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1870 commits) genetlink: define lockdep_genl_is_held() when CONFIG_LOCKDEP r8169: revert "add byte queue limit support". ipv4: Change rt->rt_iif encoding. net: Make skb->skb_iif always track skb->dev ipv4: Prepare for change of rt->rt_iif encoding. ipv4: Remove all RTCF_DIRECTSRC handliing. ipv4: Really ignore ICMP address requests/replies. decnet: Don't set RTCF_DIRECTSRC. net/ipv4/ip_vti.c: Fix __rcu warnings detected by sparse. ipv4: Remove redundant assignment rds: set correct msg_namelen openvswitch: potential NULL deref in sample() tcp: dont drop MTU reduction indications bnx2x: Add new 57840 device IDs tcp: avoid oops in tcp_metrics and reset tcpm_stamp niu: Change niu_rbr_fill() to use unlikely() to check niu_rbr_add_page() return value niu: Fix to check for dma mapping errors. net: Fix references to out-of-scope variables in put_cmsg_compat() net: ethernet: davinci_emac: add pm_runtime support net: ethernet: davinci_emac: Remove unnecessary #include ...	2012-07-24 10:01:50 -07:00
David S. Miller	13378cad02	ipv4: Change rt->rt_iif encoding. On input packet processing, rt->rt_iif will be zero if we should use skb->dev->ifindex. Since we access rt->rt_iif consistently via inet_iif(), that is the only spot whose interpretation have to adjust. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-23 16:36:27 -07:00
David S. Miller	92101b3b2e	ipv4: Prepare for change of rt->rt_iif encoding. Use inet_iif() consistently, and for TCP record the input interface of cached RX dst in inet sock. rt->rt_iif is going to be encoded differently, so that we can legitimately cache input routes in the FIB info more aggressively. When the input interface is "use SKB device index" the rt->rt_iif will be set to zero. This forces us to move the TCP RX dst cache installation into the ipv4 specific code, and as well it should since doing the route caching for ipv6 is pointless at the moment since it is not inspected in the ipv6 input paths yet. Also, remove the unlikely on dst->obsolete, all ipv4 dsts have obsolete set to a non-zero value to force invocation of the check callback. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-23 16:36:26 -07:00
Linus Torvalds	a66d2c8f7e	Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull the big VFS changes from Al Viro: "This one is big and changes quite a few things around VFS. What's in there: - the first of two really major architecture changes - death to open intents. The former is finally there; it was very long in making, but with Miklos getting through really hard and messy final push in fs/namei.c, we finally have it. Unlike his variant, this one doesn't introduce struct opendata; what we have instead is ->atomic_open() taking preallocated struct file * and passing everything via its fields. Instead of returning struct file , it returns -E... on error, 0 on success and 1 in "deal with it yourself" case (e.g. symlink found on server, etc.). See comments before fs/namei.c:atomic_open(). That made a lot of goodies finally possible and quite a few are in that pile: ->lookup(), ->d_revalidate() and ->create() do not get struct nameidata anymore; ->lookup() and ->d_revalidate() get lookup flags instead, ->create() gets "do we want it exclusive" flag. With the introduction of new helper (kern_path_locked()) we are rid of all struct nameidata instances outside of fs/namei.c; it's still visible in namei.h, but not for long. Come the next cycle, declaration will move either to fs/internal.h or to fs/namei.c itself. [me, miklos, hch] - The second major change: behaviour of final fput(). Now we have __fput() done without any locks held by caller and not from deep in call stack. That obviously lifts a lot of constraints on the locking in there. Moreover, it's legal now to call fput() from atomic contexts (which has immediately simplified life for aio.c). We also don't need anti-recursion logics in __scm_destroy() anymore. There is a price, though - the damn thing has become partially asynchronous. For fput() from normal process we are guaranteed that pending __fput() will be done before the caller returns to userland, exits or gets stopped for ptrace. For kernel threads and atomic contexts it's done via schedule_work(), so theoretically we might need a way to make sure it's finished; so far only one such place had been found, but there might be more. There's flush_delayed_fput() (do all pending __fput()) and there's __fput_sync() (fput() analog doing __fput() immediately). I hope we won't need them often; see warnings in fs/file_table.c for details. [me, based on task_work series from Oleg merged last cycle] - sync series from Jan - large part of "death to sync_supers()" work from Artem; the only bits missing here are exofs and ext4 ones. As far as I understand, those are going via the exofs and ext4 trees resp.; once they are in, we can put ->write_super() to the rest, along with the thread calling it. - preparatory bits from unionmount series (from dhowells). - assorted cleanups and fixes all over the place, as usual. This is not the last pile for this cycle; there's at least jlayton's ESTALE work and fsfreeze series (the latter - in dire need of fixes, so I'm not sure it'll make the cut this cycle). I'll probably throw symlink/hardlink restrictions stuff from Kees into the next pile, too. Plus there's a lot of misc patches I hadn't thrown into that one - it's large enough as it is..." * 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (127 commits) ext4: switch EXT4_IOC_RESIZE_FS to mnt_want_write_file() btrfs: switch btrfs_ioctl_balance() to mnt_want_write_file() switch dentry_open() to struct path, make it grab references itself spufs: shift dget/mntget towards dentry_open() zoran: don't bother with struct file * in zoran_map ecryptfs: don't reinvent the wheels, please - use struct completion don't expose I_NEW inodes via dentry->d_inode tidy up namei.c a bit unobfuscate follow_up() a bit ext3: pass custom EOF to generic_file_llseek_size() ext4: use core vfs llseek code for dir seeks vfs: allow custom EOF in generic_file_llseek code vfs: Avoid unnecessary WB_SYNC_NONE writeback during sys_sync and reorder sync passes vfs: Remove unnecessary flushing of block devices vfs: Make sys_sync writeout also block device inodes vfs: Create function for iterating over block devices vfs: Reorder operations during sys_sync quota: Move quota syncing to ->sync_fs method quota: Split dquot_quota_sync() to writeback and cache flushing part vfs: Move noop_backing_dev_info check from sync into writeback ...	2012-07-23 12:27:27 -07:00
Eric Dumazet	563d34d057	tcp: dont drop MTU reduction indications ICMP messages generated in output path if frame length is bigger than mtu are actually lost because socket is owned by user (doing the xmit) One example is the ipgre_tunnel_xmit() calling icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu)); We had a similar case fixed in commit `a34a101e1e` (ipv6: disable GSO on sockets hitting dst_allfrag). Problem of such fix is that it relied on retransmit timers, so short tcp sessions paid a too big latency increase price. This patch uses the tcp_release_cb() infrastructure so that MTU reduction messages (ICMP messages) are not lost, and no extra delay is added in TCP transmits. Reported-by: Maciej Żenczykowski <maze@google.com> Diagnosed-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Tore Anderson <tore@fud.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-23 00:58:46 -07:00
David S. Miller	5e9965c15b	Merge branch 'kill_rtcache' The ipv4 routing cache is non-deterministic, performance wise, and is subject to reasonably easy to launch denial of service attacks. The routing cache works great for well behaved traffic, and the world was a much friendlier place when the tradeoffs that led to the routing cache's design were considered. What it boils down to is that the performance of the routing cache is a product of the traffic patterns seen by a system rather than being a product of the contents of the routing tables. The former of which is controllable by external entitites. Even for "well behaved" legitimate traffic, high volume sites can see hit rates in the routing cache of only ~%10. The general flow of this patch series is that first the routing cache is removed. We build a completely new rtable entry every lookup request. Next we make some simplifications due to the fact that removing the routing cache causes several members of struct rtable to become no longer necessary. Then we need to make some amends such that we can legally cache pre-constructed routes in the FIB nexthops. Firstly, we need to invalidate routes which are hit with nexthop exceptions. Secondly we have to change the semantics of rt->rt_gateway such that zero means that the destination is on-link and non-zero otherwise. Now that the preparations are ready, we start caching precomputed routes in the FIB nexthops. Output and input routes need different kinds of care when determining if we can legally do such caching or not. The details are in the commit log messages for those changes. The patch series then winds down with some more struct rtable simplifications and other tidy ups that remove unnecessary overhead. On a SPARC-T3 output route lookups are ~876 cycles. Input route lookups are ~1169 cycles with rpfilter disabled, and about ~1468 cycles with rpfilter enabled. These measurements were taken with the kbench_mod test module in the net_test_tools GIT tree: git://git.kernel.org/pub/scm/linux/kernel/git/davem/net_test_tools.git That GIT tree also includes a udpflood tester tool and stresses route lookups on packet output. For example, on the same SPARC-T3 system we can run: time ./udpflood -l 10000000 10.2.2.11 with routing cache: real 1m21.955s user 0m6.530s sys 1m15.390s without routing cache: real 1m31.678s user 0m6.520s sys 1m25.140s Performance undoubtedly can easily be improved further. For example fib_table_lookup() performs a lot of excessive computations with all the masking and shifting, some of it conditionalized to deal with edge cases. Also, Eric's no-ref optimization for input route lookups can be re-instated for the FIB nexthop caching code path. I would be really pleased if someone would work on that. In fact anyone suitable motivated can just fire up perf on the loading of the test net_test_tools benchmark kernel module. I spend much of my time going: bash# perf record insmod ./kbench_mod.ko dst=172.30.42.22 src=74.128.0.1 iif=2 bash# perf report Thanks to helpful feedback from Joe Perches, Eric Dumazet, Ben Hutchings, and others. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-22 17:04:15 -07:00
Al Viro	6120d3dbb1	get rid of ->scm_work_list recursion in __scm_destroy() will be cut by delaying final fput() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-22 23:58:00 +04:00
John Fastabend	406a3c638c	net: netprio_cgroup: rework update socket logic Instead of updating the sk_cgrp_prioidx struct field on every send this only updates the field when a task is moved via cgroup infrastructure. This allows sockets that may be used by a kernel worker thread to be managed. For example in the iscsi case today a user can put iscsid in a netprio cgroup and control traffic will be sent with the correct sk_cgrp_prioidx value set but as soon as data is sent the kernel worker thread isssues a send and sk_cgrp_prioidx is updated with the kernel worker threads value which is the default case. It seems more correct to only update the field when the user explicitly sets it via control group infrastructure. This allows the users to manage sockets that may be used with other threads. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-22 12:44:01 -07:00
Neil Horman	5aa93bcf66	sctp: Implement quick failover draft from tsvwg I've seen several attempts recently made to do quick failover of sctp transports by reducing various retransmit timers and counters. While its possible to implement a faster failover on multihomed sctp associations, its not particularly robust, in that it can lead to unneeded retransmits, as well as false connection failures due to intermittent latency on a network. Instead, lets implement the new ietf quick failover draft found here: http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05 This will let the sctp stack identify transports that have had a small number of errors, and avoid using them quickly until their reliability can be re-established. I've tested this out on two virt guests connected via multiple isolated virt networks and believe its in compliance with the above draft and works well. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: Sridhar Samudrala <sri@us.ibm.com> CC: "David S. Miller" <davem@davemloft.net> CC: linux-sctp@vger.kernel.org CC: joe@perches.com Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-22 12:13:46 -07:00
David S. Miller	0bb4087cbe	ipv4: Fix neigh lookup keying over loopback/point-to-point devices. We were using a special key "0" for all loopback and point-to-point device neigh lookups under ipv4, but we wouldn't use that special key for the neigh creation. So basically we'd make a new neigh at each and every lookup :-) This special case to use only one neigh for these device types is of dubious value, so just remove it entirely. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 16:06:10 -07:00
David S. Miller	2860583fe8	ipv4: Kill rt->fi It's not really needed. We only grabbed a reference to the fib_info for the sake of fib_info local metrics. However, fib_info objects are freed using RCU, as are therefore their private metrics (if any). We would have triggered a route cache flush if we eliminated a reference to a fib_info object in the routing tables. Therefore, any existing cached routes will first check and see that they have been invalidated before an errant reference to these metric values would occur. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 13:40:07 -07:00
David S. Miller	9917e1e876	ipv4: Turn rt->rt_route_iif into rt->rt_is_input. That is this value's only use, as a boolean to indicate whether a route is an input route or not. So implement it that way, using a u16 gap present in the struct already. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 13:40:02 -07:00
David S. Miller	4fd551d7be	ipv4: Kill rt->rt_oif Never actually used. It was being set on output routes to the original OIF specified in the flow key used for the lookup. Adjust the only user, ipmr_rt_fib_lookup(), for greater correctness of the flowi4_oif and flowi4_iif values, thanks to feedback from Julian Anastasov. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 13:38:34 -07:00
David S. Miller	ba3f7f04ef	ipv4: Kill FLOWI_FLAG_RT_NOCACHE and associated code. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 13:36:54 -07:00
David S. Miller	d2d68ba9fe	ipv4: Cache input routes in fib_info nexthops. Caching input routes is slightly simpler than output routes, since we don't need to be concerned with nexthop exceptions. (locally destined, and routed packets, never trigger PMTU events or redirects that will be processed by us). However, we have to elide caching for the DIRECTSRC and non-zero itag cases. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 13:36:40 -07:00
David S. Miller	f2bb4bedf3	ipv4: Cache output routes in fib_info nexthops. If we have an output route that lacks nexthop exceptions, we can cache it in the FIB info nexthop. Such routes will have DST_HOST cleared because such routes refer to a family of destinations, rather than just one. The sequence of the handling of exceptions during route lookup is adjusted to make the logic work properly. Before we allocate the route, we lookup the exception. Then we know if we will cache this route or not, and therefore whether DST_HOST should be set on the allocated route. Then we use DST_HOST to key off whether we should store the resulting route, during rt_set_nexthop(), in the FIB nexthop cache. With help from Eric Dumazet. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 13:36:16 -07:00
David S. Miller	ceb3320610	ipv4: Kill routes during PMTU/redirect updates. Mark them obsolete so there will be a re-lookup to fetch the FIB nexthop exception info. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 13:31:22 -07:00
David S. Miller	f5b0a87436	net: Document dst->obsolete better. Add a big comment explaining how the field works, and use defines instead of magic constants for the values assigned to it. Suggested by Joe Perches. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 13:31:21 -07:00
David S. Miller	f8126f1d51	ipv4: Adjust semantics of rt->rt_gateway. In order to allow prefixed routes, we have to adjust how rt_gateway is set and interpreted. The new interpretation is: 1) rt_gateway == 0, destination is on-link, nexthop is iph->daddr 2) rt_gateway != 0, destination requires a nexthop gateway Abstract the fetching of the proper nexthop value using a new inline helper, rt_nexthop(), as suggested by Joe Perches. Signed-off-by: David S. Miller <davem@davemloft.net> Tested-by: Vijay Subramanian <subramanian.vijay@gmail.com>	2012-07-20 13:31:20 -07:00
David S. Miller	f1ce3062c5	ipv4: Remove 'rt_dst' from 'struct rtable' Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 13:31:19 -07:00
David Miller	b48698895d	ipv4: Remove 'rt_mark' from 'struct rtable' Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 13:31:18 -07:00
David Miller	d6c0a4f609	ipv4: Kill 'rt_src' from 'struct rtable' Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 13:31:00 -07:00
David Miller	1a00fee4ff	ipv4: Remove rt_key_{src,dst,tos} from struct rtable. They are always used in contexts where they can be reconstituted, or where the finally resolved rt->rt_{src,dst} is semantically equivalent. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 13:30:59 -07:00
David Miller	38a424e465	ipv4: Kill ip_route_input_noref(). The "noref" argument to ip_route_input_common() is now always ignored because we do not cache routes, and in that case we must always grab a reference to the resulting 'dst'. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 13:30:59 -07:00
David S. Miller	89aef8921b	ipv4: Delete routing cache. The ipv4 routing cache is non-deterministic, performance wise, and is subject to reasonably easy to launch denial of service attacks. The routing cache works great for well behaved traffic, and the world was a much friendlier place when the tradeoffs that led to the routing cache's design were considered. What it boils down to is that the performance of the routing cache is a product of the traffic patterns seen by a system rather than being a product of the contents of the routing tables. The former of which is controllable by external entitites. Even for "well behaved" legitimate traffic, high volume sites can see hit rates in the routing cache of only ~%10. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 13:30:27 -07:00
Jiri Pirko	df4ab5b3c2	net: rename bond_queue_mapping to slave_dev_queue_mapping As this is going to be used not only by bonding. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 11:07:00 -07:00
Jiri Pirko	d40156aa5e	rtnl: allow to specify different num for rx and tx queue count Also cut out unused function parameters and possible err in return value. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 11:06:59 -07:00
Eric Dumazet	6f458dfb40	tcp: improve latencies of timer triggered events Modern TCP stack highly depends on tcp_write_timer() having a small latency, but current implementation doesn't exactly meet the expectations. When a timer fires but finds the socket is owned by the user, it rearms itself for an additional delay hoping next run will be more successful. tcp_write_timer() for example uses a 50ms delay for next try, and it defeats many attempts to get predictable TCP behavior in term of latencies. Use the recently introduced tcp_release_cb(), so that the user owning the socket will call various handlers right before socket release. This will permit us to post a followup patch to address the tcp_tso_should_defer() syndrome (some deferred packets have to wait RTO timer to be transmitted, while cwnd should allow us to send them sooner) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Cc: H.K. Jerry Chu <hkchu@google.com> Cc: John Heffner <johnwheffner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 10:59:41 -07:00
Eric Dumazet	5815d5e7aa	tcp: use hash_32() in tcp_metrics Fix a missing roundup_pow_of_two(), since tcpmhash_entries is not guaranteed to be a power of two. Uses hash_32() instead of custom hash. tcpmhash_entries should be an unsigned int. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-20 10:59:41 -07:00
John W. Linville	90b90f60c4	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2012-07-20 12:30:48 -04:00
David S. Miller	abaa72d7fd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c	2012-07-19 11:17:30 -07:00
Yuchung Cheng	67da22d23f	net-tcp: Fast Open client - cookie-less mode In trusted networks, e.g., intranet, data-center, the client does not need to use Fast Open cookie to mitigate DoS attacks. In cookie-less mode, sendmsg() with MSG_FASTOPEN flag will send SYN-data regardless of cookie availability. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-19 11:02:03 -07:00
Yuchung Cheng	aab4874355	net-tcp: Fast Open client - detecting SYN-data drops On paths with firewalls dropping SYN with data or experimental TCP options, Fast Open connections will have experience SYN timeout and bad performance. The solution is to track such incidents in the cookie cache and disables Fast Open temporarily. Since only the original SYN includes data and/or Fast Open option, the SYN-ACK has some tell-tale sign (tcp_rcv_fastopen_synack()) to detect such drops. If a path has recurring Fast Open SYN drops, Fast Open is disabled for 2^(recurring_losses) minutes starting from four minutes up to roughly one and half day. sendmsg with MSG_FASTOPEN flag will succeed but it behaves as connect() then write(). Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-19 11:02:03 -07:00
Yuchung Cheng	cf60af03ca	net-tcp: Fast Open client - sendmsg(MSG_FASTOPEN) sendmsg() (or sendto()) with MSG_FASTOPEN is a combo of connect(2) and write(2). The application should replace connect() with it to send data in the opening SYN packet. For blocking socket, sendmsg() blocks until all the data are buffered locally and the handshake is completed like connect() call. It returns similar errno like connect() if the TCP handshake fails. For non-blocking socket, it returns the number of bytes queued (and transmitted in the SYN-data packet) if cookie is available. If cookie is not available, it transmits a data-less SYN packet with Fast Open cookie request option and returns -EINPROGRESS like connect(). Using MSG_FASTOPEN on connecting or connected socket will result in simlar errno like repeating connect() calls. Therefore the application should only use this flag on new sockets. The buffer size of sendmsg() is independent of the MSS of the connection. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-19 11:02:03 -07:00
Yuchung Cheng	783237e8da	net-tcp: Fast Open client - sending SYN-data This patch implements sending SYN-data in tcp_connect(). The data is from tcp_sendmsg() with flag MSG_FASTOPEN (implemented in a later patch). The length of the cookie in tcp_fastopen_req, init'd to 0, controls the type of the SYN. If the cookie is not cached (len==0), the host sends data-less SYN with Fast Open cookie request option to solicit a cookie from the remote. If cookie is not available (len > 0), the host sends a SYN-data with Fast Open cookie option. If cookie length is negative, the SYN will not include any Fast Open option (for fall back operations). To deal with middleboxes that may drop SYN with data or experimental TCP option, the SYN-data is only sent once. SYN retransmits do not include data or Fast Open options. The connection will fall back to regular TCP handshake. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-19 11:02:03 -07:00
Yuchung Cheng	1fe4c481ba	net-tcp: Fast Open client - cookie cache With help from Eric Dumazet, add Fast Open metrics in tcp metrics cache. The basic ones are MSS and the cookies. Later patch will cache more to handle unfriendly middleboxes. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-19 10:55:36 -07:00
Yuchung Cheng	2100c8d2d9	net-tcp: Fast Open base This patch impelements the common code for both the client and server. 1. TCP Fast Open option processing. Since Fast Open does not have an option number assigned by IANA yet, it shares the experiment option code 254 by implementing draft-ietf-tcpm-experimental-options with a 16 bits magic number 0xF989. This enables global experiments without clashing the scarce(2) experimental options available for TCP. When the draft status becomes standard (maybe), the client should switch to the new option number assigned while the server supports both numbers for transistion. 2. The new sysctl tcp_fastopen 3. A place holder init function Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-19 10:55:36 -07:00
David S. Miller	d8f1641b58	net: Fix warnings in dst_ops.h include/net/dst_ops.h:28:20: warning: ‘struct sock’ declared inside parameter list Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-19 10:43:03 -07:00
Eric Dumazet	be9f4a44e7	ipv4: tcp: remove per net tcp_sock tcp_v4_send_reset() and tcp_v4_send_ack() use a single socket per network namespace. This leads to bad behavior on multiqueue NICS, because many cpus contend for the socket lock and once socket lock is acquired, extra false sharing on various socket fields slow down the operations. To better resist to attacks, we use a percpu socket. Each cpu can run without contention, using appropriate memory (local node) Additional features : 1) We also mirror the queue_mapping of the incoming skb, so that answers use the same queue if possible. 2) Setting SOCK_USE_WRITE_QUEUE socket flag speedup sock_wfree() 3) We now limit the number of in-flight RST/ACK [1] packets per cpu, instead of per namespace, and we honor the sysctl_wmem_default limit dynamically. (Prior to this patch, sysctl_wmem_default value was copied at boot time, so any further change would not affect tcp_sock limit) [1] These packets are only generated when no socket was matched for the incoming packet. Reported-by: Bill Sommerfeld <wsommerfeld@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-19 10:35:30 -07:00
Julian Anastasov	aee06da672	ipv4: use seqlock for nh_exceptions Use global seqlock for the nh_exceptions. Call fnhe_oldest with the right hash chain. Correct the diff value for dst_set_expires. v2: after suggestions from Eric Dumazet: * get rid of spin lock fnhe_lock, rearrange update_or_create_fnhe * continue daddr search in rt_bind_exception v3: * remove the daddr check before seqlock in rt_bind_exception * restart lookup in rt_bind_exception on detected seqlock change, as suggested by David Miller Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-19 10:30:14 -07:00
John W. Linville	0cd06647b7	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2012-07-18 14:53:10 -04:00
Eric Dumazet	ddbe503203	ipv6: add ipv6_addr_hash() helper Introduce ipv6_addr_hash() helper doing a XOR on all bits of an IPv6 address, with an optimized x86_64 version. Use it in flow dissector, as suggested by Andrew McGregor, to reduce hash collision probabilities in fq_codel (and other users of flow dissector) Use it in ip6_tunnel.c and use more bit shuffling, as suggested by David Laight, as existing hash was ignoring most of them. Use it in sunrpc and use more bit shuffling, using hash_32(). Use it in net/ipv6/addrconf.c, using hash_32() as well. As a cleanup, use it in net/ipv4/tcp_metrics.c Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrew McGregor <andrewmcgr@gmail.com> Cc: Dave Taht <dave.taht@gmail.com> Cc: Tom Herbert <therbert@google.com> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-18 11:28:46 -07:00
Saurabh	eb8637cd4a	net/ipv4: VTI support rx-path hook in xfrm4_mode_tunnel. Incorporated David and Steffen's comments. Add hook for rx-path xfmr4_mode_tunnel for VTI tunnel module. Signed-off-by: Saurabh Mohan <saurabh.mohan@vyatta.com> Reviewed-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-18 09:36:12 -07:00
Eric Dumazet	d3818c92af	ipv6: fix inet6_csk_xmit() We should provide to inet6_csk_route_socket a struct flowi6 pointer, so that net6_csk_xmit() works correctly instead of sending garbage. Also add some consts Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-18 08:59:58 -07:00
John W. Linville	707be0ae13	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next	2012-07-17 15:07:31 -04:00
David S. Miller	a6ff1a2f1e	Merge branch 'nexthop_exceptions' These patches implement the final mechanism necessary to really allow us to go without the route cache in ipv4. We need a place to have long-term storage of PMTU/redirect information which is independent of the routes themselves, yet does not get us back into a situation where we have to write to metrics or anything like that. For this we use an "next-hop exception" table in the FIB nexthops. The one thing I desperately want to avoid is having to create clone routes in the FIB trie for this purpose, because that is very expensive. However, I'm willing to entertain such an idea later if this current scheme proves to have downsides that the FIB trie variant would not have. In order to accomodate this any such scheme, we need to be able to produce a full flow key at PMTU/redirect time. That required an adjustment of the interface call-sites used to propagate these events. For a PMTU/redirect with a fully specified socket, we pass that socket and use it to produce the flow key. Otherwise we use a passed in SKB to formulate the key. There are two cases that need to be distinguished, ICMP message processing (in which case the IP header is at skb->data) and output packet processing (mostly tunnels, and in all such cases the IP header is at ip_hdr(skb)). We also have to make the code able to handle the case where the dst itself passed into the dst_ops->{update_pmtu,redirect} method is invalidated. This matters for calls from sockets that have cached that route. We provide a inet{,6} helper function for this purpose, and edit SCTP specially since it caches routes at the transport rather than socket level. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-17 10:48:26 -07:00
David S. Miller	4895c771c7	ipv4: Add FIB nexthop exceptions. In a regime where we have subnetted route entries, we need a way to store persistent storage about destination specific learned values such as redirects and PMTU values. This is implemented here via nexthop exceptions. The initial implementation is a 2048 entry hash table with relaiming starting at chain length 5. A more sophisticated scheme can be devised if that proves necessary. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-17 08:48:50 -07:00
David S. Miller	6700c2709c	net: Pass optional SKB and SK arguments to dst_ops->{update_pmtu,redirect}() This will be used so that we can compose a full flow key. Even though we have a route in this context, we need more. In the future the routes will be without destination address, source address, etc. keying. One ipv4 route will cover entire subnets, etc. In this environment we have to have a way to possess persistent storage for redirects and PMTU information. This persistent storage will exist in the FIB tables, and that's why we'll need to be able to rebuild a full lookup flow key here. Using that flow key will do a fib_lookup() and create/update the persistent entry. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-17 03:29:28 -07:00
Luis R. Rodriguez	57b5ce072e	cfg80211: add cellular base station regulatory hint support Cellular base stations can provide hints to cfg80211 about where they think we are. This can be done for example on a cell phone. To enable these hints we simply allow them through as user regulatory hints but we allow userspace to clasify the hint as either coming directly from the user or coming from a cellular base station. This option is only available when you enable CONFIG_CFG80211_CERTIFICATION_ONUS. The base station hints themselves will not be processed by the core unless at least one device on the system supports this feature. Signed-off-by: Luis R. Rodriguez <mcgrof@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-17 12:16:39 +02:00
Lin Ming	9e33ce453f	ipvs: fix oops on NAT reply in br_nf context IPVS should not reset skb->nf_bridge in FORWARD hook by calling nf_reset for NAT replies. It triggers oops in br_nf_forward_finish. [ 579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 [ 579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112 [ 579.781792] PGD 218f9067 PUD 0 [ 579.781865] Oops: 0000 [#1] SMP [ 579.781945] CPU 0 [ 579.781983] Modules linked in: [ 579.782047] [ 579.782080] [ 579.782114] Pid: 4644, comm: qemu Tainted: G W 3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard /30E8 [ 579.782300] RIP: 0010:[<ffffffff817b1ca5>] [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112 [ 579.782455] RSP: 0018:ffff88007b003a98 EFLAGS: 00010287 [ 579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a [ 579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00 [ 579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90 [ 579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02 [ 579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000 [ 579.783177] FS: 0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70 [ 579.783306] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b [ 579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0 [ 579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760) [ 579.783919] Stack: [ 579.783959] ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00 [ 579.784110] ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7 [ 579.784260] ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0 [ 579.784477] Call Trace: [ 579.784523] <IRQ> [ 579.784562] [ 579.784603] [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8 [ 579.784707] [<ffffffff81704b58>] nf_iterate+0x47/0x7d [ 579.784797] [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae [ 579.784906] [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102 [ 579.784995] [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae [ 579.785175] [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b [ 579.785179] [<ffffffff817ac417>] __br_forward+0x97/0xa2 [ 579.785179] [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257 [ 579.785179] [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb [ 579.785179] [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1 [ 579.785179] [<ffffffff81704b58>] nf_iterate+0x47/0x7d [ 579.785179] [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44 [ 579.785179] [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102 [ 579.785179] [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44 [ 579.785179] [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54 [ 579.785179] [<ffffffff817ad62a>] br_handle_frame+0x213/0x229 [ 579.785179] [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257 [ 579.785179] [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1 [ 579.785179] [<ffffffff816e69fc>] process_backlog+0x99/0x1e2 [ 579.785179] [<ffffffff816e6800>] net_rx_action+0xdf/0x242 [ 579.785179] [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0 [ 579.785179] [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c [ 579.785179] [<ffffffff8188812c>] call_softirq+0x1c/0x30 The steps to reproduce as follow, 1. On Host1, setup brige br0(192.168.1.106) 2. Boot a kvm guest(192.168.1.105) on Host1 and start httpd 3. Start IPVS service on Host1 ipvsadm -A -t 192.168.1.106:80 -s rr ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m 4. Run apache benchmark on Host2(192.168.1.101) ab -n 1000 http://192.168.1.106/ ip_vs_reply4 ip_vs_out handle_response ip_vs_notrack nf_reset() { skb->nf_bridge = NULL; } Actually, IPVS wants in this case just to replace nfct with untracked version. So replace the nf_reset(skb) call in ip_vs_notrack() with a nf_conntrack_put(skb->nfct) call. Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-07-17 12:00:46 +02:00
Thomas Pedersen	84f10708f7	cfg80211: support TX error rate CQM Let the user configure serveral TX error conection quality monitoring parameters: % error rate, survey interval, and # of attempted packets. On exceeding the TX failure rate over the given interval, the driver will send a CQM notify event with the actual TX failure rate and packets attempted. Signed-off-by: Thomas Pedersen <c_tpeder@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-17 11:57:23 +02:00
Eric Dumazet	282f23c6ee	tcp: implement RFC 5961 3.2 Implement the RFC 5691 mitigation against Blind Reset attack using RST bit. Idea is to validate incoming RST sequence, to match RCV.NXT value, instead of previouly accepted window : (RCV.NXT <= SEG.SEQ < RCV.NXT+RCV.WND) If sequence is in window but not an exact match, send a "challenge ACK", so that the other part can resend an RST with the appropriate sequence. Add a new sysctl, tcp_challenge_ack_limit, to limit number of challenge ACK sent per second. Add a new SNMP counter to count number of challenge acks sent. (netstat -s \| grep TCPChallengeACK) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Kiran Kumar Kella <kkiran@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-17 01:36:20 -07:00
Andrey Vagin	51d7cccf07	net: make sock diag per-namespace Before this patch sock_diag works for init_net only and dumps information about sockets from all namespaces. This patch expands sock_diag for all name-spaces. It creates a netlink kernel socket for each netns and filters data during dumping. v2: filter accoding with netns in all places remove an unused variable. Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Pavel Emelyanov <xemul@parallels.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Andrew Vagin <avagin@openvz.org> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-16 22:31:34 -07:00
Greg Kroah-Hartman	467a3ca5ca	Merge branch 'v3.6-rc7' into tty-next This is to sync up on Linus's branch to get the other tty and core changes. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-07-16 12:32:42 -07:00
David S. Miller	02f3d4ce9e	sctp: Adjust PMTU updates to accomodate route invalidation. This adjusts the call to dst_ops->update_pmtu() so that we can transparently handle the fact that, in the future, the dst itself can be invalidated by the PMTU update (when we have non-host routes cached in sockets). Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-16 03:57:14 -07:00
David S. Miller	35ad9b9cf7	ipv6: Add helper inet6_csk_update_pmtu(). This is the ipv6 version of inet_csk_update_pmtu(). Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-16 03:44:56 -07:00
David S. Miller	80d0a69fc5	ipv4: Add helper inet_csk_update_pmtu(). This abstracts away the call to dst_ops->update_pmtu() so that we can transparently handle the fact that, in the future, the dst itself can be invalidated by the PMTU update (when we have non-host routes cached in sockets). So we try to rebuild the socket cached route after the method invocation if necessary. This isn't used by SCTP because it needs to cache dsts per-transport, and thus will need it's own local version of this helper. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-16 03:28:06 -07:00
Mat Martineau	c20f8e35ca	Bluetooth: Use tx window from config response for ack timing This change addresses an L2CAP ERTM throughput problem when a remote device does not fully utilize the available transmit window. The L2CAP ERTM transmit window size determines the maximum number of unacked frames that may be outstanding at any time. It is configured separately for each direction of an ERTM connection. Each side sends a configuration request with a tx_win field indicating how many unacked frames it is capable of receiving before sending an ack. The configuration response's tx_win field shows how many frames the transmitter will actually send before waiting for an ack. It's important to trace both the actual transmit window (to check for validity of incoming frames) and the number of frames that the transmitter will send before waiting (to send acks at the appropriate time). Now there are separate tx_win and ack_win values. ack_win is updated based on configuration responses, and is used to determine when acks are sent. Signed-off-by: Mat Martineau <mathewm@codeaurora.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-07-15 12:18:29 -03:00
David S. Miller	921a678cb6	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John Linville says: ==================== Several drivers see updates: mwifiex, ath9k, iwlwifi, brcmsmac, wlcore/wl12xx/wl18xx, and a handful of others. The bcma bus got a lot of attention from Hauke Mehrtens. The cfg80211 component gets a flurry of patches for multi-channel support, and the mac80211 component gets the first few VHT (11ac) and 60GHz (11ad) patches. This also includes the removal of the iwmc3200 drivers, since the hardware never became available to normal people. Additionally, the NFC subsystem gets a series of updates. According to Samuel, "Here are the interesting bits: - A better error management for the HCI stack. - An LLCP "late" binding implementation for a better NFC SAP usage. SAPs are now reserved only when there's a client for it. - Support for Sony RC-S360 (a.k.a. PaSoRi) pn533 based dongle. We can read and write NFC tags and also establish a p2p link with this dongle now. - A few LLCP fixes." Finally, this includes another pull of the fixes from the wireless tree in order to resolve some merge issues. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-13 23:02:28 -07:00
David S. Miller	85b91b0339	ipv4: Don't store a rule pointer in fib_result. We only use it to fetch the rule's tclassid, so just store the tclassid there instead. This also decreases the size of fib_result by a full 8 bytes on 64-bit. On 32-bits it's a wash. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-13 08:21:29 -07:00
Johannes Berg	4290cb4bf2	cfg80211: reduce monitor interface tracking Revert commit `b78e8ceac2` ("cfg80211: track monitor channel") and remove the set_monitor_enabled() callback. Due to the tracking happening in NETDEV_PRE_UP, it had introduced bugs because the monitor interface callback would be called before the device was started. It looks like there's no way to fix this, and using NETDEV_PRE_UP is broken anyway (since there's no NETDEV_UP_FAIL), so remove all that code, track interfaces in NETDEV_UP and also stop tracking the monitor channel in cfg80211. This mostly reverts to before the tracking, except that we keep the interface count tracking so that setting the monitor channel can be rejected properly. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-13 16:16:11 +02:00
Johannes Berg	5b7ccaf3fc	cfg80211/mac80211: re-add get_channel operation This essentially reverts commit `2e165b8184` but introduces the get_channel operation with a new wireless_dev argument so that you can retrieve the channel per interface. This is necessary as even though we can track all interface channels (except monitor) we can't track the channel type used. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-13 16:16:11 +02:00
John W. Linville	d07d152892	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Conflicts: drivers/net/wireless/iwmc3200wifi/cfg80211.c drivers/net/wireless/mwifiex/cfg80211.c	2012-07-12 15:21:05 -04:00
John W. Linville	38a0084063	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2012-07-12 13:44:50 -04:00
David S. Miller	391e5c22f5	ipv4: Remove tb_peers from fib_table. No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-12 09:39:28 -07:00
Johannes Berg	8c358bcd09	mac80211: add time synchronisation with BSS for assoc Some drivers (iwlegacy, iwlwifi and rt2x00) today use the bss_conf.last_tsf value. By itself though that value is completely worthless since it may be ancient. What really is needed is synchronisation between some device time and the TSF. To clarify this, rename bss_conf.last_tsf to sync_tsf and add sync_device_ts which is obtained from rx_status which gets a new field device_timestamp for this purpose. This is intentionally not using the mactime field since that is used for other things and in IBSS is expected to sync with the IBSS's TSF which isn't necessarily true for the device timestamp. Also, since we have the information and it's useful even before the connection has been established, give all the timing details to the driver before authenticating. Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-12 12:10:46 +02:00
Johannes Berg	30f422925c	mac80211: optimize ieee80211_rx_status struct layout We waste a lot of space in this struct because it uses int values where smaller ones would be sufficient. The upcoming A-MPDU information needs some space, optimize the struct now. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-12 12:10:45 +02:00
Johannes Berg	fd0142844e	nl80211: move scan API to wdev The new P2P Device will have to be able to scan for P2P search, so move scanning to use struct wireless_dev instead of struct net_device. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-12 12:10:41 +02:00
Johannes Berg	84efbb84cf	cfg80211: use wireless_dev for interface management In order to be able to create P2P Device wdevs, move the virtual interface management over to wireless_dev structures. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-12 12:08:10 +02:00
David S. Miller	b94f1c0904	ipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect(). And delete rt6_redirect(), since it is no longer used. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-12 00:33:37 -07:00
David S. Miller	ec18d9a269	ipv6: Add redirect support to all protocol icmp error handlers. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-12 00:25:15 -07:00
David S. Miller	3a5ad2ee5e	ipv6: Add ip6_redirect() and ip6_sk_redirect() helper functions. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-12 00:08:07 -07:00
David S. Miller	e8599ff4b1	ipv6: Move bulk of redirect handling into rt6_redirect(). This sets things up so that we can have the protocol error handlers call down into the ipv6 route code for redirects just as ipv4 already does. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-11 23:43:53 -07:00
David S. Miller	30f2a5f379	ipv6: Export ndisc option parsing from ndisc.c This is going to be used internally by the rt6 redirect code. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-11 23:39:11 -07:00
David S. Miller	1f42539d25	ipv4: Kill ip_rt_redirect(). No longer needed, as the protocol handlers now all properly propagate the redirect back into the routing code. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-11 21:30:08 -07:00
David S. Miller	b42597e2f3	ipv4: Add ipv4_redirect() and ipv4_sk_redirect() helper functions. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-11 21:25:45 -07:00
David S. Miller	e47a185b31	ipv4: Generalize ip_do_redirect() and hook into new dst_ops->redirect. All of the redirect acceptance policy is now contained within. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-11 20:55:47 -07:00
David S. Miller	94206125c4	ipv4: Rearrange arguments to ip_rt_redirect() Pass in the SKB rather than just the IP addresses, so that policy and other aspects can reside in ip_rt_redirect() rather then icmp_redirect(). Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-11 20:38:08 -07:00
Eric Dumazet	46d3ceabd8	tcp: TCP Small Queues This introduce TSQ (TCP Small Queues) TSQ goal is to reduce number of TCP packets in xmit queues (qdisc & device queues), to reduce RTT and cwnd bias, part of the bufferbloat problem. sk->sk_wmem_alloc not allowed to grow above a given limit, allowing no more than ~128KB [1] per tcp socket in qdisc/dev layers at a given time. TSO packets are sized/capped to half the limit, so that we have two TSO packets in flight, allowing better bandwidth use. As a side effect, setting the limit to 40000 automatically reduces the standard gso max limit (65536) to 40000/2 : It can help to reduce latencies of high prio packets, having smaller TSO packets. This means we divert sock_wfree() to a tcp_wfree() handler, to queue/send following frames when skb_orphan() [2] is called for the already queued skbs. Results on my dev machines (tg3/ixgbe nics) are really impressive, using standard pfifo_fast, and with or without TSO/GSO. Without reduction of nominal bandwidth, we have reduction of buffering per bulk sender : < 1ms on Gbit (instead of 50ms with TSO) < 8ms on 100Mbit (instead of 132 ms) I no longer have 4 MBytes backlogged in qdisc by a single netperf session, and both side socket autotuning no longer use 4 Mbytes. As skb destructor cannot restart xmit itself ( as qdisc lock might be taken at this point ), we delegate the work to a tasklet. We use one tasklest per cpu for performance reasons. If tasklet finds a socket owned by the user, it sets TSQ_OWNED flag. This flag is tested in a new protocol method called from release_sock(), to eventually send new segments. [1] New /proc/sys/net/ipv4/tcp_limit_output_bytes tunable [2] skb_orphan() is usually called at TX completion time, but some drivers call it in their start_xmit() handler. These drivers should at least use BQL, or else a single TCP session can still fill the whole NIC TX ring, since TSQ will have no effect. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Dave Taht <dave.taht@bufferbloat.net> Cc: Tom Herbert <therbert@google.com> Cc: Matt Mathis <mattmathis@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-11 18:12:59 -07:00
Andrei Emeltchenko	4b10b274e2	Bluetooth: debug: Print l2cap_chan refcount Improve debug output. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-07-11 10:09:20 -03:00
David S. Miller	04c9f416e3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/batman-adv/bridge_loop_avoidance.c net/batman-adv/bridge_loop_avoidance.h net/batman-adv/soft-interface.c net/mac80211/mlme.c With merge help from Antonio Quartulli (batman-adv) and Stephen Rothwell (drivers/net/usb/qmi_wwan.c). The net/mac80211/mlme.c conflict seemed easy enough, accounting for a conversion to some new tracing macros. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:56:33 -07:00
Eric Dumazet	1a203cb33a	ipv6: optimize ipv6 addresses compares On 64 bit arches having efficient unaligned accesses (eg x86_64) we can use long words to reduce number of instructions for free. Joe Perches suggested to change ipv6_masked_addr_cmp() to return a bool instead of 'int', to make sure ipv6_masked_addr_cmp() cannot be used in a sorting function. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 23:13:46 -07:00
David S. Miller	f185071ddf	ipv4: Remove inetpeer from routes. No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:18 -07:00
David S. Miller	5943634fc5	ipv4: Maintain redirect and PMTU info in struct rtable again. Maintaining this in the inetpeer entries was not the right way to do this at all. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:14 -07:00
David S. Miller	3e12939a2a	inet: Kill FLOWI_FLAG_PRECOW_METRICS. No longer needed. TCP writes metrics, but now in it's own special cache that does not dirty the route metrics. Therefore there is no longer any reason to pre-cow metrics in this way. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:12 -07:00
David S. Miller	16d1839907	inet: Remove ->get_peer() method. No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:10 -07:00
David S. Miller	81166dd6fa	tcp: Move timestamps from inetpeer to metrics cache. With help from Lin Ming. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:08 -07:00
David S. Miller	94334d5ed4	net: Kill set_dst_metric_rtt(). No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:40:07 -07:00
David S. Miller	51c5d0c4b1	tcp: Maintain dynamic metrics in local cache. Maintain a local hash table of TCP dynamic metrics blobs. Computed TCP metrics are no longer maintained in the route metrics. The table uses RCU and an extremely simple hash so that it has low latency and low overhead. A simple hash is legitimate because we only make metrics blobs for fully established connections. Some tweaking of the default hash table sizes, metric timeouts, and the hash chain length limit certainly could use some tweaking. But the basic design seems sound. With help from Eric Dumazet and Joe Perches. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 22:39:57 -07:00
David S. Miller	ab92bb2f67	tcp: Abstract back handling peer aliveness test into helper function. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 20:33:49 -07:00
David S. Miller	4aabd8ef8c	tcp: Move dynamnic metrics handling into seperate file. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 20:31:36 -07:00
David S. Miller	e044a651b9	ipv4: Fix crashes in fib_rules_tclass(). All paths assume, when CONFIG_IP_MULTIPLE_TABLES is enabled, that any successful call to fib_lookup() will initialize the fib_result->r value to something. We violated that expectation in the new fib_lookup() fast path. Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Tested-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-10 18:05:28 -07:00
Eric Lapuyade	a10d595b10	NFC: Allow HCI driver to pre-open pipes to some gates Some NFC chips will statically create and open pipes for both standard and proprietary gates. The driver can now pass this information to HCI such that HCI will not attempt to create and open them, but will instead directly use the passed pipe ids. Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-07-09 16:42:12 -04:00
Eric Lapuyade	456411ca81	NFC: Driver failure API This API should be used by drivers, HCI, SHDLC or NCI stacks to report an unrecoverable error. Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-07-09 16:42:08 -04:00
Eric Lapuyade	a9a741a7e2	NFC: Prepare asynchronous error management for driver and shdlc Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-07-09 16:42:04 -04:00
Johannes Berg	71bbc99438	cfg80211: use wdev in mgmt-tx/ROC APIs The management frame and remain-on-channel APIs will be needed in the P2P device abstraction, so move them over to the new wdev-based APIs. Userspace can still use both the interface index and wdev identifier for them so it's backward compatible, but for the P2P Device wdev it will be able to use the wdev identifier only. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-09 14:51:47 +02:00
Johannes Berg	89a54e48b9	nl80211: prepare for non-netdev wireless devs In order to support a P2P device abstraction and Bluetooth high-speed AMPs, we need to have a way to identify virtual interfaces that don't have a netdev associated. Do this by adding a NL80211_ATTR_WDEV attribute to identify a wdev which may or may not also be a netdev. To simplify things, use a 64-bit value with the high 32 bits being the wiphy index for this new wdev identifier in the nl80211 API. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-09 14:51:46 +02:00
Johannes Berg	f72b85b8eb	mac80211: remove ieee80211_key_removed This API call was intended to be used by drivers if they want to optimize key handling by removing one key when another is added. Remove it since no driver is using it. If needed, it can always be added back. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-09 14:49:15 +02:00
Pablo Neira Ayuso	6bd0405bb4	netfilter: nf_ct_ecache: fix crash with multiple containers, one shutting down Hans reports that he's still hitting: BUG: unable to handle kernel NULL pointer dereference at 000000000000027c IP: [<ffffffff813615db>] netlink_has_listeners+0xb/0x60 PGD 0 Oops: 0000 [#3] PREEMPT SMP CPU 0 It happens when adding a number of containers with do: nfct_query(h, NFCT_Q_CREATE, ct); and most likely one namespace shuts down. this problem was supposed to be fixed by: `70e9942` netfilter: nf_conntrack: make event callback registration per-netns Still, it was missing one rcu_access_pointer to check if the callback is set or not. Reported-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-07-09 10:53:19 +02:00
David S. Miller	d3a5ea6e21	Merge branch 'master' of git://1984.lsi.us.es/nf-next	2012-07-07 16:18:50 -07:00
David S. Miller	f4530fa574	ipv4: Avoid overhead when no custom FIB rules are installed. If the user hasn't actually installed any custom rules, or fiddled with the default ones, don't go through the whole FIB rules layer. It's just pure overhead. Instead do what we do with CONFIG_IP_MULTIPLE_TABLES disabled, check the individual tables by hand, one by one. Also, move fib_num_tclassid_users into the ipv4 network namespace. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 22:13:13 -07:00
Vladimir Kondratiev	95ddc1fc45	cfg80211: bitrate calculation for 60g 60g band uses different from .11n MCS scheme, so bitrate should be calculated differently Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-05 15:18:32 +02:00
Vladimir Kondratiev	8eb41c8dfb	{nl,cfg}80211: support high bitrates Until now, a u16 value was used to represent bitrate value. With VHT bitrates this becomes too small. Introduce a new 32-bit bitrate attribute. nl80211 will report both the new and the old attribute, unless the bitrate doesn't fit into the old u16 attribute in which case only the new one will be reported. User space tools encouraged to prefer the 32-bit attribute, if available (since it won't be available on older kernels.) Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com> [reword commit message and comments a bit] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-05 15:18:30 +02:00
David S. Miller	c90a9bb907	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2012-07-05 03:44:25 -07:00
David S. Miller	36bdbcae2f	net: Kill dst->_neighbour, accessors, and final uses. No longer used. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 02:42:00 -07:00
David S. Miller	97cac0821a	ipv6: Store route neighbour in rt6_info struct. This makes for a simplified conversion away from dst_get_neighbour(). All code outside of ipv6 will use neigh lookups via dst_neigh_lookup(). Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 02:41:58 -07:00
David S. Miller	1d248b1cf4	net: Pass neighbours and dest address into NETEVENT_REDIRECT events. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 02:21:55 -07:00
David S. Miller	fccd7d5c77	decnet: Use neighbours privately in dn_route struct. This allows an easy conversion away from dst_get_neighbour*(). Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 01:12:14 -07:00
David S. Miller	f894cbf847	net: Add optional SKB arg to dst_ops->neigh_lookup(). Causes the handler to use the daddr in the ipv4/ipv6 header when the route gateway is unspecified (local subnet). Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 01:04:01 -07:00
David S. Miller	5110effee8	net: Do delayed neigh confirmation. When a dst_confirm() happens, mark the confirmation as pending in the dst. Then on the next packet out, when we have the neigh in-hand, do the update. This removes the dependency in dst_confirm() of dst's having an attached neigh. While we're here, remove the explicit 'dst' NULL check, all except 2 or 3 call sites ensure it's not NULL. So just fix those cases up. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 01:03:06 -07:00
David S. Miller	a263b30936	ipv4: Make neigh lookups directly in output packet path. Do not use the dst cached neigh, we'll be getting rid of that. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 01:02:12 -07:00
Pablo Neira Ayuso	08911475d1	netfilter: nf_conntrack: generalize nf_ct_l4proto_net This patch generalizes nf_ct_l4proto_net by splitting it into chunks and moving the corresponding protocol part to where it really belongs to. To clarify, note that we follow two different approaches to support per-net depending if it's built-in or run-time loadable protocol tracker. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Gao feng <gaofeng@cn.fujitsu.com>	2012-07-04 19:37:22 +02:00
Johannes Berg	a1845fc7c5	mac80211: add TX prepare API Some drivers require setup before being able to send management frames in managed mode, in particular in multi-channel cases. Introduce API to allow the drivers to do such setup while being able to sleep waiting for the setup to finish in the device. This isn't possible inside the TX call since that can't sleep. A future patch may also restructure the TX retry to wait for the driver to report the frame status, as suggested by Arik in http://mid.gmane.org/CA+XVXffKSEL6ZQPQ98x-zO-NL2=TNF1uN==mprRyUmAaRn254g@mail.gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-03 13:50:34 +02:00
Thomas Huehn	e3e1a0bcb3	mac80211: reduce IEEE80211_TX_MAX_RATES IEEE80211_TX_MAX_RATES can be reduced from 5 to 4 as there is no current hardware supporting a rate chain with 5 multi rate stages (mrr), so 4 mrr stages are sufficient. The memory that is freed within the ieee80211_tx_info struct will be used in the upcoming Transmission Power Control (TPC) implementation. Suggested-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de> [reword commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-03 13:48:37 +02:00
Johannes Berg	cb831b537d	mac80211: remove tx_frags driver callback The implementation of tx_frags is buggy due to not handling queue stop, and there's no driver implementing it so remove it. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-02 15:40:18 +02:00
Vladimir Kondratiev	3a0c52a6d8	cfg80211: add 802.11ad (60gHz band) support Add enumerations for both cfg80211 and nl80211. This expands wiphy.bands etc. arrays. Extend channel <-> frequency translation to cover 60g band and modify the rate check logic since there are no legacy mandatory rates (only MCS is used.) Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-07-02 15:11:10 +02:00
Neil Horman	4244854d22	sctp: be more restrictive in transport selection on bundled sacks It was noticed recently that when we send data on a transport, its possible that we might bundle a sack that arrived on a different transport. While this isn't a major problem, it does go against the SHOULD requirement in section 6.4 of RFC 2960: An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK, etc.) to the same destination transport address from which it received the DATA or control chunk to which it is replying. This rule should also be followed if the endpoint is bundling DATA chunks together with the reply chunk. This patch seeks to correct that. It restricts the bundling of sack operations to only those transports which have moved the ctsn of the association forward since the last sack. By doing this we guarantee that we only bundle outbound saks on a transport that has received a chunk since the last sack. This brings us into stricter compliance with the RFC. Vlad had initially suggested that we strictly allow only sack bundling on the transport that last moved the ctsn forward. While this makes sense, I was concerned that doing so prevented us from bundling in the case where we had received chunks that moved the ctsn on multiple transports. In those cases, the RFC allows us to select any of the transports having received chunks to bundle the sack on. so I've modified the approach to allow for that, by adding a state variable to each transport that tracks weather it has moved the ctsn since the last sack. This I think keeps our behavior (and performance), close enough to our current profile that I think we can do this without a sysctl knob to enable/disable it. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Vlad Yaseivch <vyasevich@gmail.com> CC: David S. Miller <davem@davemloft.net> CC: linux-sctp@vger.kernel.org Reported-by: Michele Baldessari <michele@redhat.com> Reported-by: sorin serban <sserban@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-30 22:44:35 -07:00
Andrei Emeltchenko	38b3fef173	Bluetooth: Improve debugging messages for hci_conn Improve debugging of hci_conn objects by: adding print to hci_conn refcounting, adding object spcifier when missing, change conn to hcon since conn is heavily used for l2cap_conn objects and this is misleading. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-30 11:41:24 -03:00
John W. Linville	8732baafc3	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem Conflicts: drivers/net/wireless/brcm80211/brcmfmac/dhd_sdio.c	2012-06-29 12:42:14 -04:00
Michal Kazior	2e165b8184	cfg80211/mac80211: remove .get_channel We do not need it anymore since cfg80211 tracks monitor channel and monitor channel type. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-06-29 13:39:18 +02:00
Michal Kazior	dbbae26afa	cfg80211: track monitor interfaces count Implements .set_monitor_enabled(wiphy, enabled). Notifies driver upon change of interface layout. If only monitor interfaces become present it is called with 2nd argument being true. If non-monitor interface appears then 2nd argument is false. Driver is notified only upon change. This makes it more obvious about the fact that cfg80211 supports single monitor channel. Once we implement multi-channel we don't want to allow setting monitor channel while other interface types are running. Otherwise it would be ambiguous once we start considering num_different_channels. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-06-29 13:39:16 +02:00
Michal Kazior	c30a3d3868	cfg80211: track ibss fixed channel IBSS may hop between channels. It is necessary to account this special case when considering interface combinations. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-06-29 13:39:15 +02:00
Michal Kazior	f4489ebeff	cfg80211: add channel tracking for AP and mesh We need to know which channel is used by a running AP and mesh for channel context accounting and finding matching/active interface combination. STA/IBSS have current_bss already which allows us to check which channel a vif is tuned to. Non-fixed channel IBSS can be handled with additional changes. Monitor mode is going to be handled differently. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-06-29 13:39:15 +02:00
David S. Miller	7a9bc9b81a	ipv4: Elide fib_validate_source() completely when possible. If rpfilter is off (or the SKB has an IPSEC path) and there are not tclassid users, we don't have to do anything at all when fib_validate_source() is invoked besides setting the itag to zero. We monitor tclassid uses with a counter (modified only under RTNL and marked __read_mostly) and we protect the fib_validate_source() real work with a test against this counter and whether rpfilter is to be done. Having a way to know whether we need no tclassid processing or not also opens the door for future optimized rpfilter algorithms that do not perform full FIB lookups. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-29 01:36:36 -07:00
Ville Nuorvala	d0087b29f7	ipv6_tunnel: Allow receiving packets on the fallback tunnel if they pass sanity checks At Facebook, we do Layer-3 DSR via IP-in-IP tunneling. Our load balancers wrap an extra IP header on incoming packets so they can be routed to the backend. In the v4 tunnel driver, when these packets fall on the default tunl0 device, the behavior is to decapsulate them and drop them back on the stack. So our setup is that tunl0 has the VIP and eth0 has (obviously) the backend's real address. In IPv6 we do the same thing, but the v6 tunnel driver didn't have this same behavior - if you didn't have an explicit tunnel setup, it would drop the packet. This patch brings that v4 feature to the v6 driver. The same IPv6 address checks are performed as with any normal tunnel, but as the fallback tunnel endpoint addresses are unspecified, the checks must be performed on a per-packet basis, rather than at tunnel configuration time. [Patch description modified by phil@ipom.com] Signed-off-by: Ville Nuorvala <ville.nuorvala@gmail.com> Tested-by: Phil Dibowitz <phil@ipom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-29 00:52:32 -07:00
David S. Miller	9e56e3800e	ipv4: Adjust in_dev handling in fib_validate_source() Checking for in_dev being NULL is pointless. In fact, all of our callers have in_dev precomputed already, so just pass it in and remove the NULL checking. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-28 18:54:02 -07:00
Thomas Graf	58050fce35	net: Use NLMSG_DEFAULT_SIZE in combination with nlmsg_new() Using NLMSG_GOODSIZE results in multiple pages being used as nlmsg_new() will automatically add the size of the netlink header to the payload thus exceeding the page limit. NLMSG_DEFAULT_SIZE takes this into account. Signed-off-by: Thomas Graf <tgraf@suug.ch> Cc: Jiri Pirko <jpirko@redhat.com> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: Sergey Lapin <slapin@ossfans.org> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Lauro Ramos Venancio <lauro.venancio@openbossa.org> Cc: Aloisio Almeida Jr <aloisio.almeida@openbossa.org> Cc: Samuel Ortiz <sameo@linux.intel.com> Reviewed-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-28 17:56:43 -07:00
Neal Cardwell	3840a06e60	tcp: pass fl6 to inet6_csk_route_req() This commit changes inet_csk_route_req() so that it uses a pointer to a struct flowi6, rather than allocating its own on the stack. This brings its behavior in line with its IPv4 cousin, inet_csk_route_req(), and allows a follow-on patch to fix a dst leak. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-28 17:53:50 -07:00
Johannes Berg	b1fbd46976	Merge remote-tracking branch 'wireless-next/master' into mac80211-next	2012-06-28 13:45:58 +02:00
Mahesh Palivela	bf0c111ec8	cfg80211: allow advertising VHT capabilities Allow drivers to advertise their VHT capabilities and export them to userspace via nl80211. Signed-off-by: Mahesh Palivela <maheshp@posedge.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-06-28 13:08:34 +02:00
David S. Miller	41347dcdd8	ipv4: Kill rt->rt_spec_dst, no longer used. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-28 04:05:27 -07:00
David S. Miller	35ebf65e85	ipv4: Create and use fib_compute_spec_dst() helper. The specific destination is the host we direct unicast replies to. Usually this is the original packet source address, but if we are responding to a multicast or broadcast packet we have to use something different. Specifically we must use the source address we would use if we were to send a packet to the unicast source of the original packet. The routing cache precomputes this value, but we want to remove that precomputation because it creates a hard dependency on the expensive rpfilter source address validation which we'd like to make cheaper. There are only three places where this matters: 1) ICMP replies. 2) pktinfo CMSG 3) IP options Now there will be no real users of rt->rt_spec_dst and we can simply remove it altogether. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-28 03:59:11 -07:00
David S. Miller	70e7341673	ipv4: Show that ip_send_reply() is purely unicast routine. Rename it to ip_send_unicast_reply() and add explicit 'saddr' argument. This removed one of the few users of rt->rt_spec_dst. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-28 03:21:41 -07:00
Johannes Berg	fc8a7321d3	mac80211: don't expose ieee80211_add_srates_ie() This and ieee80211_add_ext_srates_ie() aren't exported, so can't be used by drivers anyway, but there's also no reason that they should be so make them private to mac80211 and use sdata instead of vif arguments. Acked-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-06-28 10:35:50 +02:00
David S. Miller	160eb5a6b1	ipv4: Kill early demux method return value. It's completely unnecessary. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-27 22:01:22 -07:00
David S. Miller	1d1e34ddd4	xfrm_user: Propagate netlink error codes properly. Instead of using a fixed value of "-1" or "-EMSGSIZE", propagate what the nla_*() interfaces actually return. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-27 21:57:03 -07:00
David S. Miller	c10237e077	Revert "ipv4: tcp: dont cache unconfirmed intput dst" This reverts commit `c074da2810`. This change has several unwanted side effects: 1) Sockets will cache the DST_NOCACHE route in sk->sk_rx_dst and we'll thus never create a real cached route. 2) All TCP traffic will use DST_NOCACHE and never use the routing cache at all. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-27 17:05:06 -07:00
Eric Dumazet	c074da2810	ipv4: tcp: dont cache unconfirmed intput dst DDOS synflood attacks hit badly IP route cache. On typical machines, this cache is allowed to hold up to 8 Millions dst entries, 256 bytes for each, for a total of 2GB of memory. rt_garbage_collect() triggers and tries to cleanup things. Eventually route cache is disabled but machine is under fire and might OOM and crash. This patch exploits the new TCP early demux, to set a nocache boolean in case incoming TCP frame is for a not yet ESTABLISHED or TIMEWAIT socket. This 'nocache' boolean is then used in case dst entry is not found in route cache, to create an unhashed dst entry (DST_NOCACHE) SYN-cookie-ACK sent use a similar mechanism (ipv4: tcp: dont cache output dst for syncookies), so after this patch, a machine is able to absorb a DDOS synflood attack without polluting its IP route cache. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-27 15:34:24 -07:00
Gao feng	f28997e27a	netfilter: nf_conntrack: add nf_ct_kfree_compat_sysctl_table This patch is a cleanup. It adds nf_ct_kfree_compat_sysctl_table to release l4proto's compat sysctl table and set the compat sysctl table point to NULL. This new function will be used by follow-up patches. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-27 18:36:25 +02:00
Gao feng	f1caad2745	netfilter: nf_conntrack: prepare l4proto->init_net cleanup l4proto->init contain quite redundant code. We can simplify this by adding a new parameter l3proto. This patch prepares that code simplification. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-27 18:31:14 +02:00
Johannes Berg	dfb89c56ad	cfg80211: don't allow WoWLAN support without CONFIG_PM When CONFIG_PM is disabled, no device can possibly support WoWLAN since it can't go to sleep to start with. Due to this, mac80211 had even rejected the hardware registration. By making all the code and data for WoWLAN depend on CONFIG_PM we can promote this runtime error to a compile-time error. Add #ifdef around all WoWLAN code to remove it in systems that don't need it as they never suspend. Cc: Kalle Valo <kvalo@qca.qualcomm.com> Acked-by: Luciano Coelho <coelho@ti.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-06-27 17:55:11 +02:00
alex.bluesman.smirnov@gmail.com	32bad7e30f	mac802154: add wpan device-class support Every real 802.15.4 transceiver, which works with software MAC layer, can be classified as a wpan device in this stack. So the wpan device implementation provides missing link in datapath between the device drivers and the Linux network queue. According to the IEEE 802.15.4 standard each packet can be one of the following types: - beacon - MAC layer command - ACK - data This patch adds support for the data packet-type only, but this is enough to perform data transmission and receiving over radio. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-26 21:06:11 -07:00
Greg Kroah-Hartman	fc915c8b93	Merge 3.5-rc4 into tty-next This is to pick up the serial port and tty changes in Linus's tree to allow everyone to sync up. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-06-26 16:04:29 -07:00
John W. Linville	2c443443e7	Merge branch 'for-john' of git://git.sipsolutions.net/mac80211-next	2012-06-26 14:27:34 -04:00
Thomas Pedersen	88e920b450	nl80211: specify RSSI threshold in scheduled scan Support configuring an RSSI threshold in dBm (s32) when requesting scheduled scan, below which a BSS won't be reported by the cfg80211 driver. Signed-off-by: Thomas Pedersen <c_tpeder@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-06-26 09:32:28 +02:00
Sjur Brændeland	91fa0cbc0c	caif-hsi: Remove use of module parameters Remove use of module parameters on caif hsi device, as rtnl configuration parameters are already supported. All caif hsi configuration data is put in cfhsi_config, and default values in hsi_default_config. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-25 16:44:12 -07:00
Sjur Brændeland	1c385f1fdf	caif-hsi: Replace platform device with ops structure. Remove use of struct platform_device, and replace it with struct cfhsi_ops. Updated variable names in the same spirit: cfhsi_get_dev to cfhsi_get_ops, cfhsi->dev to cfhsi->ops and, cfhsi->dev.drv to cfhsi->ops->cb_ops. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-25 16:44:12 -07:00
Sjur Brændeland	c412540063	caif-hsi: Add rtnl support Add RTNL support for managing the caif hsi interface. The HSI HW interface is no longer registering as a device, instead we use symbol_get to get hold of the HSI API. Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-25 16:44:12 -07:00
Eric Dumazet	deaa58542b	net: struct sock cleanups Add missing kernel doc for sk_rx_dst Move sk_rx_dst to avoid two 32bit holes on 64bit arches Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-25 16:09:18 -07:00
Vijay Subramanian	efc27f8cee	net: Remove 'unlikely' qualifier in skb_steal_sock() With early demux enabled by default for TCP flows, there is high chance that skb->sk will be non-null. 'unlikely()' was removed from __inet_lookup_skb() but maybe it can be removed from skb_steal_sock() as well. Note: skb_steal_sock() is also called by __inet6_lookup_skb() and __udp4_lib_lookup_skb() but they are protected by their own 'unlikely' calls. Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-25 16:08:36 -07:00
David S. Miller	e486463e82	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/usb/qmi_wwan.c net/batman-adv/translation-table.c net/ipv6/route.c qmi_wwan.c resolution provided by Bjørn Mork. batman-adv conflict is dealing merely with the changes of global function names to have a proper subsystem prefix. ipv6's route.c conflict is merely two side-by-side additions of network namespace methods. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-25 15:50:32 -07:00
Johannes Berg	bdcbd8e0e3	mac80211: clean up debugging There are a few things that make the logging and debugging in mac80211 less useful than it should be right now: * a lot of messages should be pr_info, not pr_debug * wholesale use of pr_debug makes it require both Kconfig and dynamic configuration * there are still a lot of ifdefs * the style is very inconsistent, sometimes the sdata->name is printed in front Clean up everything, introducing new macros and separating out the station MLME debugging into a new Kconfig symbol. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-06-24 11:32:29 +02:00
Eric Dumazet	7586eceb0a	ipv4: tcp: dont cache output dst for syncookies Don't cache output dst for syncookies, as this adds pressure on IP route cache and rcu subsystem for no gain. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-22 21:47:33 -07:00
Alexander Duyck	6648bd7e0e	ipv4: Add sysctl knob to control early socket demux This change is meant to add a control for disabling early socket demux. The main motivation behind this patch is to provide an option to disable the feature as it adds an additional cost to routing that reduces overall throughput by up to 5%. For example one of my systems went from 12.1Mpps to 11.6 after the early socket demux was added. It looks like the reason for the regression is that we are now having to perform two lookups, first the one for an established socket, and then the one for the routing table. By adding this patch and toggling the value for ip_early_demux to 0 I am able to get back to the 12.1Mpps I was previously seeing. [ Move local variables in ip_rcv_finish() down into the basic block in which they are actually used. -DaveM ] Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-22 17:11:13 -07:00
John W. Linville	133189a46c	Merge branch 'for-john' of git://git.sipsolutions.net/mac80211-next	2012-06-22 14:39:53 -04:00
Victor Goldenshtein	66572cfc30	mac80211: add command to get current rssi Get current rssi (in dBm) from the driver/FW. Instead of reporting the signal received in the last rx packet, which might be inaccurate if rx traffic is low and beacon filtering is enabled, get the signal from the driver/FW. Signed-off-by: Victor Goldenshtein <victorg@ti.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-06-21 16:42:17 +02:00
David S. Miller	41063e9dd1	ipv4: Early TCP socket demux. Input packet processing for local sockets involves two major demuxes. One for the route and one for the socket. But we can optimize this down to one demux for certain kinds of local sockets. Currently we only do this for established TCP sockets, but it could at least in theory be expanded to other kinds of connections. If a TCP socket is established then it's identity is fully specified. This means that whatever input route was used during the three-way handshake must work equally well for the rest of the connection since the keys will not change. Once we move to established state, we cache the receive packet's input route to use later. Like the existing cached route in sk->sk_dst_cache used for output packets, we have to check for route invalidations using dst->obsolete and dst->ops->check(). Early demux occurs outside of a socket locked section, so when a route invalidation occurs we defer the fixup of sk->sk_rx_dst until we are actually inside of established state packet processing and thus have the socket locked. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-19 21:22:05 -07:00
David S. Miller	f9242b6b28	inet: Sanitize inet{,6} protocol demux. Don't pretend that inet_protos[] and inet6_protos[] are hashes, thay are just a straight arrays. Remove all unnecessary hash masking. Document MAX_INET_PROTOS. Use RAW_HTABLE_SIZE when appropriate. Reported-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-19 18:56:21 -07:00
David S. Miller	a77f4b4acf	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John Linville says: ==================== This is a sizeable batch of updates intended for 3.6... The bulk of the changes here are Bluetooth. Gustavo says: Here goes the first Bluetooth pull request for 3.6, we have queued quite a lot of work. Andrei Emeltchenko added the AMP Manager code, a lot of work is needed, but the first bit are already there. This code is disabled by default. Mat Martineau changed the whole L2CAP ERTM state machine code, replacing the old one with a new implementation. Besides that we had lot of coding style fixes (to follow net rules), more l2cap core separation from socket and many clean ups and fixed all over the tree. Along with the above, there is a healthy dose of ath9k, iwlwifi, and other driver updates. There is also another pull from the wireless tree to resolve some merge issues. I also fixed-up some merge discrepencies between net-next and wireless-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-19 14:37:15 -07:00
John W. Linville	b3c911eeb4	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem Conflicts: drivers/net/wireless/iwlwifi/dvm/testmode.c drivers/net/wireless/iwlwifi/pcie/trans.c	2012-06-19 14:41:22 -04:00
Pablo Neira Ayuso	674147e211	netfilter: fix missing symbols if CONFIG_NETFILTER_NETLINK_QUEUE_CT unset ERROR: "nfqnl_ct_parse" [net/netfilter/nfnetlink_queue.ko] undefined! ERROR: "nfqnl_ct_seq_adjust" [net/netfilter/nfnetlink_queue.ko] undefined! ERROR: "nfqnl_ct_put" [net/netfilter/nfnetlink_queue.ko] undefined! ERROR: "nfqnl_ct_get" [net/netfilter/nfnetlink_queue.ko] undefined! We have to use CONFIG_NETFILTER_NETLINK_QUEUE_CT in include/net/netfilter/nfnetlink_queue.h, not CONFIG_NF_CONNTRACK. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-18 21:09:17 -07:00
Andrei Emeltchenko	9345d40c58	Bluetooth: Use AUTO_OFF constant in jiffies Move AUTO_OFF_TIMEOUT to other constants changing name to HCI_AUTO_OFF_TIMEOUT and convert to jiffies. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-19 00:12:37 -03:00
Pablo Neira Ayuso	7c62234547	netfilter: nfnetlink_queue: fix compilation with NF_CONNTRACK disabled In "9cb0176 netfilter: add glue code to integrate nfnetlink_queue and ctnetlink" the compilation with NF_CONNTRACK disabled is broken. This patch fixes this issue. I have moved the conntrack part into nfnetlink_queue_ct.c to avoid peppering the entire nfnetlink_queue.c code with ifdefs. I also needed to rename nfnetlink_queue.c to nfnetlink_queue_pkt.c to update the net/netfilter/Makefile to support conditional compilation of the conntrack integration. This patch also adds CONFIG_NETFILTER_QUEUE_CT in case you want to explicitly disable the integration between nf_conntrack and nfnetlink_queue. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-19 04:44:57 +02:00
John W. Linville	8cfe523a12	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2012-06-18 15:13:27 -04:00
Chun-Yeow Yeoh	728b19e5fb	{nl,cfg,mac}80211: implement dot11MeshHWMPconfirmationInterval As defined in section 13.10.9.3 Case D (802.11-2012), this control variable is used to limit the mesh STA to send only one PREQ to a root mesh STA within this interval of time (in TUs). The default value for this variable is set to 2000 TUs. However, for current implementation, the maximum configurable of dot11MeshHWMPconfirmationInterval is restricted by dot11MeshHWMPactivePathTimeout. Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com> [line-break commit log] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-06-18 13:55:15 +02:00
Rémi Denis-Courmont	31fdc5553b	net: remove my future former mail address Signed-off-by: Rémi Denis-Courmont <remi@remlab.net> Cc: Sakari Ailus <sakari.ailus@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-17 16:29:38 -07:00
David S. Miller	82f437b950	Merge branch 'master' of git://1984.lsi.us.es/nf-next Pablo says: ==================== This is the second batch of Netfilter updates for net-next. It contains the kernel changes for the new user-space connection tracking helper infrastructure. More details on this infrastructure are provides here: http://lwn.net/Articles/500196/ Still, I plan to provide some official documentation through the conntrack-tools user manual on how to setup user-space utilities for this. So far, it provides two helper in user-space, one for NFSv3 and another for Oracle/SQLnet/TNS. Yet in my TODO list. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-16 15:23:35 -07:00
Eldad Zack	7f95e1880e	include/net/dst.h: neaten asterisk placement Fix code style - place the asterisk where it belongs. Signed-off-by: Eldad Zack <eldad@fogrefinery.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-16 15:20:35 -07:00
Pablo Neira Ayuso	12f7a50533	netfilter: add user-space connection tracking helper infrastructure There are good reasons to supports helpers in user-space instead: * Rapid connection tracking helper development, as developing code in user-space is usually faster. * Reliability: A buggy helper does not crash the kernel. Moreover, we can monitor the helper process and restart it in case of problems. * Security: Avoid complex string matching and mangling in kernel-space running in privileged mode. Going further, we can even think about running user-space helpers as a non-root process. * Extensibility: It allows the development of very specific helpers (most likely non-standard proprietary protocols) that are very likely not to be accepted for mainline inclusion in the form of kernel-space connection tracking helpers. This patch adds the infrastructure to allow the implementation of user-space conntrack helpers by means of the new nfnetlink subsystem `nfnetlink_cthelper' and the existing queueing infrastructure (nfnetlink_queue). I had to add the new hook NF_IP6_PRI_CONNTRACK_HELPER to register ipv[4\|6]_helper which results from splitting ipv[4\|6]_confirm into two pieces. This change is required not to break NAT sequence adjustment and conntrack confirmation for traffic that is enqueued to our user-space conntrack helpers. Basic operation, in a few steps: 1) Register user-space helper by means of `nfct': nfct helper add ftp inet tcp [ It must be a valid existing helper supported by conntrack-tools ] 2) Add rules to enable the FTP user-space helper which is used to track traffic going to TCP port 21. For locally generated packets: iptables -I OUTPUT -t raw -p tcp --dport 21 -j CT --helper ftp For non-locally generated packets: iptables -I PREROUTING -t raw -p tcp --dport 21 -j CT --helper ftp 3) Run the test conntrackd in helper mode (see example files under doc/helper/conntrackd.conf conntrackd 4) Generate FTP traffic going, if everything is OK, then conntrackd should create expectations (you can check that with `conntrack': conntrack -E expect [NEW] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp [DESTROY] 301 proto=6 src=192.168.1.136 dst=130.89.148.12 sport=0 dport=54037 mask-src=255.255.255.255 mask-dst=255.255.255.255 sport=0 dport=65535 master-src=192.168.1.136 master-dst=130.89.148.12 sport=57127 dport=21 class=0 helper=ftp This confirms that our test helper is receiving packets including the conntrack information, and adding expectations in kernel-space. The user-space helper can also store its private tracking information in the conntrack structure in the kernel via the CTA_HELP_INFO. The kernel will consider this a binary blob whose layout is unknown. This information will be included in the information that is transfered to user-space via glue code that integrates nfnetlink_queue and ctnetlink. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-16 15:40:02 +02:00
Pablo Neira Ayuso	ae243bee39	netfilter: ctnetlink: add CTA_HELP_INFO attribute This attribute can be used to modify and to dump the internal protocol information. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-16 15:09:15 +02:00
Pablo Neira Ayuso	8c88f87cb2	netfilter: nfnetlink_queue: add NAT TCP sequence adjustment if packet mangled User-space programs that receive traffic via NFQUEUE may mangle packets. If NAT is enabled, this usually puzzles sequence tracking, leading to traffic disruptions. With this patch, nfnl_queue will make the corresponding NAT TCP sequence adjustment if: 1) The packet has been mangled, 2) the NFQA_CFG_F_CONNTRACK flag has been set, and 3) NAT is detected. There are some records on the Internet complaning about this issue: http://stackoverflow.com/questions/260757/packet-mangling-utilities-besides-iptables By now, we only support TCP since we have no helpers for DCCP or SCTP. Better to add this if we ever have some helper over those layer 4 protocols. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-16 15:09:08 +02:00
Pablo Neira Ayuso	1afc56794e	netfilter: nf_ct_helper: implement variable length helper private data This patch uses the new variable length conntrack extensions. Instead of using union nf_conntrack_help that contain all the helper private data information, we allocate variable length area to store the private helper data. This patch includes the modification of all existing helpers. It also includes a couple of include header to avoid compilation warnings. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-16 15:08:55 +02:00
Pablo Neira Ayuso	3cf4c7e381	netfilter: nf_ct_ext: support variable length extensions We can now define conntrack extensions of variable size. This patch is useful to get rid of these unions: union nf_conntrack_help union nf_conntrack_proto union nf_conntrack_nat_help Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-16 15:08:49 +02:00
Pablo Neira Ayuso	3a8fc53a45	netfilter: nf_ct_helper: allocate 16 bytes for the helper and policy names This patch modifies the struct nf_conntrack_helper to allocate the room for the helper name. The maximum length is 16 bytes (this was already introduced in 2.6.24). For the maximum length for expectation policy names, I have also selected 16 bytes. This patch is required by the follow-up patch to support user-space connection tracking helpers. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-16 15:08:39 +02:00
David S. Miller	aee289baaa	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/ipv6/route.c Pull in 'net' again to get the revert of Thomas's change which introduced regressions. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-16 01:23:04 -07:00
David S. Miller	e8803b6c38	Revert "ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route" This reverts commit `2a0c451ade`. It causes crashes, because now ip6_null_entry is used before it is initialized. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-16 01:12:19 -07:00
David S. Miller	7e52b33bd5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/ipv6/route.c This deals with a merge conflict between the net-next addition of the inetpeer network namespace ops, and Thomas Graf's bug fix in `2a0c451ade` which makes sure we don't register /proc/net/ipv6_route before it is actually safe to do so. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-15 15:51:55 -07:00
Thomas Graf	2a0c451ade	ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route /proc/net/ipv6_route reflects the contents of fib_table_hash. The proc handler is installed in ip6_route_net_init() whereas fib_table_hash is allocated in fib6_net_init() _after_ the proc handler has been installed. This opens up a short time frame to access fib_table_hash with its pants down. fib6_init() as a whole can't be moved to an earlier position as it also registers the rtnetlink message handlers which should be registered at the end. Therefore split it into fib6_init() which is run early and fib6_init_late() to register the rtnetlink message handlers. Signed-off-by: Thomas Graf <tgraf@suug.ch> Reviewed-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-15 15:30:15 -07:00
David S. Miller	81aded2467	ipv6: Handle PMTU in ICMP error handlers. One tricky issue on the ipv6 side vs. ipv4 is that the ICMP callouts to handle the error pass the 32-bit info cookie in network byte order whereas ipv4 passes it around in host byte order. Like the ipv4 side, we have two helper functions. One for when we have a socket context and one for when we do not. ip6ip6 tunnels are not handled here, because they handle PMTU events by essentially relaying another ICMP packet-too-big message back to the original sender. This patch allows us to get rid of rt6_do_pmtu_disc(). It handles all kinds of situations that simply cannot happen when we do the PMTU update directly using a fully resolved route. In fact, the "plen == 128" check in ip6_rt_update_pmtu() can very likely be removed or changed into a BUG_ON() check. We should never have a prefixed ipv6 route when we get there. Another piece of strange history here is that TCP and DCCP, unlike in ipv4, never invoke the update_pmtu() method from their ICMP error handlers. This is incredibly astonishing since this is the context where we have the most accurate context in which to make a PMTU update, namely we have a fully connected socket and associated cached socket route. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-15 14:54:11 -07:00
David S. Miller	3639339553	ipv4: Handle PMTU in all ICMP error handlers. With ip_rt_frag_needed() removed, we have to explicitly update PMTU information in every ICMP error handler. Create two helper functions to facilitate this. 1) ipv4_sk_update_pmtu() This updates the PMTU when we have a socket context to work with. 2) ipv4_update_pmtu() Raw version, used when no socket context is available. For this interface, we essentially just pass in explicit arguments for the flow identity information we would have extracted from the socket. And you'll notice that ipv4_sk_update_pmtu() is simply implemented in terms of ipv4_update_pmtu() Note that __ip_route_output_key() is used, rather than something like ip_route_output_flow() or ip_route_output_key(). This is because we absolutely do not want to end up with a route that does IPSEC encapsulation and the like. Instead, we only want the route that would get us to the node described by the outermost IP header. Reported-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-14 22:22:07 -07:00
Chun-Yeow Yeoh	ac1073a61d	{nl,cfg,mac}80211: implement dot11MeshHWMProotInterval and dot11MeshHWMPactivePathToRootTimeout Add the mesh configuration parameters dot11MeshHWMProotInterval and dot11MeshHWMPactivePathToRootTimeout to be used by proactive PREQ mechanism. Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com> [line-break commit log] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-06-14 09:08:22 +02:00
John W. Linville	211c17aaee	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless Conflicts: drivers/net/wireless/ath/ath9k/main.c net/bluetooth/hci_event.c	2012-06-13 15:35:35 -04:00
John W. Linville	ec8eb9ae58	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2012-06-13 15:12:07 -04:00
John W. Linville	1f7e010282	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2012-06-13 14:05:40 -04:00
Johannes Berg	73c3df3ba3	cfg80211/nl80211: fix kernel-doc Add missing entries to nl80211.h and fix the kernel-doc notation in cfg80211.h. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-06-13 11:17:14 +02:00
David S. Miller	43b03f1f6d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: MAINTAINERS drivers/net/wireless/iwlwifi/pcie/trans.c The iwlwifi conflict was resolved by keeping the code added in 'net' that turns off the buggy chip feature. The MAINTAINERS conflict was merely overlapping changes, one change updated all the wireless web site URLs and the other changed some GIT trees to be Johannes's instead of John's. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-12 21:59:18 -07:00
Jefferson Delfes	af7985bf85	Bluetooth: Fix flags of mgmt_device_found event Change flags field to matches userspace structure. This field needs to be converted to little endian before forward it. Signed-off-by: Jefferson Delfes <jefferson.delfes@openbossa.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-12 23:19:21 -03:00
Jiri Slaby	62f228acb8	TTY: ircomm, use tty from tty_port This also includes a switch to tty refcounting. It makes sure, the code no longer can access a freed TTY struct. Sometimes the only thing needed is to pass tty down to the callies. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: netdev@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-06-12 15:50:24 -07:00
Jiri Slaby	e673927d8a	TTY: ircomm, revamp locking Use self->spinlock only for ctrl_skb and tx_skb. TTY stuff is now protected by tty_port->lock. This is needed for further cleanup (and conversion to tty_port helpers). This also closes the race in the end of close. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: netdev@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-06-12 15:50:23 -07:00
Jiri Slaby	849d5a997f	TTY: ircomm, use flags from tty_port Switch to tty_port->flags. And while at it, remove redefined flags for them. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: netdev@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-06-12 15:50:23 -07:00
Jiri Slaby	580d27b449	TTY: ircomm, use open counts from tty_port Switch to tty_port->count and blocked_open. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: netdev@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-06-12 15:50:23 -07:00
Jiri Slaby	2a0213cb1e	TTY: ircomm, use close times from tty_port Switch to tty_port->close_delay and closing_wait. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: netdev@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-06-12 15:50:23 -07:00
Jiri Slaby	a3cc9fcff8	TTY: ircomm, add tty_port And use close/open_wait from there. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: netdev@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-06-12 15:50:23 -07:00
Eric Dumazet	5ee31c6898	bonding: Fix corrupted queue_mapping In the transmit path of the bonding driver, skb->cb is used to stash the skb->queue_mapping so that the bonding device can set its own queue mapping. This value becomes corrupted since the skb->cb is also used in __dev_xmit_skb. When transmitting through bonding driver, bond_select_queue is called from dev_queue_xmit. In bond_select_queue the original skb->queue_mapping is copied into skb->cb (via bond_queue_mapping) and skb->queue_mapping is overwritten with the bond driver queue. Subsequently in dev_queue_xmit, __dev_xmit_skb is called which writes the packet length into skb->cb, thereby overwriting the stashed queue mappping. In bond_dev_queue_xmit (called from hard_start_xmit), the queue mapping for the skb is set to the stashed value which is now the skb length and hence is an invalid queue for the slave device. If we want to save skb->queue_mapping into skb->cb[], best place is to add a field in struct qdisc_skb_cb, to make sure it wont conflict with other layers (eg : Qdiscc, Infiniband...) This patchs also makes sure (struct qdisc_skb_cb)->data is aligned on 8 bytes : netem qdisc for example assumes it can store an u64 in it, without misalignment penalty. Note : we only have 20 bytes left in (struct qdisc_skb_cb)->data[]. The largest user is CHOKe and it fills it. Based on a previous patch from Tom Herbert. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Tom Herbert <therbert@google.com> Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Roland Dreier <roland@kernel.org> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-12 15:29:21 -07:00
John W. Linville	0440507bbc	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2012-06-12 14:25:04 -04:00
Andrei Emeltchenko	5f246e8905	Bluetooth: Update HCI timeouts constants to use msecs_to_jiffies The HCI constants are always used in form of jiffies. So just include the conversion from msecs in the define itself. This has the advantage of making the code where the timeout is used more readable and avoiding unnecessary conversions. The patch is similar to commit `ba13ccd9` doing the same job for L2CAP Reported-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-12 00:07:05 -03:00
Gustavo Padovan	cbe461c526	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Conflicts: net/bluetooth/hci_event.c	2012-06-11 22:36:42 -03:00
David S. Miller	55afabaa0d	inet: Fix BUG triggered by __rt{,6}_get_peer(). If no peer actually gets attached (either because create is zero or the peer allocation fails) we'll trigger a BUG because we unconditionally do an rt{,6}_peer_ptr() afterwards. Fix this by guarding it with the proper check. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-11 15:52:29 -07:00
David S. Miller	67da255210	Merge branch 'master' of git://1984.lsi.us.es/net-next	2012-06-11 12:56:14 -07:00
John W. Linville	2e48686835	Merge tag 'nfc-next-3.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-3.0	2012-06-11 14:46:04 -04:00
David S. Miller	7b34ca2ac7	inet: Avoid potential NULL peer dereference. We handle NULL in rt{,6}_set_peer but then our caller will try to pass that NULL pointer into inet_putpeer() which isn't ready for it. Fix this by moving the NULL check one level up, and then remove the now unnecessary NULL check from inetpeer_ptr_set_peer(). Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-11 04:13:57 -07:00
David S. Miller	8e77327783	inet: Add inetpeer tree roots to the FIB tables. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-11 02:09:16 -07:00
David S. Miller	b48c80ece9	inet: Add family scope inetpeer flushes. This implementation can deal with having many inetpeer roots, which is a necessary prerequisite for per-FIB table rooted peer tables. Each family (AF_INET, AF_INET6) has a sequence number which we bump when we get a family invalidation request. Each peer lookup cheaply checks whether the flush sequence of the root we are using is out of date, and if so flushes it and updates the sequence number. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-11 02:09:10 -07:00
David S. Miller	46517008e1	ipv4: Kill ip_rt_frag_needed(). There is zero point to this function. It's only real substance is to perform an extremely outdated BSD4.2 ICMP check, which we can safely remove. If you really have a MTU limited link being routed by a BSD4.2 derived system, here's a nickel go buy yourself a real router. The other actions of ip_rt_frag_needed(), checking and conditionally updating the peer, are done by the per-protocol handlers of the ICMP event. TCP, UDP, et al. have a handler which will receive this event and transmit it back into the associated route via dst_ops->update_pmtu(). This simplification is important, because it eliminates the one place where we do not have a proper route context in which to make an inetpeer lookup. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-11 02:08:59 -07:00
David S. Miller	97bab73f98	inet: Hide route peer accesses behind helpers. We encode the pointer(s) into an unsigned long with one state bit. The state bit is used so we can store the inetpeer tree root to use when resolving the peer later. Later the peer roots will be per-FIB table, and this change works to facilitate that. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-11 02:08:47 -07:00
Chun-Yeow Yeoh	a4f606ea73	{nl,cfg,mac}80211: fix the coding style related to mesh parameters fix the coding style related to mesh parameters, especially the indentation, as pointed out by Johannes Berg. Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-06-11 09:23:45 +02:00
Chun-Yeow Yeoh	3ddd53f392	cfg80211: add missing kernel-doc for mesh configuration structure Add the missing kernel-doc for mesh configuration parameters as pointed out by Johannes Berg. Signed-off-by: Chun-Yeow Yeoh <yeohchunyeow@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-06-11 09:22:57 +02:00
Roland Dreier	c5d21c4b2a	net: Reorder initialization in ip_route_output to fix gcc warning If I build with W=1, for every file that includes <net/route.h>, I get the warning include/net/route.h: In function 'ip_route_output': include/net/route.h:135:3: warning: initialized field overwritten [-Woverride-init] include/net/route.h:135:3: warning: (near initialization for 'fl4') [-Woverride-init] (This is with "gcc (Debian 4.6.3-1) 4.6.3") A fix seems pretty trivial: move the initialization of .flowi4_tos earlier. As far as I can tell, this has no effect on code generation. Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-11 00:04:47 -07:00
David S. Miller	c0efc887dc	inet: Pass inetpeer root into inet_getpeer*() interfaces. Otherwise we reference potentially non-existing members when ipv6 is disabled. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-09 19:12:36 -07:00
David S. Miller	56a6b248eb	inet: Consolidate inetpeer_invalidate_tree() interfaces. We only need one interface for this operation, since we always know which inetpeer root we want to flush. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-09 16:32:41 -07:00
David S. Miller	c3426b4719	inet: Initialize per-netns inetpeer roots in net/ipv{4,6}/route.c Instead of net/ipv4/inetpeer.c Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-09 16:27:05 -07:00
David S. Miller	2397849baa	[PATCH] tcp: Cache inetpeer in timewait socket, and only when necessary. Since it's guarenteed that we will access the inetpeer if we're trying to do timewait recycling and TCP options were enabled on the connection, just cache the peer in the timewait socket. In the future, inetpeer lookups will be context dependent (per routing realm), and this helps facilitate that as well. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-09 14:56:12 -07:00
Johannes Berg	d13e141481	mac80211: add some missing kernel-doc Add a few kernel-doc descriptions that were missed during development. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2012-06-09 10:31:09 +02:00
David S. Miller	4670fd819e	tcp: Get rid of inetpeer special cases. The get_peer method TCP uses is full of special cases that make no sense accommodating, and it also gets in the way of doing more reasonable things here. First of all, if the socket doesn't have a usable cached route, there is no sense in trying to optimize timewait recycling. Likewise for the case where we have IP options, such as SRR enabled, that make the IP header destination address (and thus the destination address of the route key) differ from that of the connection's destination address. Just return a NULL peer in these cases, and thus we're also able to get rid of the clumsy inetpeer release logic. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-09 01:25:47 -07:00
David S. Miller	fbfe95a42e	inet: Create and use rt{,6}_get_peer_create(). There's a lot of places that open-code rt{,6}_get_peer() only because they want to set 'create' to one. So add an rt{,6}_get_peer_create() for their sake. There were also a few spots open-coding plain rt{,6}_get_peer() and those are transformed here as well. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-08 23:24:18 -07:00
Johan Hedberg	1c2e004183	Bluetooth: Add support for encryption key refresh With LE/SMP the completion of a security level elavation from medium to high is indicated by a HCI Encryption Key Refresh Complete event. The necessary behavior upon receiving this event is a mix of what's done for auth_complete and encryption_change, which is also where most of the event handling code has been copied from. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-08 21:00:40 -03:00
Eric Dumazet	7123aaa3a1	af_unix: speedup /proc/net/unix /proc/net/unix has quadratic behavior, and can hold unix_table_lock for a while if high number of unix sockets are alive. (90 ms for 200k sockets...) We already have a hash table, so its quite easy to use it. Problem is unbound sockets are still hashed in a single hash slot (unix_socket_table[UNIX_HASH_TABLE]) This patch also spreads unbound sockets to 256 hash slots, to speedup both /proc/net/unix and unix_diag. Time to read /proc/net/unix with 200k unix sockets : (time dd if=/proc/net/unix of=/dev/null bs=4k) before : 520 secs after : 2 secs Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-08 14:27:23 -07:00
Gao feng	54db0cc2ba	inetpeer: add parameter net for inet_getpeer_v4,v6 add struct net as a parameter of inet_getpeer_v[4,6], use net to replace &init_net. and modify some places to provide net for inet_getpeer_v[4,6] Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-08 14:27:23 -07:00
Gao feng	c8a627ed06	inetpeer: add namespace support for inetpeer now inetpeer doesn't support namespace,the information will be leaking across namespace. this patch move the global vars v4_peers and v6_peers to netns_ipv4 and netns_ipv6 as a field peers. add struct pernet_operations inetpeer_ops to initial pernet inetpeer data. and change family_to_base and inet_getpeer to support namespace. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-08 14:27:23 -07:00
Gao feng	8264deb818	netfilter: nf_conntrack: add namespace support for cttimeout This patch adds namespace support for cttimeout. Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-07 14:58:41 +02:00
Pablo Neira Ayuso	e76d0af5e4	netfilter: nf_conntrack: remove now unused sysctl for nf_conntrack_l[3\|4]proto Since the sysctl data for l[3\|4]proto now resides in pernet nf_proto_net. We can now remove this unused fields from struct nf_contrack_l[3,4]proto. Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-07 14:58:41 +02:00
Gao feng	7080ba0955	netfilter: nf_ct_icmp: add namespace support This patch adds namespace support for ICMPv6 protocol tracker. Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-07 14:58:40 +02:00
Gao feng	4b626b9c5d	netfilter: nf_ct_icmp: add namespace support This patch adds namespace support for ICMP protocol tracker. Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-07 14:58:40 +02:00
Gao feng	0ce490ad43	netfilter: nf_ct_udp: add namespace support This patch adds namespace support for UDP protocol tracker. Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-07 14:58:40 +02:00
Gao feng	d2ba1fde42	netfilter: nf_ct_tcp: add namespace support This patch adds namespace support for TCP protocol tracker. Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-07 14:58:39 +02:00
Gao feng	15f585bd76	netfilter: nf_ct_generic: add namespace support This patch adds namespace support for the generic layer 4 protocol tracker. Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-07 14:58:39 +02:00
Gao feng	524a53e5ad	netfilter: nf_conntrack: prepare namespace support for l3 protocol trackers This patch prepares the namespace support for layer 3 protocol trackers. Basically, this modifies the following interfaces: * nf_ct_l3proto_[un]register_sysctl. * nf_conntrack_l3proto_[un]register. We add a new nf_ct_l3proto_net is used to get the pernet data of l3proto. This adds rhe new struct nf_ip_net that is used to store the sysctl header and l3proto_ipv4,l4proto_tcp(6),l4proto_udp(6),l4proto_icmp(v6) because the protos such tcp and tcp6 use the same data,so making nf_ip_net as a field of netns_ct is the easiest way to manager it. This patch also adds init_net to struct nf_conntrack_l3proto to initial the layer 3 protocol pernet data. Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-07 14:58:39 +02:00
Gao feng	2c352f444c	netfilter: nf_conntrack: prepare namespace support for l4 protocol trackers This patch prepares the namespace support for layer 4 protocol trackers. Basically, this modifies the following interfaces: * nf_ct_[un]register_sysctl * nf_conntrack_l4proto_[un]register to include the namespace parameter. We still use init_net in this patch to prepare the ground for follow-up patches for each layer 4 protocol tracker. We add a new net_id field to struct nf_conntrack_l4proto that is used to store the pernet_operations id for each layer 4 protocol tracker. Note that AF_INET6's protocols do not need to do sysctl compat. Thus, we only register compat sysctl when l4proto.l3proto != AF_INET6. Acked-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-06-07 14:58:39 +02:00
Johannes Berg	2eb278e083	mac80211: unify SW/offload remain-on-channel Redesign all the off-channel code, getting rid of the generic off-channel work concept, replacing it with a simple remain-on-channel list. This fixes a number of small issues with the ROC implementation: * offloaded remain-on-channel couldn't be queued, now we can queue it as well, if needed * in iwlwifi (the only user) offloaded ROC is mutually exclusive with scanning, use the new queue to handle that case -- I expect that it will later depend on a HW flag The bigger issue though is that there's a bad bug in the current implementation: if we get a mgmt TX request while HW roc is active, and this new request has a wait time, we actually schedule a software ROC instead since we can't guarantee the existing offloaded ROC will still be that long. To fix this, the queuing mechanism was needed. The queuing mechanism for offloaded ROC isn't yet optimal, ideally we should add API to have the HW extend the ROC if needed. We could add that later but for now use a software implementation. Overall, this unifies the behaviour between the offloaded and software-implemented case as much as possible. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-06-06 15:31:18 -04:00
Johannes Berg	196ac1c13d	mac80211: do remain-on-channel while idle The IDLE handling in HW off-channel is broken right now since we turn off IDLE only when the off-channel period already started. Therefore, all drivers that use it today (only iwlwifi!) must support off-channel while idle, so playing with idle isn't needed at all. Off-channel in general, since it's no longer used for authentication/association, shouldn't affect PS, so also remove that logic. Also document a small caveat for reporting TX status from off-channel frames in HW remain-on-channel. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-06-06 15:20:33 -04:00
Johannes Berg	e8c9bd5b8d	cfg80211: clarify set_channel APIs Now that we've removed all uses of the set_channel API except for the monitor channel and in libertas, clarify this. Split the libertas mesh use into a new libertas_set_mesh_channel() operation, just to keep backward compatibility, and rename the normal set_channel() to set_monitor_channel(). Also describe the desired set_monitor_channel() semantics more clearly. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-06-06 15:18:17 -04:00
Eric Dumazet	55432d2b54	inetpeer: fix a race in inetpeer_gc_worker() commit `5faa5df1fa` (inetpeer: Invalidate the inetpeer tree along with the routing cache) added a race : Before freeing an inetpeer, we must respect a RCU grace period, and make sure no user will attempt to increase refcnt. inetpeer_invalidate_tree() waits for a RCU grace period before inserting inetpeer tree into gc_list and waking the worker. At that time, no concurrent lookup can find a inetpeer in this tree. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-06 10:45:15 -07:00
Johannes Berg	cc1d2806bf	cfg80211: provide channel to join_mesh function Just like the AP mode patch, instead of setting the channel and then joining the mesh network, provide the channel to join the network on to the join_mesh() function. Like in AP mode, you can also give the channel to the join-mesh nl80211 command now. Unlike AP mode, it picks a default channel if none was given. As libertas uses mesh mode interfaces but has no join_mesh callback and we can't simply break it, keep some compatibility code for that case and configure the channel directly for it. In the non-libertas case, where we store the channel until join, allow setting it while the interface is down. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-06-05 15:32:18 -04:00
Johannes Berg	aa430da410	cfg80211: provide channel to start_ap function Instead of setting the channel first and then starting the AP, let cfg80211 store the channel and provide it as one of the AP settings. This means that now you have to set the channel before you can start an AP interface, but since hostapd/wpa_supplicant always do that we're OK with this change. Alternatively, it's now possible to give the channel as an attribute to the start-ap nl80211 command, overriding any preset channel. Cc: Kalle Valo <kvalo@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-06-05 15:32:16 -04:00
Johannes Berg	d58e7e37aa	cfg80211: simplify cfg80211_can_beacon_sec_chan API Change cfg80211_can_beacon_sec_chan() to return true if there is no secondary channel to simplify all the current users of it. They all check the channel type before calling the function because it returns false if there's no secondary channel. Also actually document the return value. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-06-05 15:32:16 -04:00
Eliad Peller	51ca9d8db2	mac80211: remove ieee80211_get_operstate() ieee80211_get_operstate() was used by drivers in order to know whether the sta link is up, but it's no longer needed (nor used) as mac80211 notifies the drivers about authorization changes (via the sta_state callback) Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-06-05 15:32:10 -04:00
Joe Perches	499f42bb03	net: mac80211: Add and use ibss_vdbg debugging macro Simplify the use of #ifdef CONFIG_MAC80211_IBSS_DEBUG/#endif by adding a logging macro to encapsulate the test. Convert the appropriate uses too. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-06-05 15:32:10 -04:00
Joe Perches	d63e9ae3b1	net: mac80211: Add and use ht_vdbg debugging macro Simplify the use of #ifdef CONFIG_MAC80211_HT_DEBUG/#endif by adding a logging macro to encapsulate the test. Convert the appropriate uses too. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-06-05 15:32:10 -04:00
Arik Nemtsov	72d7872852	mac80211: allow low-level drivers to set netdev feature bits Low level drivers can now set certain netdev feature bits in netdev_features member of the ieee80211_hw struct. These will be propagated to every netdev created from this HW. The white-listed features currently include only ones related to HW checksumming. Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-06-05 15:21:46 -04:00
Gustavo Padovan	7e1af8a3a5	Bluetooth: Create empty l2cap ops function A2MP doesn't use part of the L2CAP chan ops API so we just create general empty function instead of the A2MP specific one. Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2012-06-05 06:34:16 +03:00
Andre Guedes	8c3a4f004e	Bluetooth: Rename L2CAP_LE_DEFAULT_MTU This patch renames L2CAP_LE_DEFAULT_MTU macro to L2CAP_LE_MIN_MTU since it represents the minimum MTU value, not the default MTU value for LE. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-05 06:34:16 +03:00
Szymon Janc	1afd5be87e	Bluetooth: Remove unused HCI timeouts definitions Those are not used anywhere in code (and never were since introduction in 2006) so just remove them. Signed-off-by: Szymon Janc <szymon.janc@tieto.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-05 06:34:14 +03:00
Andrei Emeltchenko	97e8e89d2d	Bluetooth: A2MP: Manage incoming connections Handle incoming A2MP connection by creating AMP manager and processing A2MP messages. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-05 06:34:14 +03:00
Andrei Emeltchenko	416fa7527d	Bluetooth: A2MP: Handling fixed channels A2MP fixed channel do not have sk Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-05 06:34:13 +03:00
Andrei Emeltchenko	8598d064cb	Bluetooth: A2MP: Process A2MP Discover Request Adds helper functions to count HCI devs and process A2MP Discover Request, code makes sure that first controller in the list is BREDR one. Trace is shown below: ... > ACL data: handle 11 flags 0x02 dlen 16 A2MP: Discover req: mtu/mps 670 mask: 0x0000 < ACL data: handle 11 flags 0x00 dlen 22 A2MP: Discover rsp: mtu/mps 670 mask: 0x0000 Controller list: id 0 type 0 (BR-EDR) status 0x01 (Bluetooth only) id 1 type 1 (802.11 AMP) status 0x01 (Bluetooth only) ... Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-05 06:34:12 +03:00
Andrei Emeltchenko	e7af522e04	Bluetooth: A2MP: Define A2MP status codes A2MP status codes copied from Bluez patch sent by Peter Krystad <pkrystad@codeaurora.org>. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-05 06:34:12 +03:00
Andrei Emeltchenko	b9058fb67c	Bluetooth: A2MP: Definitions for A2MP commands Define A2MP command IDs and packet structures. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-05 06:34:12 +03:00
Andrei Emeltchenko	f6d3c6e783	Bluetooth: A2MP: Build and Send msg helpers Helper function to build and send A2MP messages. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-05 06:34:12 +03:00
Andrei Emeltchenko	9740e49d17	Bluetooth: A2MP: AMP Manager basic functions Define AMP Manager and some basic functions. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-05 06:34:11 +03:00
Andrei Emeltchenko	466f8004f3	Bluetooth: A2MP: Create A2MP channel Create and initialize fixed A2MP channel Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-05 06:34:11 +03:00
Andrei Emeltchenko	54a59aa2b5	Bluetooth: Add l2cap_chan->ops->ready() This move socket specific code to l2cap_sock.c. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-06-05 06:34:11 +03:00
Andrei Emeltchenko	c0df7f6e06	Bluetooth: Move clean up code and set of SOCK_ZAPPED to l2cap_sock.c This remove a bit more of socket code from l2cap core, this calls set the SOCK_ZAPPED and do some clean up depending on the socket state. Reported-by: Mat Martineau <mathewm@codeaurora.org> Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-06-05 06:34:10 +03:00
Gustavo Padovan	80b9802795	Bluetooth: Use chan as parameters for l2cap chan ops Use chan instead of void * makes more sense here. Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-06-05 06:34:10 +03:00
Andrei Emeltchenko	523e93cdb3	Bluetooth: Define HCI AMP cmd struct Add HCI commands to deal with Bluetooth AMP controllers. Those commands will be used by bluetooth and softamp code. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-06-05 06:34:09 +03:00
Andrei Emeltchenko	2983fd6824	Bluetooth: Define and use PSM identifiers Define assigned Protocol and Service Multiplexor (PSM) identifiers and use them instead of magic numbers. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-05 06:34:09 +03:00
Andrei Emeltchenko	59e54bd15d	Bluetooth: Define L2CAP conf continuation flag Define Continuation flag which the only flag used from Flags field in L2CAP Configuration Request and Response. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-06-05 06:34:08 +03:00
Gustavo Padovan	8c520a5992	Bluetooth: Remove unnecessary headers include Most of the include were unnecessary or already included by some other header. Replace module.h by export.h where possible. Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-06-05 06:34:08 +03:00
Gustavo Padovan	c3c7ea6594	Bluetooth: Fix coding style in include/net/bluetooth Fix all warning and errors reported by checkpatch but license trailing whitespace and bdaddr_t definition. Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-06-05 06:34:08 +03:00
Andrei Emeltchenko	9b3b44604a	Bluetooth: Use defined link key size Remove magic number with defined link key size. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-06-05 06:34:06 +03:00
Szymon Janc	a6c511c636	Bluetooth: Rename HCI_QUIRK_NO_RESET to HCI_QUIRK_RESET_ON_CLOSE HCI_QUIRK_NO_RESET name is misleading - purpose of this quirk is to reset device on close instead of init, not to not reset at all. Rename it to HCI_QUIRK_RESET_ON_CLOSE to avoid confusion. Signed-off-by: Szymon Janc <szymon.janc@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-06-05 06:34:06 +03:00
Gustavo Padovan	38351c66e4	Bluetooth: Fix trailing whitespaces in license text As reported by checkpatch.pl Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-06-05 06:34:06 +03:00
Mat Martineau	522cc2ee6e	Bluetooth: Remove unused ERTM control field macros Now that l2cap_ctrl is used to set up control fields, these macros are not needed. Signed-off-by: Mat Martineau <mathewm@codeaurora.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-05 06:34:05 +03:00
Mat Martineau	4239d16f36	Bluetooth: Check rules when setting retransmit or monitor timers The ERTM specification requires the retransmit timer to be cancelled when the monitor timer is set. The retransmit timer cannot be set again while the monitor timer is pending. Signed-off-by: Mat Martineau <mathewm@codeaurora.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-05 06:34:04 +03:00
Mat Martineau	f5dbb0772d	Bluetooth: Remove receive code that has been superceded This deletes the receive code that had handlers for each frame type at the top level, and then had logic to determine the receive state within each handler. Signed-off-by: Mat Martineau <mathewm@codeaurora.org> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2012-06-05 06:34:03 +03:00
Mat Martineau	2827011f66	Bluetooth: Fix early return from l2cap_chan_del This fixes a regression from commit `2ead70b839` that is present in all kernels starting at v3.0. When L2CAP information was moved to struct l2cap_chan, a check was added to l2cap_chan_del to avoid certain cleanup operations when ERTM or streaming mode had not yet been initialized. The logic in the check did not take in to account that chan->conf_state is set to 0 in l2cap_chan_ready, so l2cap_chan_del failed to cancel timers and leaked memory any time the ERTM queues or lists were not empty. This change makes sure that l2cap_chan_del only returns early if ERTM initialization was not performed. Signed-off-by: Mat Martineau <mathewm@codeaurora.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-06-05 06:34:02 +03:00
Samuel Ortiz	73167ced31	NFC: Introduce target mode rx data callback This routine will be called by drivers whenever they receive data in target mode. This should be unexpected events and as such should be handled by a standalone API (i.e. not as a callback pointer from an existing API). Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-06-04 21:34:31 +02:00
Samuel Ortiz	be9ae4ce4e	NFC: Introduce target mode tx ops And rename the initiator mode data exchange ops for consistency sake. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-06-04 21:34:30 +02:00
Samuel Ortiz	f212ad5e99	NFC: Set the NFC device RF mode appropriately Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-06-04 21:34:30 +02:00
Samuel Ortiz	fc40a8c1a0	NFC: Add target mode activation netlink event Userspace gets a netlink event upon target mode activation. The LLCP layer is also signaled when we get an ATR_REQ in order to get the remote general bytes. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-06-04 21:34:30 +02:00
Samuel Ortiz	fe7c580073	NFC: Add target mode protocols to the polling loop startup routine Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-06-04 21:34:29 +02:00
Samuel Ortiz	ab73b75130	NFC: Export LLCP general bytes getter Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2012-06-04 21:34:29 +02:00
Paul Moore	20e2a86485	cipso: handle CIPSO options correctly when NetLabel is disabled When NetLabel is not enabled, e.g. CONFIG_NETLABEL=n, and the system receives a CIPSO tagged packet it is dropped (cipso_v4_validate() returns non-zero). In most cases this is the correct and desired behavior, however, in the case where we are simply forwarding the traffic, e.g. acting as a network bridge, this becomes a problem. This patch fixes the forwarding problem by providing the basic CIPSO validation code directly in ip_options_compile() without the need for the NetLabel or CIPSO code. The new validation code can not perform any of the CIPSO option label/value verification that cipso_v4_validate() does, but it can verify the basic CIPSO option format. The behavior when NetLabel is enabled is unchanged. Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-01 14:18:29 -04:00
Linus Torvalds	13199a0845	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking changes from David S. Miller: 1) Fix IPSEC header length calculation for transport mode in ESP. The issue is whether to do the calculation before or after alignment. Fix from Benjamin Poirier. 2) Fix regression in IPV6 IPSEC fragment length calculations, from Gao Feng. This is another transport vs tunnel mode issue. 3) Handle AF_UNSPEC connect()s properly in L2TP to avoid OOPSes. Fix from James Chapman. 4) Fix USB ASIX driver's reception of full sized VLAN packets, from Eric Dumazet. 5) Allow drop monitor (and, more generically, all generic netlink protocols) to be automatically loaded as a module. From Neil Horman. Fix up trivial conflict in Documentation/feature-removal-schedule.txt due to new entries added next to each other at the end. As usual. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits) net/smsc911x: Repair broken failure paths virtio-net: remove useless disable on freeze netdevice: Update netif_dbg for CONFIG_DYNAMIC_DEBUG drop_monitor: Add module alias to enable automatic module loading genetlink: Build a generic netlink family module alias net: add MODULE_ALIAS_NET_PF_PROTO_NAME r6040: Do a Proper deinit at errorpath and also when driver unloads (calling r6040_remove_one) r6040: disable pci device if the subsequent calls (after pci_enable_device) fails skb: avoid unnecessary reallocations in __skb_cow net: sh_eth: fix the rxdesc pointer when rx descriptor empty happens asix: allow full size 8021Q frames to be received rds_rdma: don't assume infiniband device is PCI l2tp: fix oops in L2TP IP sockets for connect() AF_UNSPEC case mac80211: fix ADDBA declined after suspend with wowlan wlcore: fix undefined symbols when CONFIG_PM is not defined mac80211: fix flag check for QoS NOACK frames ath9k_hw: apply internal regulator settings on AR933x ath9k_hw: update AR933x initvals to fix issues with high power devices ath9k: fix a use-after-free-bug when ath_tx_setup_buffer() fails ath9k: stop rx dma before stopping tx ...	2012-05-31 10:32:36 -07:00
Glauber Costa	3f13461939	memcg: decrement static keys at real destroy time We call the destroy function when a cgroup starts to be removed, such as by a rmdir event. However, because of our reference counters, some objects are still inflight. Right now, we are decrementing the static_keys at destroy() time, meaning that if we get rid of the last static_key reference, some objects will still have charges, but the code to properly uncharge them won't be run. This becomes a problem specially if it is ever enabled again, because now new charges will be added to the staled charges making keeping it pretty much impossible. We just need to be careful with the static branch activation: since there is no particular preferred order of their activation, we need to make sure that we only start using it after all call sites are active. This is achieved by having a per-memcg flag that is only updated after static_key_slow_inc() returns. At this time, we are sure all sites are active. This is made per-memcg, not global, for a reason: it also has the effect of making socket accounting more consistent. The first memcg to be limited will trigger static_key() activation, therefore, accounting. But all the others will then be accounted no matter what. After this patch, only limited memcgs will have its sockets accounted. [akpm@linux-foundation.org: move enum sock_flag_bits into sock.h, document enum sock_flag_bits, convert memcg_proto_active() and memcg_proto_activated() to test_bit(), redo tcp_update_limit() comment to 80 cols] Signed-off-by: Glauber Costa <glommer@parallels.com> Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-29 16:22:28 -07:00
Gao feng	0c1833797a	ipv6: fix incorrect ipsec fragment Since commit `ad0081e43a` "ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed" the fragment of packets is incorrect. because tunnel mode needs IPsec headers and trailer for all fragments, while on transport mode it is sufficient to add the headers to the first fragment and the trailer to the last. so modify mtu and maxfraglen base on ipsec mode and if fragment is first or last. with my test,it work well(every fragment's size is the mtu) and does not trigger slow fragment path. Changes from v1: though optimization, mtu_prev and maxfraglen_prev can be delete. replace xfrm mode codes with dst_entry's new frag DST_XFRM_TUNNEL. add fuction ip6_append_data_mtu to make codes clearer. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-27 01:11:22 -04:00
Linus Torvalds	28f3d71761	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull more networking updates from David Miller: "Ok, everything from here on out will be bug fixes." 1) One final sync of wireless and bluetooth stuff from John Linville. These changes have all been in his tree for more than a week, and therefore have had the necessary -next exposure. John was just away on a trip and didn't have a change to send the pull request until a day or two ago. 2) Put back some defines in user exposed header file areas that were removed during the tokenring purge. From Stephen Hemminger and Paul Gortmaker. 3) A bug fix for UDP hash table allocation got lost in the pile due to one of those "you got it.. no I've got it.." situations. :-) From Tim Bird. 4) SKB coalescing in TCP needs to have stricter checks, otherwise we'll try to coalesce overlapping frags and crash. Fix from Eric Dumazet. 5) RCU routing table lookups can race with free_fib_info(), causing crashes when we deref the device pointers in the route. Fix by releasing the net device in the RCU callback. From Yanmin Zhang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (293 commits) tcp: take care of overlaps in tcp_try_coalesce() ipv4: fix the rcu race between free_fib_info and ip_route_output_slow mm: add a low limit to alloc_large_system_hash ipx: restore token ring define to include/linux/ipx.h if: restore token ring ARP type to header xen: do not disable netfront in dom0 phy/micrel: Fix ID of KSZ9021 mISDN: Add X-Tensions USB ISDN TA XC-525 gianfar:don't add FCB length to hard_header_len Bluetooth: Report proper error number in disconnection Bluetooth: Create flags for bt_sk() Bluetooth: report the right security level in getsockopt Bluetooth: Lock the L2CAP channel when sending Bluetooth: Restore locking semantics when looking up L2CAP channels Bluetooth: Fix a redundant and problematic incoming MTU check Bluetooth: Add support for Foxconn/Hon Hai AR5BBU22 0489:E03C Bluetooth: Fix EIR data generation for mgmt_device_found Bluetooth: Fix Inquiry with RSSI event mask Bluetooth: improve readability of l2cap_seq_list code Bluetooth: Fix skb length calculation ...	2012-05-24 11:54:29 -07:00
Linus Torvalds	88d6ae8dc3	Merge branch 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "cgroup file type addition / removal is updated so that file types are added and removed instead of individual files so that dynamic file type addition / removal can be implemented by cgroup and used by controllers. blkio controller changes which will come through block tree are dependent on this. Other changes include res_counter cleanup and disallowing kthread / PF_THREAD_BOUND threads to be attached to non-root cgroups. There's a reported bug with the file type addition / removal handling which can lead to oops on cgroup umount. The issue is being looked into. It shouldn't cause problems for most setups and isn't a security concern." Fix up trivial conflict in Documentation/feature-removal-schedule.txt * 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits) res_counter: Account max_usage when calling res_counter_charge_nofail() res_counter: Merge res_counter_charge and res_counter_charge_nofail cgroups: disallow attaching kthreadd or PF_THREAD_BOUND threads cgroup: remove cgroup_subsys->populate() cgroup: get rid of populate for memcg cgroup: pass struct mem_cgroup instead of struct cgroup to socket memcg cgroup: make css->refcnt clearing on cgroup removal optional cgroup: use negative bias on css->refcnt to block css_tryget() cgroup: implement cgroup_rm_cftypes() cgroup: introduce struct cfent cgroup: relocate __d_cgrp() and __d_cft() cgroup: remove cgroup_add_file[s]() cgroup: convert memcg controller to the new cftype interface memcg: always create memsw files if CONFIG_CGROUP_MEM_RES_CTLR_SWAP cgroup: convert all non-memcg controllers to the new cftype interface cgroup: relocate cftype and cgroup_subsys definitions in controllers cgroup: merge cft_release_agent cftype array into the base files array cgroup: implement cgroup_add_cftypes() and friends cgroup: build list of all cgroups under a given cgroupfs_root cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir() ...	2012-05-22 17:40:19 -07:00
John W. Linville	a0d0d1685f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next	2012-05-22 15:18:06 -04:00
Eric Dumazet	a50feda546	ipv6: bool/const conversions phase2 Mostly bool conversions, some inline removals and const additions. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-19 01:08:16 -04:00
Eric Dumazet	92113bfde2	ipv6: bool conversions phase1 ipv6_opt_accepted() returns a bool, and can use const pointers ipv6_addr_equal(), ipv6_addr_any(), ipv6_addr_loopback(), ipv6_addr_orchid() return a bool. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-18 02:24:13 -04:00
Eric Dumazet	cbc264cacd	ip_frag: struct inet_frags match() method returns a bool - match() method returns a boolean - return (A && B && C && D) -> return A && B && C && D - fix indentation Signed-off-by: Eric Dumazet <edumazet@google.com>	2012-05-18 01:40:27 -04:00
Joe Perches	a508da6cc0	lapb: Neaten debugging Enable dynamic debugging and remove a bunch of #ifdef/#endifs. Add a lapb_dbg(level, fmt, ...) macro and replace the printk(KERN_DEBUG uses. Add pr_fmt and remove embedded prefixes. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-17 18:45:20 -04:00
Eric Dumazet	a2a385d627	tcp: bool conversions bool conversions where possible. __inline__ -> inline space cleanups Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-17 14:59:59 -04:00
Eric Dumazet	dc6b9b7823	net: include/net/sock.h cleanup bool/const conversions where possible __inline__ -> inline space cleanups Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-17 04:50:21 -04:00
David S. Miller	028940342a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2012-05-16 22:17:37 -04:00
John W. Linville	05f8f25276	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2012-05-16 15:38:11 -04:00
Eric Dumazet	1b23a5dfc2	net: sock_flag() cleanup - sock_flag() accepts a const pointer - sock_flag() returns a boolean Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-16 15:30:26 -04:00
Eric Dumazet	865ec5523d	fq_codel: should use qdisc backlog as threshold codel_should_drop() logic allows a packet being not dropped if queue size is under max packet size. In fq_codel, we have two possible backlogs : The qdisc global one, and the flow local one. The meaningful one for codel_should_drop() should be the global backlog, not the per flow one, so that thin flows can have a non zero drop/mark probability. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Dave Taht <dave.taht@bufferbloat.net> Cc: Kathleen Nichols <nichols@pollere.com> Cc: Van Jacobson <van@pollere.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-16 15:30:26 -04:00
alex.bluesman.smirnov@gmail.com	0606069d9e	mac802154: monitor device support Support for monitor device intended to capture all the network activity. This interface could be used by networks sniffers and is already supported by WireShark. That's a good test point to check that basic MAC support works. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-16 15:17:08 -04:00
alex.bluesman.smirnov@gmail.com	90c049b2c6	ieee802154: interface type to be added This stack implementation distinguishes several types of slave interfaces. Another parameter to 'add_iface_' function is added to clarify the interface type is going to be registered. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-16 15:17:08 -04:00
alex.bluesman.smirnov@gmail.com	74a02fcf77	mac802154: declare reduced mlme operations According IEEE 802.15.4 standard each node can be either full functionality device (FFD) or reduce functionality device (RFD). So 2 sets of operations are needed. This patch declare RFD operations structure. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-16 15:16:56 -04:00
alex.bluesman.smirnov@gmail.com	1cd829c83e	mac802154: RX data path Main RX data path implementation between physical and mac layers. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-16 15:16:44 -04:00
alex.bluesman.smirnov@gmail.com	1010f54018	mac802154: allocation of ieee802154 device An interface to allocate and register ieee802154 compatible device. The allocated device has the following representation in memory: +-----------------------+ \| struct wpan_phy \| +-----------------------+ \| struct mac802154_priv \| +-----------------------+ \| driver's private data \| +-----------------------+ Used by device drivers to register new instance in the stack. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-16 15:16:35 -04:00
alex.bluesman.smirnov@gmail.com	0afd7ad9de	mac802154: basic ieee802.15.4 device structures The IEEE 802.15.4 Working Group focuses on the standardization of the bottom two layers of ISO/OSI protocol stack: Physical (PHY) and MAC. The MAC layer provides access control to a shared channel and reliable data delivery. The main functions performed by the MAC sublayer are: association and disassociation, security control, optional star network topology functions, such as beacon generation and Guaranteed Time Slots (GTSs) management, generation of ACK frames (if used), and, finally, application support for the two possible network topologies described in the standard. This is an initial commit which describes main data structures needed for ieee802.15.4 compatible devices representation in the MAC layer. Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-16 15:16:14 -04:00
Gustavo Padovan	c5daa683f2	Bluetooth: Create flags for bt_sk() defer_setup and suspended are now flags into bt_sk(). Signed-off-by: Gustavo Padovan <gustavo@padovan.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2012-05-16 16:14:17 -03:00
Mat Martineau	a6a5568c03	Bluetooth: Lock the L2CAP channel when sending The ERTM and streaming mode transmit queue must only be accessed while the L2CAP channel lock is held. Locking the channel before calling l2cap_chan_send ensures that multiple threads cannot simultaneously manipulate the queue when sending and receiving concurrently. L2CAP channel locking had previously moved to the l2cap_chan struct instead of the associated socket, so some of the old socket locking can also be removed in this patch. Signed-off-by: Mat Martineau <mathewm@codeaurora.org> Signed-off-by: Gustavo Padovan <gustavo@padovan.org>	2012-05-16 16:14:02 -03:00
Vishal Agarwal	9d939d9484	Bluetooth: Fix EIR data generation for mgmt_device_found The mgmt_device_found function expects to receive only the significant part of the EIR data so it needs to be removed before calling the function. This patch adds a new eir_get_length() helper function to calculate the length of the significant part. Signed-off-by: Vishal Agarwal <vishal.agarwal@stericsson.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2012-05-16 16:13:19 -03:00
Gustavo Padovan	08e6d907fe	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth	2012-05-16 16:11:44 -03:00
Johannes Berg	294a20e039	cfg80211: fix cfg80211_can_beacon_sec_chan prototype It should return bool, not int. The function even does return true/false. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-05-16 13:08:15 -04:00
Johannes Berg	ac55d2fe05	mac80211: (selectively) add HT details in radiotap Add a flag for the HT format (mixed vs. greenfield) to allow drivers to report that on receive. Not all drivers will do that though, so allow drivers to set which radiotap MCS details they report. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-05-16 12:46:38 -04:00
Janusz.Dziedzic@tieto.com	ee70108fa2	mac80211: Add IV-room in the skb for TKIP and WEP Add IV-room in skb also for TKIP and WEP. Extend patch: "mac80211: support adding IV-room in the skb for CCMP keys" Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-05-16 12:46:37 -04:00
David S. Miller	c727e7f007	Merge branch 'delete-tokenring' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux	2012-05-16 01:02:40 -04:00
Paul Gortmaker	211ed86510	net: delete all instances of special processing for token ring We are going to delete the Token ring support. This removes any special processing in the core networking for token ring, (aside from net/tr.c itself), leaving the drivers and remaining tokenring support present but inert. The mass removal of the drivers and net/tr.c will be in a separate commit, so that the history of these files that we still care about won't have the giant deletion tied into their history. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-05-15 20:14:35 -04:00
Eric Lapuyade	03bed29e05	NFC: HCI drivers don't have to keep track of polling state The NFC core code already does that for them. Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-05-15 17:31:22 -04:00
Eric Lapuyade	1676f75159	NFC: Add HCI/SHDLC support to let driver check for tag presence Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-05-15 17:28:00 -04:00
Eric Lapuyade	d4ccb13280	NFC: Specify usage for targets found and target lost events It is now specified that nfc_target_found() and nfc_target_lost() core functions must not be called from an atomic context. This allow us to serialize calls and protect the targets table using the nfc device lock instead of a spinlock. Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-05-15 17:28:00 -04:00
Eric Lapuyade	addfabf98d	NFC: Remove useless HCI private nfc target table Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-05-15 17:28:00 -04:00
Eric Lapuyade	9009943326	NFC: Cache the core NFC active target pointer instead of its index The NFC Core now caches the active nfc target pointer, thereby avoiding the need to lookup the target table for each invocation of a driver ops. Consequently, pn533, HCI and NCI now directly receive an nfc_target pointer instead of a target index. Cc: Ilan Elias <ilane@ti.com> Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-05-15 17:27:59 -04:00
John W. Linville	6037463148	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2012-05-15 16:38:00 -04:00
David S. Miller	bc9b35ad41	xfrm: Convert several xfrm policy match functions to bool. xfrm_selector_match xfrm_sec_ctx_match __xfrm4_selector_match __xfrm6_selector_match Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-15 15:04:57 -04:00
Eric Dumazet	6ff272c9ad	codel: use u16 field instead of 31bits for rec_inv_sqrt David pointed out gcc might generate poor code with 31bit fields. Using u16 is more than enough and permits a better code output. Also make the code intent more readable using constants, fixed point arithmetic not being trivial for everybody. Suggested-by: David Miller <davem@davemloft.net> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-14 18:32:56 -04:00
David S. Miller	c597f6653d	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next	2012-05-14 18:00:48 -04:00
Gustavo Padovan	a7d7723ae7	Bluetooth: notify userspace of security level change It fixes L2CAP socket based security level elevation during a connection. The HID profile needs this (for keyboards) and it is the only way to achieve the security level elevation when using the management interface to talk to the kernel (hence the management enabling patch being the one that exposes this issue). It enables the userspace a security level change when the socket is already connected and create a way to notify the socket the result of the request. At the moment of the request the socket is made non writable, if the request fails the connections closes, otherwise the socket is made writable again, POLL_OUT is emmited. Signed-off-by: Gustavo Padovan <gustavo@padovan.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-05-14 13:51:25 -04:00
Eric Dumazet	536edd6710	codel: use Newton method instead of sqrt() and divides As Van pointed out, interval/sqrt(count) can be implemented using multiplies only. http://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Iterative_methods_for_reciprocal_square_roots This patch implements the Newton method and reciprocal divide. Total cost is 15 cycles instead of 120 on my Corei5 machine (64bit kernel). There is a small 'error' for count values < 5, but we don't really care. I reuse a hole in struct codel_vars : - pack the dropping boolean into one bit - use 31bit to store the reciprocal value of sqrt(count). Suggested-by: Van Jacobson <van@pollere.net> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Dave Taht <dave.taht@bufferbloat.net> Cc: Kathleen Nichols <nichols@pollere.com> Cc: Tom Herbert <therbert@google.com> Cc: Matt Mathis <mattmathis@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-12 15:50:49 -04:00
Eric Dumazet	76e3cc126b	codel: Controlled Delay AQM An implementation of CoDel AQM, from Kathleen Nichols and Van Jacobson. http://queue.acm.org/detail.cfm?id=2209336 This AQM main input is no longer queue size in bytes or packets, but the delay packets stay in (FIFO) queue. As we don't have infinite memory, we still can drop packets in enqueue() in case of massive load, but mean of CoDel is to drop packets in dequeue(), using a control law based on two simple parameters : target : target sojourn time (default 5ms) interval : width of moving time window (default 100ms) Based on initial work from Dave Taht. Refactored to help future codel inclusion as a plugin for other linux qdisc (FQ_CODEL, ...), like RED. include/net/codel.h contains codel algorithm as close as possible than Kathleen reference. net/sched/sch_codel.c contains the linux qdisc specific glue. Separate structures permit a memory efficient implementation of fq_codel (to be sent as a separate work) : Each flow has its own struct codel_vars. timestamps are taken at enqueue() time with 1024 ns precision, allowing a range of 2199 seconds in queue, and 100Gb links support. iproute2 uses usec as base unit. Selected packets are dropped, unless ECN is enabled and packets can get ECN mark instead. Tested from 2Mb to 10Gb speeds with no particular problems, on ixgbe and tg3 drivers (BQL enabled). Usage: tc qdisc ... codel [ limit PACKETS ] [ target TIME ] [ interval TIME ] [ ecn ] qdisc codel 10: parent 1:1 limit 2000p target 3.0ms interval 60.0ms ecn Sent 13347099587 bytes 8815805 pkt (dropped 0, overlimits 0 requeues 0) rate 202365Kbit 16708pps backlog 113550b 75p requeues 0 count 116 lastcount 98 ldelay 4.3ms dropping drop_next 816us maxpacket 1514 ecn_mark 84399 drop_overlimit 0 CoDel must be seen as a base module, and should be used keeping in mind there is still a FIFO queue. So a typical setup will probably need a hierarchy of several qdiscs and packet classifiers to be able to meet whatever constraints a user might have. One possible example would be to use fq_codel, which combines Fair Queueing and CoDel, in replacement of sfq / sfq_red. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Dave Taht <dave.taht@bufferbloat.net> Cc: Kathleen Nichols <nichols@pollere.com> Cc: Van Jacobson <van@pollere.net> Cc: Tom Herbert <therbert@google.com> Cc: Matt Mathis <mattmathis@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-10 23:35:02 -04:00
Pavel Emelyanov	292e8d8c85	tcp: Move rcvq sending to tcp_input.c It actually works on the input queue and will use its read mem routines, thus it's better to have in in the tcp_input.c file. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-10 23:24:35 -04:00
Nicolas Dichtel	e0268868ba	sctp: check cached dst before using it dst_check() will take care of SA (and obsolete field), hence IPsec rekeying scenario is taken into account. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Vlad Yaseivch <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-10 23:15:47 -04:00
Mat Martineau	94122bbe9c	Bluetooth: Refactor L2CAP ERTM and streaming transmit segmentation Use more common code for ERTM and streaming mode segmentation and transmission, and begin using skb control block data for delaying extended or enhanced header generation until just before the packet is transmitted. This code is also better suited for resegmentation, which is needed when L2CAP links are reconfigured after an AMP channel move. Signed-off-by: Mat Martineau <mathewm@codeaurora.org> Reviewed-by: Ulisses Furquim <ulisses@profusion.mobi> Signed-off-by: Gustavo Padovan <gustavo@padovan.org>	2012-05-09 01:40:53 -03:00
Marcel Holtmann	9d42820f37	Bluetooth: Enable Low Energy support by default The Bluetooth Low Energy support so far was disabled by default via a module parameter. With this change the module parameter will be removed and Low Energy is enabled by default. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo@padovan.org>	2012-05-09 01:40:52 -03:00
Syam Sidhardhan	2ee8ce35b1	Bluetooth: Remove unused hci_le_ltk_neg_reply() No one is using hci_le_ltk_neg_reply() in bluetooth subsystem. Signed-off-by: Syam Sidhardhan <s.syam@samsung.com> Signed-off-by: Gustavo Padovan <gustavo@padovan.org>	2012-05-09 01:40:51 -03:00
Syam Sidhardhan	e10b9969f2	Bluetooth: Remove unused hci_le_ltk_reply() In this API, we were using sizeof operator for an array given as function argument, which is invalid. However this API is not used anywhere. Signed-off-by: Syam Sidhardhan <s.syam@samsung.com> Signed-off-by: Gustavo Padovan <gustavo@padovan.org>	2012-05-09 01:40:50 -03:00
Mat Martineau	3ce3514f5d	Bluetooth: Remove duplicate structure members from bt_skb_cb These values are now in the nested l2cap_ctrl struct. Signed-off-by: Mat Martineau <mathewm@codeaurora.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo@padovan.org>	2012-05-09 01:40:47 -03:00
Mat Martineau	5a364bd399	Bluetooth: Improve ERTM sequence number offset calculation Instead of using modular division, the offset can be calculated using only addition and subtraction. The previous calculation did not work as intended and was more difficult to understand, involving unsigned integer underflow and a check for a negative value where one was not possible. Signed-off-by: Mat Martineau <mathewm@codeaurora.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo@padovan.org>	2012-05-09 01:40:46 -03:00
Andre Guedes	479453d5fe	Bluetooth: Remove advertising cache User-space pass the remote device address type to kernel through struct sockaddr_l2 what makes the advertising useless. This patch removes all advertising cache code. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-05-09 01:40:46 -03:00
Andre Guedes	8e9f98921c	Bluetooth: Use address type info from user-space In order to establish a LE connection we need the address type information. User-space already pass this information to kernel through struct sockaddr_l2. This patch adds the dst_type parameter to l2cap_chan_connect so we are able to pass the address type info from user-space down to hci_conn layer. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-05-09 01:40:45 -03:00
Andre Guedes	b12f62cfd9	Bluetooth: Add dst_type parameter to hci_connect This patch adds the dst_type parameter to hci_connect function. Instead of searching the address type in advertising cache, we use the dst_type parameter to establish LE connections. The dst_type is ignored for BR/EDR connection establishment. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-05-09 01:40:45 -03:00
Andre Guedes	31f7956c66	Bluetooth: Move bdaddr_to_le to hci_core This patch moves the helper function bdaddr_to_le to hci_core, so it can be used in mgmt.c and hci_conn.c. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-05-09 01:40:44 -03:00
Andre Guedes	43ef0b8b8d	Bluetooth: Add address type to struct sockaddr_l2 This patch adds the address type info to struct sockaddr_l2 so user-space can inform the remote device address type required to establish LE connections. Soon, instead of looking the advertising cache up to discover the address type, we'll use this address type info to establish LE connections. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-05-09 01:40:43 -03:00
Andre Guedes	591f47f31b	Bluetooth: Move address type macros to bluetooth.h This patch moves address type macros to bluetooth.h since they will be used by management interface and Bluetooth socket interface. It also replaces the macro prefix MGMT_ADDR_ by BDADDR_. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-05-09 01:40:42 -03:00
Andrei Emeltchenko	2bbf2968e5	Bluetooth: trivial: Remove empty line Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-05-09 01:40:42 -03:00
Syam Sidhardhan	e47872209d	Bluetooth: Remove strtoba header declared but not defined No one is using strtoba() in the bluetooth subsystem. Signed-off-by: Syam Sidhardhan <s.syam@samsung.com> Signed-off-by: Gustavo Padovan <gustavo@padovan.org>	2012-05-09 01:40:34 -03:00
Syam Sidhardhan	270ca16bc7	Bluetooth: remove header declared but not defined hci_del_off_timer() doesn't exist anymore. Signed-off-by: Syam Sidhardhan <s.syam@samsung.com> Signed-off-by: Gustavo Padovan <gustavo@padovan.org>	2012-05-09 01:40:34 -03:00
Mat Martineau	3c588192b5	Bluetooth: Add the l2cap_seq_list structure for tracking frames A sequence list is a data structure used to track frames that need to be retransmitted, and frames that have been requested for retransmission by the remote device. It can compactly represent a list of sequence numbers within the ERTM transmit window. Memory for the list is allocated once at connection time, and common operations in ERTM are O(1). Signed-off-by: Mat Martineau <mathewm@codeaurora.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2012-05-09 01:40:30 -03:00
Gustavo Padovan	9033894722	Bluetooth: Remove err parameter from alloc_skb() Use ERR_PTR maginc instead. Signed-off-by: Gustavo Padovan <gustavo@padovan.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2012-05-09 01:40:26 -03:00
Andrei Emeltchenko	bd4b165312	Bluetooth: Adds set_default function in L2CAP setup Some parameters in L2CAP chan are set to default similar way in socket based channels and A2MP channels. Adds common function which sets all defaults. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo Padovan <gustavo@padovan.org>	2012-05-09 00:41:39 -03:00
Andre Guedes	0ed09148fa	Bluetooth: Remove MGMT_ADDR_INVALID macro This patch removes the MGMT_ADDR_INVALID macro. If the address type isn't LE, we consider it is BR/EDR type. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2012-05-09 00:41:37 -03:00
Gustavo Padovan	eef1d9b668	Bluetooth: Remove sk parameter from l2cap_chan_create() Following the separation if core and sock code this change avoid manipulation of sk inside l2cap_chan_create(). Signed-off-by: Gustavo Padovan <gustavo@padovan.org>	2012-05-09 00:41:36 -03:00
Mat Martineau	00e3112c5a	Bluetooth: Add a structure to carry ERTM data in skb control blocks Every field from ERTM control headers is now carried in the control block so it only has to be parsed or generated once, and can be efficiently accessed throughout the ERTM code. Signed-off-by: Mat Martineau <mathewm@codeaurora.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo@padovan.org>	2012-05-09 00:41:35 -03:00
Mat Martineau	d5f7ac3810	Bluetooth: Add definitions and struct members for new ERTM state machine Adds some missing values for control field parsing, additional data for the new state machine, and enumerations for states, incoming packet classification, and state machine events. Signed-off-by: Mat Martineau <mathewm@codeaurora.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo@padovan.org>	2012-05-09 00:41:35 -03:00
Andrei Emeltchenko	6f74b6f36f	Bluetooth: Comments and style fixes Add comments to timer implementation and style fixes. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo@padovan.org>	2012-05-09 00:41:35 -03:00
Andre Guedes	21693c15c0	Bluetooth: Add HCI_PERIODIC_INQ to dev_flags This patch adds the HCI_PERIODIC_INQ flag to dev_flags. This flag tracks if periodic inquiry is enabled or not. Signed-off-by: Andre Guedes <aguedespe@gmail.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo@padovan.org>	2012-05-09 00:41:35 -03:00
Andre Guedes	79d6e068be	Bluetooth: Add Periodic Inquiry command complete handler This patch adds a handler function to Periodic Inquiry command complete event. Signed-off-by: Andre Guedes <aguedespe@gmail.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Gustavo Padovan <gustavo@padovan.org>	2012-05-09 00:41:35 -03:00
Andre Guedes	7dbfac1d72	Bluetooth: Add hci_cancel_le_scan() to hci_core This patch adds to hci_core the hci_cancel_le_scan function which should be used to cancel an ongoing LE scan. Signed-off-by: Andre Guedes <andre.guedes@openbossa.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2012-05-09 00:41:32 -03:00
Andrei Emeltchenko	58115373e7	Bluetooth: Correct ediv in SMP ediv is already in little endian order. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2012-05-09 00:41:30 -03:00
Marcel Holtmann	cdbaccca73	Bluetooth: Add management command for setting Device ID The Device ID details need to be programmed into the kernel for every controller at least once. So provide management command for this. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2012-05-09 00:41:30 -03:00
Marcel Holtmann	2b9be137b7	Bluetooth: Handle EIR tags for Device ID The Device ID information can be provided via Extended Inquiry Data as well. If a valid source is present, then include it. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2012-05-09 00:41:30 -03:00
Marcel Holtmann	91c4e9b1ac	Bluetooth: Add TX power tag to EIR data The Inquiry Response TX power tag should be added to the Extended Inquiry Data (EIR) as well. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2012-05-09 00:41:30 -03:00
David Herrmann	6935e0f518	Bluetooth: Remove redundant hdev->parent field We initialize the "struct device" in hci_alloc_dev() for a long time now so we can access hdev->dev.parent directly. Hence, we can drop the temporary field hdev->parent which is used in no other place than hci_add_sysfs(). SET_HCIDEV_DEV() is never called after registering a device by the drivers so we do not overwrite internal device-state. Furthermore, hdev->dev is initialized to 0 by kzalloc() inside hci_alloc_dev() so the default behavior with dev.parent = NULL is kept. Signed-off-by: David Herrmann <dh.herrmann@googlemail.com> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2012-05-09 00:41:30 -03:00
Andrei Emeltchenko	9a00665792	Bluetooth: Correct type for ediv to __le16 Correct type warnings reported by sparse to show that this functions takes ediv argument in __le16 format. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>	2012-05-09 00:41:29 -03:00
Andrei Emeltchenko	7d69230c43	Bluetooth: Correct type for hdev lmp_subver Keep lmp_subver in host byte order. We have following conversion in hci_cc_read_local_version: hdev->lmp_subver = __le16_to_cpu(rp->lmp_subver); Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>	2012-05-09 00:41:28 -03:00
Ashok Nagarajan	70c33eaae7	{nl,cfg,mac}80211: Allow user to see/configure HT protection mode This patch introduces a new mesh configuration parameter "ht_opmode" and will allow user to check the current HT protection mode selected. Users could configure the protection mode by the command "iw mesh_iface set mesh_param mesh_ht_protection_mode=2". The default protection mode of mesh is set to non-HT mixed mode. Signed-off-by: Ashok Nagarajan <ashok@cozybit.com> Reviewed-by: Thomas Pedersen <thomas@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-05-08 21:53:59 -04:00
Ben Greear	e352114fd6	mac80211: Framework to get wifi-driver stats via ethtool. This adds hooks to call into the driver to get additional stats for the ethtool API. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-05-08 21:53:51 -04:00
Ben Greear	d61992182e	cfg80211: Add framework to support ethtool stats. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2012-05-08 21:53:49 -04:00
Pablo Neira Ayuso	f73181c828	ipvs: add support for sync threads Allow master and backup servers to use many threads for sync traffic. Add sysctl var "sync_ports" to define the number of threads. Every thread will use single UDP port, thread 0 will use the default port 8848 while last thread will use port 8848+sync_ports-1. The sync traffic for connections is scheduled to many master threads based on the cp address but one connection is always assigned to same thread to avoid reordering of the sync messages. Remove ip_vs_sync_switch_mode because this check for sync mode change is still risky. Instead, check for mode change under sync_buff_lock. Make sure the backup socks do not block on reading. Special thanks to Aleksey Chudov for helping in all tests. Signed-off-by: Julian Anastasov <ja@ssi.bg> Tested-by: Aleksey Chudov <aleksey.chudov@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au>	2012-05-08 19:40:33 +02:00
Julian Anastasov	749c42b620	ipvs: reduce sync rate with time thresholds Add two new sysctl vars to control the sync rate with the main idea to reduce the rate for connection templates because currently it depends on the packet rate for controlled connections. This mechanism should be useful also for normal connections with high traffic. sync_refresh_period: in seconds, difference in reported connection timer that triggers new sync message. It can be used to avoid sync messages for the specified period (or half of the connection timeout if it is lower) if connection state is not changed from last sync. sync_retries: integer, 0..3, defines sync retries with period of sync_refresh_period/8. Useful to protect against loss of sync messages. Allow sysctl_sync_threshold to be used with sysctl_sync_period=0, so that only single sync message is sent if sync_refresh_period is also 0. Add new field "sync_endtime" in connection structure to hold the reported time when connection expires. The 2 lowest bits will represent the retry count. As the sysctl_sync_period now can be 0 use ACCESS_ONCE to avoid division by zero. Special thanks to Aleksey Chudov for being patient with me, for his extensive reports and helping in all tests. Signed-off-by: Julian Anastasov <ja@ssi.bg> Tested-by: Aleksey Chudov <aleksey.chudov@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au>	2012-05-08 19:40:10 +02:00
Pablo Neira Ayuso	1c003b1580	ipvs: wakeup master thread High rate of sync messages in master can lead to overflowing the socket buffer and dropping the messages. Fixed sleep of 1 second without wakeup events is not suitable for loaded masters, Use delayed_work to schedule sending for queued messages and limit the delay to IPVS_SYNC_SEND_DELAY (20ms). This will reduce the rate of wakeups but to avoid sending long bursts we wakeup the master thread after IPVS_SYNC_WAKEUP_RATE (8) messages. Add hard limit for the queued messages before sending by using "sync_qlen_max" sysctl var. It defaults to 1/32 of the memory pages but actually represents number of messages. It will protect us from allocating large parts of memory when the sending rate is lower than the queuing rate. As suggested by Pablo, add new sysctl var "sync_sock_size" to configure the SNDBUF (master) or RCVBUF (slave) socket limit. Default value is 0 (preserve system defaults). Change the master thread to detect and block on SNDBUF overflow, so that we do not drop messages when the socket limit is low but the sync_qlen_max limit is not reached. On ENOBUFS or other errors just drop the messages. Change master thread to enter TASK_INTERRUPTIBLE state early, so that we do not miss wakeups due to messages or kthread_should_stop event. Thanks to Pablo Neira Ayuso for his valuable feedback! Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2012-05-08 19:39:53 +02:00
Eric Dumazet	ac3a546ac8	netfilter: nf_conntrack: use this_cpu_inc() this_cpu_inc() is IRQ safe and faster than local_bh_disable()/__this_cpu_inc()/local_bh_enable(), at least on x86. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Christoph Lameter <cl@linux.com> Cc: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-05-08 19:36:33 +02:00
Eric Leblond	a900689264	netfilter: nf_ct_helper: allow to disable automatic helper assignment This patch allows you to disable automatic conntrack helper lookup based on TCP/UDP ports, eg. echo 0 > /proc/sys/net/netfilter/nf_conntrack_helper [ Note: flows that already got a helper will keep using it even if automatic helper assignment has been disabled ] Once this behaviour has been disabled, you have to explicitly use the iptables CT target to attach helper to flows. There are good reasons to stop supporting automatic helper assignment, for further information, please read: http://www.netfilter.org/news.html#2012-04-03 This patch also adds one message to inform that automatic helper assignment is deprecated and it will be removed soon (this is spotted only once, with the first flow that gets a helper attached to make it as less annoying as possible). Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2012-05-08 19:35:18 +02:00

... 8 9 10 11 12 ...

6028 Commits