linux

Author	SHA1	Message	Date
John Fastabend	b0ab6f9275	net: sched: enable per cpu qstats After previous patches to simplify qstats the qstats can be made per cpu with a packed union in Qdisc struct. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-30 01:02:26 -04:00
John Fastabend	6401585366	net: sched: restrict use of qstats qlen This removes the use of qstats->qlen variable from the classifiers and makes it an explicit argument to gnet_stats_copy_queue(). The qlen represents the qdisc queue length and is packed into the qstats at the last moment before passnig to user space. By handling it explicitely we avoid, in the percpu stats case, having to figure out which per_cpu variable to put it in. It would probably be best to remove it from qstats completely but qstats is a user space ABI and can't be broken. A future patch could make an internal only qstats structure that would avoid having to allocate an additional u32 variable on the Qdisc struct. This would make the qstats struct 128bits instead of 128+32. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-30 01:02:26 -04:00
John Fastabend	25331d6ce4	net: sched: implement qstat helper routines This adds helpers to manipulate qstats logic and replaces locations that touch the counters directly. This simplifies future patches to push qstats onto per cpu counters. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-30 01:02:26 -04:00
John Fastabend	22e0f8b932	net: sched: make bstats per cpu and estimator RCU safe In order to run qdisc's without locking statistics and estimators need to be handled correctly. To resolve bstats make the statistics per cpu. And because this is only needed for qdiscs that are running without locks which is not the case for most qdiscs in the near future only create percpu stats when qdiscs set the TCQ_F_CPUSTATS flag. Next because estimators use the bstats to calculate packets per second and bytes per second the estimator code paths are updated to use the per cpu statistics. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-30 01:02:26 -04:00
David S. Miller	852248449c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== pull request: netfilter/ipvs updates for net-next The following patchset contains Netfilter/IPVS updates for net-next, most relevantly they are: 1) Four patches to make the new nf_tables masquerading support independent of the x_tables infrastructure. This also resolves a compilation breakage if the masquerade target is disabled but the nf_tables masq expression is enabled. 2) ipset updates via Jozsef Kadlecsik. This includes the addition of the skbinfo extension that allows you to store packet metainformation in the elements. This can be used to fetch and restore this to the packets through the iptables SET target, patches from Anton Danilov. 3) Add the hash:mac set type to ipset, from Jozsef Kadlecsick. 4) Add simple weighted fail-over scheduler via Simon Horman. This provides a fail-over IPVS scheduler (unlike existing load balancing schedulers). Connections are directed to the appropriate server based solely on highest weight value and server availability, patch from Kenny Mathis. 5) Support IPv6 real servers in IPv4 virtual-services and vice versa. Simon Horman informs that the motivation for this is to allow more flexibility in the choice of IP version offered by both virtual-servers and real-servers as they no longer need to match: An IPv4 connection from an end-user may be forwarded to a real-server using IPv6 and vice versa. No ip_vs_sync support yet though. Patches from Alex Gartrell and Julian Anastasov. 6) Add global generation ID to the nf_tables ruleset. When dumping from several different object lists, we need a way to identify that an update has ocurred so userspace knows that it needs to refresh its lists. This also includes a new command to obtain the 32-bits generation ID. The less significant 16-bits of this ID is also exposed through res_id field in the nfnetlink header to quickly detect the interference and retry when there is no risk of ID wraparound. 7) Move br_netfilter out of the bridge core. The br_netfilter code is built in the bridge core by default. This causes problems of different kind to people that don't want this: Jesper reported performance drop due to the inconditional hook registration and I remember to have read complains on netdev from people regarding the unexpected behaviour of our bridging stack when br_netfilter is enabled (fragmentation handling, layer 3 and upper inspection). People that still need this should easily undo the damage by modprobing the new br_netfilter module. 8) Dump the set policy nf_tables that allows set parameterization. So userspace can keep user-defined preferences when saving the ruleset. From Arturo Borrero. 9) Use __seq_open_private() helper function to reduce boiler plate code in x_tables, From Rob Jones. 10) Safer default behaviour in case that you forget to load the protocol tracker. Daniel Borkmann and Florian Westphal detected that if your ruleset is stateful, you allow traffic to at least one single SCTP port and the SCTP protocol tracker is not loaded, then any SCTP traffic may be pass through unfiltered. After this patch, the connection tracking classifies SCTP/DCCP/UDPlite/GRE packets as invalid if your kernel has been compiled with support for these modules. ==================== Trivially resolved conflict in include/linux/skbuff.h, Eric moved some netfilter skbuff members around, and the netfilter tree adjusted the ifdef guards for the bridging info pointer. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 14:46:53 -04:00
Florian Westphal	d82bd12298	tcp: move TCP_ECN_create_request out of header After Octavian Purdilas tcp ipv4/ipv6 unification work this helper only has a single callsite. While at it, convert name to lowercase, suggested by Stephen. Suggested-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 14:41:22 -04:00
Arturo Borrero	9363dc4b59	netfilter: nf_tables: store and dump set policy We want to know in which cases the user explicitly sets the policy options. In that case, we also want to dump back the info. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-29 11:28:03 +02:00
Florian Westphal	9890092e46	net: tcp: more detailed ACK events and events for CE marked packets DataCenter TCP (DCTCP) determines cwnd growth based on ECN information and ACK properties, e.g. ACK that updates window is treated differently than DUPACK. Also DCTCP needs information whether ACK was delayed ACK. Furthermore, DCTCP also implements a CE state machine that keeps track of CE markings of incoming packets. Therefore, extend the congestion control framework to provide these event types, so that DCTCP can be properly implemented as a normal congestion algorithm module outside of the core stack. Joint work with Daniel Borkmann and Glenn Judd. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Glenn Judd <glenn.judd@morganstanley.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 00:13:10 -04:00
Florian Westphal	7354c8c389	net: tcp: split ack slow/fast events from cwnd_event The congestion control ops "cwnd_event" currently supports CA_EVENT_FAST_ACK and CA_EVENT_SLOW_ACK events (among others). Both FAST and SLOW_ACK are only used by Westwood congestion control algorithm. This removes both flags from cwnd_event and adds a new in_ack_event callback for this. The goal is to be able to provide more detailed information about ACKs, such as whether ECE flag was set, or whether the ACK resulted in a window update. It is required for DataCenter TCP (DCTCP) congestion control algorithm as it makes a different choice depending on ECE being set or not. Joint work with Daniel Borkmann and Glenn Judd. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Glenn Judd <glenn.judd@morganstanley.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 00:13:10 -04:00
Daniel Borkmann	30e502a34b	net: tcp: add flag for ca to indicate that ECN is required This patch adds a flag to TCP congestion algorithms that allows for requesting to mark IPv4/IPv6 sockets with transport as ECN capable, that is, ECT(0), when required by a congestion algorithm. It is currently used and needed in DataCenter TCP (DCTCP), as it requires both peers to assert ECT on all IP packets sent - it uses ECN feedback (i.e. CE, Congestion Encountered information) from switches inside the data center to derive feedback to the end hosts. Therefore, simply add a new flag to icsk_ca_ops. Note that DCTCP's algorithm/behaviour slightly diverges from RFC3168, therefore this is only (!) enabled iff the assigned congestion control ops module has requested this. By that, we can tightly couple this logic really only to the provided congestion control ops. Joint work with Florian Westphal and Glenn Judd. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Glenn Judd <glenn.judd@morganstanley.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 00:13:10 -04:00
Florian Westphal	55d8694fa8	net: tcp: assign tcp cong_ops when tcp sk is created Split assignment and initialization from one into two functions. This is required by followup patches that add Datacenter TCP (DCTCP) congestion control algorithm - we need to be able to determine if the connection is moderated by DCTCP before the 3WHS has finished. As we walk the available congestion control list during the assignment, we are always guaranteed to have Reno present as it's fixed compiled-in. Therefore, since we're doing the early assignment, we don't have a real use for the Reno alias tcp_init_congestion_ops anymore and can thus remove it. Actual usage of the congestion control operations are being made after the 3WHS has finished, in some cases however we can access get_info() via diag if implemented, therefore we need to zero out the private area for those modules. Joint work with Daniel Borkmann and Glenn Judd. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Glenn Judd <glenn.judd@morganstanley.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-29 00:13:10 -04:00
WANG Cong	18d0264f63	net_sched: remove the first parameter from tcf_exts_destroy() Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <hadi@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 17:29:01 -04:00
David S. Miller	f5c7e1a47a	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2014-09-25 1) Remove useless hash_resize_mutex in xfrm_hash_resize(). This mutex is used only there, but xfrm_hash_resize() can't be called concurrently at all. From Ying Xue. 2) Extend policy hashing to prefixed policies based on prefix lenght thresholds. From Christophe Gouault. 3) Make the policy hash table thresholds configurable via netlink. From Christophe Gouault. 4) Remove the maximum authentication length for AH. This was needed to limit stack usage. We switched already to allocate space, so no need to keep the limit. From Herbert Xu. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 17:19:15 -04:00
Florian Fainelli	7905288f09	net: dsa: allow switches driver to implement get/set EEE Allow switches driver to query and enable/disable EEE on a per-port basis by implementing the ethtool_{get,set}_eee settings and delegating these operations to the switch driver. set_eee() will need to coordinate with the PHY driver to make sure that EEE is enabled, the link-partner supports it and the auto-negotiation result is satisfactory. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 17:14:09 -04:00
Florian Fainelli	b2f2af21e3	net: dsa: allow enabling and disable switch ports Whenever a per-port network device is used/unused, invoke the switch driver port_enable/port_disable callbacks to allow saving as much power as possible by disabling unused parts of the switch (RX/TX logic, memory arrays, PHYs...). We supply a PHY device argument to make sure the switch driver can act on the PHY device if needed (like putting/taking the PHY out of deep low power mode). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 17:14:08 -04:00
Eric Dumazet	cd7d8498c9	tcp: change tcp_skb_pcount() location Our goal is to access no more than one cache line access per skb in a write or receive queue when doing the various walks. After recent TCP_SKB_CB() reorganizations, it is almost done. Last part is tcp_skb_pcount() which currently uses skb_shinfo(skb)->gso_segs, which is a terrible choice, because it needs 3 cache lines in current kernel (skb->head, skb->end, and shinfo->gso_segs are all in 3 different cache lines, far from skb->cb) This very simple patch reuses space currently taken by tcp_tw_isn only in input path, as tcp_skb_pcount is only needed for skb stored in write queue. This considerably speeds up tcp_ack(), granted we avoid shinfo->tx_flags to get SKBTX_ACK_TSTAMP, which seems possible. This also speeds up all sack processing in general. This speeds up tcp_sendmsg() because it no longer has to access/dirty shinfo. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 16:36:48 -04:00
Eric Dumazet	971f10eca1	tcp: better TCP_SKB_CB layout to reduce cache line misses TCP maintains lists of skb in write queue, and in receive queues (in order and out of order queues) Scanning these lists both in input and output path usually requires access to skb->next, TCP_SKB_CB(skb)->seq, and TCP_SKB_CB(skb)->end_seq These fields are currently in two different cache lines, meaning we waste lot of memory bandwidth when these queues are big and flows have either packet drops or packet reorders. We can move TCP_SKB_CB(skb)->header at the end of TCP_SKB_CB, because this header is not used in fast path. This allows TCP to search much faster in the skb lists. Even with regular flows, we save one cache line miss in fast path. Thanks to Christoph Paasch for noticing we need to cleanup skb->cb[] (IPCB/IP6CB) before entering IP stack in tx path, and that I forgot IPCB use in tcp_v4_hnd_req() and tcp_v4_save_options(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 16:35:43 -04:00
Eric Dumazet	a224772db8	ipv6: add a struct inet6_skb_parm param to ipv6_opt_accepted() ipv6_opt_accepted() assumes IP6CB(skb) holds the struct inet6_skb_parm that it needs. Lets not assume this, as TCP stack might use a different place. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 16:35:43 -04:00
Eric Dumazet	24a2d43d88	ipv4: rename ip_options_echo to __ip_options_echo() ip_options_echo() assumes struct ip_options is provided in &IPCB(skb)->opt Lets break this assumption, but provide a helper to not change all call points. ip_send_unicast_reply() gets a new struct ip_options pointer. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-28 16:35:42 -04:00
Dan Williams	3f33407856	net: make tcp_cleanup_rbuf private net_dma was the only external user so this can become local to tcp.c again. Cc: James Morris <jmorris@namei.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2014-09-28 07:22:21 -07:00
Dan Williams	7bced39751	net_dma: simple removal Per commit "77873803363c net_dma: mark broken" net_dma is no longer used and there is no plan to fix it. This is the mechanical removal of bits in CONFIG_NET_DMA ifdef guards. Reverting the remainder of the net_dma induced changes is deferred to subsequent patches. Marked for stable due to Roman's report of a memory leak in dma_pin_iovec_pages(): https://lkml.org/lkml/2014/9/3/177 Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vinod Koul <vinod.koul@intel.com> Cc: David Whipple <whipple@securedatainnovations.ch> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Cc: <stable@vger.kernel.org> Reported-by: Roman Gushchin <klamm@yandex-team.ru> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2014-09-28 07:05:16 -07:00
LEROY Christophe	4565af0d40	net: optimise csum_replace4() csum_partial() is a generic function which is not optimised for small fixed length calculations, and its use requires to store "from" and "to" values in memory while we already have them available in registers. This also has impact, especially on RISC processors. In the same spirit as the change done by Eric Dumazet on csum_replace2(), this patch rewrites inet_proto_csum_replace4() taking into account RFC1624. I spotted during a NATted tcp transfert that csum_partial() is one of top 5 consuming functions (around 8%), and the second user of csum_partial() is inet_proto_csum_replace4(). I have proposed the same modification to inet_proto_csum_replace4() in another patch. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-26 16:14:16 -04:00
David S. Miller	57219dc7bf	Merge tag 'master-2014-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== pull request: wireless-next 2014-09-22 Please pull this batch of updates intended for the 3.18 stream... For the mac80211 bits, Johannes says: "This time, I have some rate minstrel improvements, support for a very small feature from CCX that Steinar reverse-engineered, dynamic ACK timeout support, a number of changes for TDLS, early support for radio resource measurement and many fixes. Also, I'm changing a number of places to clear key memory when it's freed and Intel claims copyright for code they developed." For the bluetooth bits, Johan says: "Here are some more patches intended for 3.18. Most of them are cleanups or fixes for SMP. The only exception is a fix for BR/EDR L2CAP fixed channels which should now work better together with the L2CAP information request procedure." For the iwlwifi bits, Emmanuel says: "I fix here dvm which was broken by my last pull request. Arik continues to work on TDLS and Luca solved a few issues in CT-Kill. Eyal keeps digging into rate scaling code, more to come soon. Besides this, nothing really special here." Beyond that, there are the usual big batches of updates to ath9k, b43, mwifiex, and wil6210 as well as a handful of other bits here and there. Also, rtlwifi gets some btcoexist attention from Larry. Please let me know if there are problems! ==================== Had to adjust the wil6210 code to comply with Joe Perches's recent change in net-next to make the netdev_*() routines return void instead of 'int'. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-26 15:39:24 -04:00
John W. Linville	30d3c071a6	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2014-09-26 13:38:51 -04:00
John W. Linville	330bd4ec9d	NFC: 3.18 pull request This is the NFC pull request for 3.18. We've had major updates for TI and ST Microelectronics drivers: For TI's trf7970a driver: - Target mode support for trf7970a - Suspend/resume support for trf7970a - DT properties additions to handle different quirks - A bunch of fixes for smartphone IOP related issues For ST Microelectronics' ST21NFCA and ST21NFCB drivers: - ISO15693 support for st21nfcb - checkpatch and sparse related warning fixes - Code cleanups and a few minor fixes Finally, Marvell add ISO15693 support to the NCI stack, together with a couple of NCI fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUIg4oAAoJEIqAPN1PVmxKg+kP/RPrgH6LA1tFubIwKR2+sGQ7 g/W2J3AE8QASZkErkpXRNCt2D/SPWEKBY/qscwz+BtcWg76taIIaGTvVUNtxSaW3 gS4hG6V1UlANWv3KFfaOKmzjEOO/SPNtkFAyI0cTOaGyUqG4o9BgBZpn1rYO16MD ZkSC39MpjMoXB9BbsfQngoUEoWc3tZNMmRzk4IVTwE/wXuQvZxmFQXEAiZ+pnYle NQfugaGMz0526rLG3QnrpkUakFb81iQwtONpbx6i8KW/Klkc6TN/ek6J9ecU8t5z tdHOViZWRmA1VwMGBHwpq8F2o/ATH6GeivTgqrQjcjGNhCUUT1Ulzve2UxGEMWi6 ncjKY/GxUrYaMMtRvLv+/knrfbWtd+EnWOav07jgNrrA0tvgBNQvEKKHPoWykDVN QKpxu3YoNxrsR/LJMS+Zjj0IIM1Y+9DTOkLXzxJ5Hvht8rOl5heYGh2DICOpWsbQ ejrQicJOJvN5vqu+Sgcqq4msyTEdbs2LfRDrW1VC9A6ILI+KzYg2laTFGMnhZ5qn TgsYIDdONS2iGUulFHylGHI7ANtUg/mhklLUccY1HQYyiAM1NQUtzq1tAz6yLoIH l8iIiyzJSBWW57nWhyrULEbzHgPE+bHIjO4T+UUOxMgquYa4V11S1uP0OfWfZogR xS24GlobS2oXHMQqh0fA =d/C+ -----END PGP SIGNATURE----- Merge tag 'nfc-next-3.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next Samuel Ortiz <sameo@linux.intel.com> says: "NFC: 3.18 pull request This is the NFC pull request for 3.18. We've had major updates for TI and ST Microelectronics drivers: For TI's trf7970a driver: - Target mode support for trf7970a - Suspend/resume support for trf7970a - DT properties additions to handle different quirks - A bunch of fixes for smartphone IOP related issues For ST Microelectronics' ST21NFCA and ST21NFCB drivers: - ISO15693 support for st21nfcb - checkpatch and sparse related warning fixes - Code cleanups and a few minor fixes Finally, Marvell add ISO15693 support to the NCI stack, together with a couple of NCI fixes." Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-09-26 13:37:02 -04:00
Pablo Neira Ayuso	34666d467c	netfilter: bridge: move br_netfilter out of the core Jesper reported that br_netfilter always registers the hooks since this is part of the bridge core. This harms performance for people that don't need this. This patch modularizes br_netfilter so it can be rmmod'ed, thus, the hooks can be unregistered. I think the bridge netfilter should have been a separated module since the beginning, Patrick agreed on that. Note that this is breaking compatibility for users that expect that bridge netfilter is going to be available after explicitly 'modprobe bridge' or via automatic load through brctl. However, the damage can be easily undone by modprobing br_netfilter. The bridge core also spots a message to provide a clue to people that didn't notice that this has been deprecated. On top of that, the plan is that nftables will not rely on this software layer, but integrate the connection tracking into the bridge layer to enable stateful filtering and NAT, which is was bridge netfilter users seem to require. This patch still keeps the fake_dst_ops in the bridge core, since this is required by when the bridge port is initialized. So we can safely modprobe/rmmod br_netfilter anytime. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Florian Westphal <fw@strlen.de>	2014-09-26 18:42:31 +02:00
Tejun Heo	d06efebf0c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block into for-3.18 This is to receive `0a30288da1` ("blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe") which implements __percpu_ref_kill_expedited() to work around SCSI blk-mq stall. The commit reverted and patches to implement proper fix will be added. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christoph Hellwig <hch@lst.de>	2014-09-24 13:00:21 -04:00
Johan Hedberg	d41c15cf95	Bluetooth: Fix reason code used for rejecting SCO connections The core specification defines valid values for the HCI_Reject_Synchronous_Connection_Request command to be 0x0D-0x0F. So far the code has been using HCI_ERROR_REMOTE_USER_TERM (0x13) which is not a valid value and is therefore being rejected by some controllers: > HCI Event: Connect Request (0x04) plen 10 bdaddr 40:6F:2A:6A:E5:E0 class 0x000000 type eSCO < HCI Command: Reject Synchronous Connection (0x01\|0x002a) plen 7 bdaddr 40:6F:2A:6A:E5:E0 reason 0x13 Reason: Remote User Terminated Connection > HCI Event: Command Status (0x0f) plen 4 Reject Synchronous Connection (0x01\|0x002a) status 0x12 ncmd 1 Error: Invalid HCI Command Parameters This patch introduces a new define for a value from the valid range (0x0d == Connection Rejected Due To Limited Resources) and uses it instead for rejecting incoming connections. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-24 14:03:32 +02:00
Joe Perches	2b0bf6c85a	Bluetooth: Convert bt_<level> logging functions to return void No caller or macro uses the return value so make all the functions return void. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-24 09:40:08 +02:00
Christophe Ricard	9e87f9a9c4	NFC: nci: Add support for proprietary RF Protocols In NFC Forum NCI specification, some RF Protocol values are reserved for proprietary use (from 0x80 to 0xfe). Some CLF vendor may need to use one value within this range for specific technology. Furthermore, some CLF may not becompliant with NFC Froum NCI specification 2.0 and therefore will not support RF Protocol value 0x06 for PROTOCOL_T5T as mention in a draft specification and in a recent push. Adding get_rf_protocol handle to the nci_ops structure will help to set the correct technology to target. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-09-24 02:02:24 +02:00
Eric Dumazet	4cdf507d54	icmp: add a global rate limitation Current ICMP rate limiting uses inetpeer cache, which is an RBL tree protected by a lock, meaning that hosts can be stuck hard if all cpus want to check ICMP limits. When say a DNS or NTP server process is restarted, inetpeer tree grows quick and machine comes to its knees. iptables can not help because the bottleneck happens before ICMP messages are even cooked and sent. This patch adds a new global limitation, using a token bucket filter, controlled by two new sysctl : icmp_msgs_per_sec - INTEGER Limit maximal number of ICMP packets sent per second from this host. Only messages whose type matches icmp_ratemask are controlled by this limit. Default: 1000 icmp_msgs_burst - INTEGER icmp_msgs_per_sec controls number of ICMP packets sent per second, while icmp_msgs_burst controls the burst size of these packets. Default: 50 Note that if we really want to send millions of ICMP messages per second, we might extend idea and infra added in commit `04ca6973f7` ("ip: make IP identifiers less predictable") : add a token bucket in the ip_idents hash and no longer rely on inetpeer. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-23 12:47:38 -04:00
David S. Miller	1f6d80358d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: arch/mips/net/bpf_jit.c drivers/net/can/flexcan.c Both the flexcan and MIPS bpf_jit conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-23 12:09:27 -04:00
David S. Miller	84de67b298	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2014-09-22 We generate a blackhole or queueing route if a packet matches an IPsec policy but a state can't be resolved. Here we assume that dst_output() is called to kill these packets. Unfortunately this assumption is not true in all cases, so it is possible that these packets leave the system without the necessary transformations. This pull request contains two patches to fix this issue: 1) Fix for blackhole routed packets. 2) Fix for queue routed packets. Both patches are serious stable candidates. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 16:41:41 -04:00
Eric Dumazet	fcdd1cf4dd	tcp: avoid possible arithmetic overflows icsk_rto is a 32bit field, and icsk_backoff can reach 15 by default, or more if some sysctl (eg tcp_retries2) are changed. Better use 64bit to perform icsk_rto << icsk_backoff operations As Joe Perches suggested, add a helper for this. Yuchung spotted the tcp_v4_err() case. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 16:27:10 -04:00
Daniel Borkmann	35f7aa5309	ipv6: mld: answer mldv2 queries with mldv1 reports in mldv1 fallback RFC2710 (MLDv1), section 3.7. says: The length of a received MLD message is computed by taking the IPv6 Payload Length value and subtracting the length of any IPv6 extension headers present between the IPv6 header and the MLD message. If that length is greater than 24 octets, that indicates that there are other fields present beyond the fields described above, perhaps belonging to a future backwards-compatible version of MLD. An implementation of the version of MLD specified in this document MUST NOT send an MLD message longer than 24 octets and MUST ignore anything past the first 24 octets of a received MLD message. RFC3810 (MLDv2), section 8.2.1. states for listeners regarding presence of MLDv1 routers: In order to be compatible with MLDv1 routers, MLDv2 hosts MUST operate in version 1 compatibility mode. [...] When Host Compatibility Mode is MLDv2, a host acts using the MLDv2 protocol on that interface. When Host Compatibility Mode is MLDv1, a host acts in MLDv1 compatibility mode, using only the MLDv1 protocol, on that interface. [...] While section 8.3.1. specifies router behaviour regarding presence of MLDv1 routers: MLDv2 routers may be placed on a network where there is at least one MLDv1 router. The following requirements apply: If an MLDv1 router is present on the link, the Querier MUST use the lowest version of MLD present on the network. This must be administratively assured. Routers that desire to be compatible with MLDv1 MUST have a configuration option to act in MLDv1 mode; if an MLDv1 router is present on the link, the system administrator must explicitly configure all MLDv2 routers to act in MLDv1 mode. When in MLDv1 mode, the Querier MUST send periodic General Queries truncated at the Multicast Address field (i.e., 24 bytes long), and SHOULD also warn about receiving an MLDv2 Query (such warnings must be rate-limited). The Querier MUST also fill in the Maximum Response Delay in the Maximum Response Code field, i.e., the exponential algorithm described in section 5.1.3. is not used. [...] That means that we should not get queries from different versions of MLD. When there's a MLDv1 router present, MLDv2 enforces truncation and MRC == MRD (both fields are overlapping within the 24 octet range). Section 8.3.2. specifies behaviour in the presence of MLDv1 multicast address listeners: MLDv2 routers may be placed on a network where there are hosts that have not yet been upgraded to MLDv2. In order to be compatible with MLDv1 hosts, MLDv2 routers MUST operate in version 1 compatibility mode. MLDv2 routers keep a compatibility mode per multicast address record. The compatibility mode of a multicast address is determined from the Multicast Address Compatibility Mode variable, which can be in one of the two following states: MLDv1 or MLDv2. The Multicast Address Compatibility Mode of a multicast address record is set to MLDv1 whenever an MLDv1 Multicast Listener Report is received for that multicast address. At the same time, the Older Version Host Present timer for the multicast address is set to Older Version Host Present Timeout seconds. The timer is re-set whenever a new MLDv1 Report is received for that multicast address. If the Older Version Host Present timer expires, the router switches back to Multicast Address Compatibility Mode of MLDv2 for that multicast address. [...] That means, what can happen is the following scenario, that hosts can act in MLDv1 compatibility mode when they previously have received an MLDv1 query (or, simply operate in MLDv1 mode-only); and at the same time, an MLDv2 router could start up and transmits MLDv2 startup query messages while being unaware of the current operational mode. Given RFC2710, section 3.7 we would need to answer to that with an MLDv1 listener report, so that the router according to RFC3810, section 8.3.2. would receive that and internally switch to MLDv1 compatibility as well. Right now, I believe since the initial implementation of MLDv2, Linux hosts would just silently drop such MLDv2 queries instead of replying with an MLDv1 listener report, which would prevent a MLDv2 router going into fallback mode (until it receives other MLDv1 queries). Since the mapping of MRC to MRD in exactly such cases can make use of the exponential algorithm from 5.1.3, we cannot [strictly speaking] be aware in MLDv1 of the encoding in MRC, it seems also not mentioned by the RFC. Since encodings are the same up to 32767, assume in such a situation this value as a hard upper limit we would clamp. We have asked one of the RFC authors on that regard, and he mentioned that there seem not to be any implementations that make use of that exponential algorithm on startup messages. In any case, this patch fixes this MLD interoperability issue. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 16:23:15 -04:00
Florian Fainelli	19e57c4e6d	net: dsa: add {get, set}_wol callbacks to slave devices Allow switch drivers to implement per-port Wake-on-LAN getter and setters. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 14:41:23 -04:00
Florian Fainelli	2446254915	net: dsa: allow switch drivers to implement suspend/resume hooks Add an abstraction layer to suspend/resume switch devices, doing the following split: - suspend/resume the slave network devices and their corresponding PHY devices - suspend/resume the switch hardware using switch driver callbacks Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 14:41:23 -04:00
Eric Dumazet	2571178626	net: sched: shrink struct qdisc_skb_cb to 28 bytes We cannot make struct qdisc_skb_cb bigger without impacting IPoIB, or increasing skb->cb[] size. Commit `e0f31d8498` ("flow_keys: Record IP layer protocol in skb_flow_dissect()") broke IPoIB. Only current offender is sch_choke, and this one do not need an absolutely precise flow key. If we store 17 bytes of flow key, its more than enough. (Its the actual size of flow_keys if it was a packed structure, but we might add new fields at the end of it later) Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `e0f31d8498` ("flow_keys: Record IP layer protocol in skb_flow_dissect()") Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-22 14:21:47 -04:00
Nicolas Dichtel	0d566379c5	genetlink: add function genl_has_listeners() This function is the counterpart of the function netlink_has_listeners(). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:28:26 -04:00
Sabrina Dubroca	54003f119c	net: fix sparse warnings in SNMP_UPD_PO_STATS(_BH) ptr used to be a non __percpu pointer (result of a this_cpu_ptr assignment, `7d720c3e4f` ("percpu: add __percpu sparse annotations to net")). Since `d25398df59` ("net: avoid reloads in SNMP_UPD_PO_STATS"), that's no longer the case, SNMP_UPD_PO_STATS uses this_cpu_add and ptr is now __percpu. Silence sparse warnings by preserving the original type and annotation, and remove the out-of-date comment. warning: incorrect type in initializer (different address spaces) expected unsigned long long ptr got unsigned long long [noderef] <asn:3><noident> warning: incorrect type in initializer (different address spaces) expected void const [noderef] <asn:3>__vpp_verify got unsigned long long <noident> warning: incorrect type in initializer (different address spaces) expected void const [noderef] <asn:3>__vpp_verify got unsigned long long <noident> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:22:31 -04:00
Tom Herbert	5632848653	net: Changes to ip_tunnel to support foo-over-udp encapsulation This patch changes IP tunnel to support (secondary) encapsulation, Foo-over-UDP. Changes include: 1) Adding tun_hlen as the tunnel header length, encap_hlen as the encapsulation header length, and hlen becomes the grand total of these. 2) Added common netlink define to support FOU encapsulation. 3) Routines to perform FOU encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 17:15:32 -04:00
Florian Fainelli	6819563e64	net: dsa: allow switch drivers to specify phy_device::dev_flags Some switch drivers (e.g: bcm_sf2) may have to communicate specific workarounds or flags towards the PHY device driver. Allow switches driver to be delegated that task by introducing a get_phy_flags() callback which will do just that. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 16:27:07 -04:00
Andy Zhou	6a93cc9052	udp-tunnel: Add a few more UDP tunnel APIs Added a few more UDP tunnel APIs that can be shared by UDP based tunnel protocol implementation. The main ones are highlighted below. setup_udp_tunnel_sock() configures UDP listener socket for receiving UDP encapsulated packets. udp_tunnel_xmit_skb() and upd_tunnel6_xmit_skb() transmit skb using UDP encapsulation. udp_tunnel_sock_release() closes the UDP tunnel listener socket. Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 15:57:15 -04:00
Andy Zhou	fd384412e1	udp_tunnel: Seperate ipv6 functions into its own file. Add ip6_udp_tunnel.c for ipv6 UDP tunnel functions to avoid ifdefs in udp_tunnel.c Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-19 15:57:15 -04:00
Herbert Xu	689f1c9de2	ipsec: Remove obsolete MAX_AH_AUTH_LEN While tracking down the MAX_AH_AUTH_LEN crash in an old kernel I thought that this limit was rather arbitrary and we should just get rid of it. In fact it seems that we've already done all the work needed to remove it apart from actually removing it. This limit was there in order to limit stack usage. Since we've already switched over to allocating scratch space using kmalloc, there is no longer any need to limit the authentication length. This patch kills all references to it, including the BUG_ONs that led me here. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-09-18 10:54:36 +02:00
Marcel Holtmann	0097db06f5	Bluetooth: Remove exported hci_recv_fragment function The hci_recv_fragment function is no longer used by any driver and thus do not export it. In fact it is not even needed by the core and it can be removed altogether. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-09-17 10:23:03 +03:00
Steffen Klassert	b8c203b2d2	xfrm: Generate queueing routes only from route lookup functions Currently we genarate a queueing route if we have matching policies but can not resolve the states and the sysctl xfrm_larval_drop is disabled. Here we assume that dst_output() is called to kill the queued packets. Unfortunately this assumption is not true in all cases, so it is possible that these packets leave the system unwanted. We fix this by generating queueing routes only from the route lookup functions, here we can guarantee a call to dst_output() afterwards. Fixes: `a0073fe18e` ("xfrm: Add a state resolution packet queue") Reported-by: Konstantinos Kolelis <k.kolelis@sirrix.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-09-16 10:08:49 +02:00
Steffen Klassert	f92ee61982	xfrm: Generate blackhole routes only from route lookup functions Currently we genarate a blackhole route route whenever we have matching policies but can not resolve the states. Here we assume that dst_output() is called to kill the balckholed packets. Unfortunately this assumption is not true in all cases, so it is possible that these packets leave the system unwanted. We fix this by generating blackhole routes only from the route lookup functions, here we can guarantee a call to dst_output() afterwards. Fixes: `2774c131b1` ("xfrm: Handle blackhole route creation via afinfo.") Reported-by: Konstantinos Kolelis <k.kolelis@sirrix.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2014-09-16 10:08:40 +02:00
Alex Gartrell	391f503d69	ipvs: prevent mixing heterogeneous pools and synchronization The synchronization protocol is not compatible with heterogeneous pools, so we need to verify that we're not turning both on at the same time. Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-16 09:03:35 +09:00
Alex Gartrell	ba38528aae	ipvs: Supply destination address family to ip_vs_conn_new The assumption that dest af is equal to service af is now unreliable, so we must specify it manually so as not to copy just the first 4 bytes of a v6 address or doing an illegal read of 16 butes on a v6 address. We "lie" in two places: for synchronization (which we will explicitly disallow from happening when we have heterogeneous pools) and for black hole addresses where there's no real dest. Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-16 09:03:34 +09:00
Alex Gartrell	655eef103d	ipvs: Supply destination addr family to ip_vs_{lookup_dest,find_dest} We need to remove the assumption that virtual address family is the same as real address family in order to support heterogeneous services (that is, services with v4 vips and v6 backends or the opposite). Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-16 09:03:33 +09:00
Alex Gartrell	6cff339bbd	ipvs: Add destination address family to netlink interface This is necessary to support heterogeneous pools. For example, if you have an ipv6 addressed network, you'll want to be able to forward ipv4 traffic into it. This patch enforces that destination address family is the same as service family, as none of the forwarding mechanisms support anything else. For the old setsockopt mechanism, we simply set the dest address family to AF_INET as we do with the service. Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-09-16 09:03:33 +09:00
Alexander Duyck	b4d2394d01	dsa: Replace mii_bus with a generic host device This change makes it so that instead of passing and storing a mii_bus we instead pass and store a host_dev. From there we can test to determine the exact type of device, and can verify it is the correct device for our switch. So for example it would be possible to pass a device pointer from a pci_dev and instead of checking for a PHY ID we could check for a vendor and/or device ID. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-15 17:24:20 -04:00
Alexander Duyck	5075314e4e	dsa: Split ops up, and avoid assigning tag_protocol and receive separately This change addresses several issues. First, it was possible to set tag_protocol without setting the ops pointer. To correct that I have reordered things so that rcv is now populated before we set tag_protocol. Second, it didn't make much sense to keep setting the device ops each time a new slave was registered. So by moving the receive portion out into root switch initialization that issue should be addressed. Third, I wanted to avoid sending tags if the rcv pointer was not registered so I changed the tag check to verify if the rcv function pointer is set on the root tree. If it is then we start sending DSA tagged frames. Finally I split the device ops pointer in the structures into two spots. I placed the rcv function pointer in the root switch since this makes it easiest to access from there, and I placed the xmit function pointer in the slave for the same reason. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-15 17:24:20 -04:00
John W. Linville	6bd2bd27ba	This time, I have some rate minstrel improvements, support for a very small feature from CCX that Steinar reverse-engineered, dynamic ACK timeout support, a number of changes for TDLS, early support for radio resource measurement and many fixes. Also, I'm changing a number of places to clear key memory when it's freed and Intel claims copyright for code they developed. -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJUEpv0AAoJEDBSmw7B7bqr6CMP/2CXvWr/98AY2Flt74KDNyaE vmJBVCsu+eT0G9FL6YxbVU5+rvInGDHd9qTHkU4ljd+uXwnG8XAT+WHFlhBjzm+V juXPWblbSdMzwpWDfq7Kbk134b9ALTEUqekhqSFvhPA5h0Dq0/8lDK9CFyfwKWbN 07PwUv0VUUEHKVqQoVSNJu9Szi5NvZvDcN7Jwg1Cpnv0sUOeH7J2Kz1OUT4RaEhI c/UJjCQV4ssXaEkTDIxciQ62HrglZanMqyx4a9LGbrxLdw1KJ19CNmSkwB5mQuZg LhV05Y0Gv4tkRC8sCo7HF7cqgjBfjTNiEjZYfbExW0QFOMKIgKmmjYIEezVdbrk7 gFIyhTRE595UtztUJV0dcitoOlybbRf3OdEwAIJD6fc0vhoe/rSjUIyS7/CZisMT 9zg33JvtK3eYPSJS1jy4lk2yZ5alhLoPMQTNmsEuyOGcU3sH9vTGMjONPffOlcH9 nzj7aUS2Qvwn3H+4CIaZbZhySpa0B9zkGL3oxeaEBmLJbFMTo5ua2FNGhubC2O+O BwNULDBEMwsGHKMUCWCLmQwACWdVdNxYYWtXbWfxdmC/CJoXgdLCJIUfoa1aOf2A DyCqUFvG/n8ObHVy+P3RU6poQFj0M/yclJAMHRW6x2qzNvAkDb0G6TVeIlgN5dG8 jLoZPL5OH0wb0BPVNEH8 =OPIp -----END PGP SIGNATURE----- Merge tag 'mac80211-next-for-john-2014-09-12' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg <johannes@sipsolutions.net> says: "This time, I have some rate minstrel improvements, support for a very small feature from CCX that Steinar reverse-engineered, dynamic ACK timeout support, a number of changes for TDLS, early support for radio resource measurement and many fixes. Also, I'm changing a number of places to clear key memory when it's freed and Intel claims copyright for code they developed." Conflicts: net/mac80211/iface.c Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-09-15 14:51:23 -04:00
Marcel Holtmann	43e73e4e2a	Bluetooth: Provide HCI command opcode information to driver The Bluetooth core already does processing of the HCI command header and puts it together before sending it to the driver. It is not really efficient for the driver to look at the HCI command header again in case it has to make certain decisions about certain commands. To make this easier, just provide the opcode as part of the SKB control buffer information. The extra information about the opcode is optional and only provided for HCI commands. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>	2014-09-15 07:15:45 +03:00
Florian Fainelli	ac7a04c33d	net: dsa: change tag_protocol to an enum Now that we introduced an additional multiplexing/demultiplexing layer with commit `3e8a72d1da` ("net: dsa: reduce number of protocol hooks") that lives within the DSA code, we no longer need to have a given switch driver tag_protocol be an actual ethertype value, instead, we can replace it with an enum: dsa_tag_protocol. Do this replacement in the drivers, which allows us to get rid of the cpu_to_be16()/htons() dance, and remove ETH_P_BRCMTAG since we do not need it anymore. Suggested-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 17:04:35 -04:00
WANG Cong	013b4d9038	ipv6: clean up ipv6_dev_ac_inc() Make it accept inet6_dev, and rename it to __ipv6_dev_ac_inc() to reflect this change. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 16:38:42 -04:00
John Fastabend	25d8c0d55f	net: rcu-ify tcf_proto rcu'ify tcf_proto this allows calling tc_classify() without holding any locks. Updaters are protected by RTNL. This patch prepares the core net_sched infrastracture for running the classifier/action chains without holding the qdisc lock however it does nothing to ensure cls_xxx and act_xxx types also work without locking. Additional patches are required to address the fall out. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 12:30:25 -04:00
John Fastabend	46e5da40ae	net: qdisc: use rcu prefix and silence sparse warnings Add __rcu notation to qdisc handling by doing this we can make smatch output more legible. And anyways some of the cases should be using rcu_dereference() see qdisc_all_tx_empty(), qdisc_tx_chainging(), and so on. Also *wake_queue() API is commonly called from driver timer routines without rcu lock or rtnl lock. So I added rcu_read_lock() blocks around netif_wake_subqueue and netif_tx_wake_queue. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-13 12:30:25 -04:00
Sabrina Dubroca	381f4dca48	ipv6: clean up anycast when an interface is destroyed If we try to rmmod the driver for an interface while sockets with setsockopt(JOIN_ANYCAST) are alive, some refcounts aren't cleaned up and we get stuck on: unregister_netdevice: waiting for ens3 to become free. Usage count = 1 If we LEAVE_ANYCAST/close everything before rmmod'ing, there is no problem. We need to perform a cleanup similar to the one for multicast in addrconf_ifdown(how == 1). Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-12 17:33:06 -04:00
Pablo Neira Ayuso	67981fefb2	netfilter: fix compilation of masquerading without IP_NF_TARGET_MASQUERADE CONFIG_NF_NAT_MASQUERADE_IPV6=m # CONFIG_IP6_NF_TARGET_MASQUERADE is not set results in: net/ipv6/netfilter/nf_nat_masquerade_ipv6.c: In function ‘nf_nat_masquerade_ipv6’: net/ipv6/netfilter/nf_nat_masquerade_ipv6.c:41:14: error: ‘struct nf_conn_nat’ has no member named ‘masq_index’ nfct_nat(ct)->masq_index = out->ifindex; ^ net/ipv6/netfilter/nf_nat_masquerade_ipv6.c: In function ‘device_cmp’: net/ipv6/netfilter/nf_nat_masquerade_ipv6.c:61:12: error: ‘const struct nf_conn_nat’ has no member named ‘masq_index’ return nat->masq_index == (int)(long)ifindex; ^ net/ipv6/netfilter/nf_nat_masquerade_ipv6.c:62:1: warning: control reaches end of non-void function [-Wreturn-type] } ^ make[3]: *** [net/ipv6/netfilter/nf_nat_masquerade_ipv6.o] Error 1 Fix this by using the new NF_NAT_MASQUERADE_IPV4 and _IPV6 symbols in include/net/netfilter/nf_nat.h. Reported-by: Jim Davis <jim.epost@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-11 17:02:45 +02:00
Eliad Peller	0d8614b4b9	mac80211: replace SMPS hw flags with wiphy feature bits Use the new static_smps / dynamic_smps feature bits instead of mac80211-internal hw flags. Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 13:37:02 +02:00
Eliad Peller	18998c381b	cfg80211: allow requesting SMPS mode on ap start Add feature bits to indicate device support for static-smps and dynamic-smps modes. Add a new NL80211_ATTR_SMPS_MODE attribue to allow configuring the smps mode to be used by the ap (e.g. configuring to ap to dynamic smps mode will reduce power consumption while having minor effect on throughput) Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 13:37:02 +02:00
Eliad Peller	b0b6aa2c8e	cfg80211/mac80211: add wmm info to assoc event Userspace might need to know what queues are configured for uapsd (e.g. for setting proper default values in tspecs). Add this bitmap to the association event (inside wmm nested attribute) Add additional parameter to cfg80211_rx_assoc_resp, and update its callers. Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 12:24:39 +02:00
Johannes Berg	960d01acf6	cfg80211: add WMM traffic stream API Add nl80211 and driver API to validate, add and delete traffic streams with appropriate settings. The API calls for userspace doing the action frame handshake with the peer, and then allows only to set up the parameters in the driver. To avoid setting up a session only to tear it down again, the validate API is provided, but the real usage later can still fail so userspace must be prepared for that. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 12:21:18 +02:00
Johannes Berg	78f686cae0	cfg80211: don't put kek/kck/replay counter on the stack There's no need to put the values on the stack, just pass a pointer to the data in the nl80211 message. This reduces stack usage and avoids potential issues with putting sensitive data on the stack. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-11 12:07:34 +02:00
David S. Miller	0aac383353	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== nf-next pull request The following patchset contains Netfilter/IPVS updates for your net-next tree. Regarding nf_tables, most updates focus on consolidating the NAT infrastructure and adding support for masquerading. More specifically, they are: 1) use __u8 instead of u_int8_t in arptables header, from Mike Frysinger. 2) Add support to match by skb->pkttype to the meta expression, from Ana Rey. 3) Add support to match by cpu to the meta expression, also from Ana Rey. 4) A smatch warning about IPSET_ATTR_MARKMASK validation, patch from Vytas Dauksa. 5) Fix netnet and netportnet hash types the range support for IPv4, from Sergey Popovich. 6) Fix missing-field-initializer warnings resolved, from Mark Rustad. 7) Dan Carperter reported possible integer overflows in ipset, from Jozsef Kadlecsick. 8) Filter out accounting objects in nfacct by type, so you can selectively reset quotas, from Alexey Perevalov. 9) Move specific NAT IPv4 functions to the core so x_tables and nf_tables can share the same NAT IPv4 engine. 10) Use the new NAT IPv4 functions from nft_chain_nat_ipv4. 11) Move specific NAT IPv6 functions to the core so x_tables and nf_tables can share the same NAT IPv4 engine. 12) Use the new NAT IPv6 functions from nft_chain_nat_ipv6. 13) Refactor code to add nft_delrule(), which can be reused in the enhancement of the NFT_MSG_DELTABLE to remove a table and its content, from Arturo Borrero. 14) Add a helper function to unregister chain hooks, from Arturo Borrero. 15) A cleanup to rename to nft_delrule_by_chain for consistency with the new nft_*() functions, also from Arturo. 16) Add support to match devgroup to the meta expression, from Ana Rey. 17) Reduce stack usage for IPVS socket option, from Julian Anastasov. 18) Remove unnecessary textsearch state initialization in xt_string, from Bojan Prtvar. 19) Add several helper functions to nf_tables, more work to prepare the enhancement of NFT_MSG_DELTABLE, again from Arturo Borrero. 20) Enhance NFT_MSG_DELTABLE to delete a table and its content, from Arturo Borrero. 21) Support NAT flags in the nat expression to indicate the flavour, eg. random fully, from Arturo. 22) Add missing audit code to ebtables when replacing tables, from Nicolas Dichtel. 23) Generalize the IPv4 masquerading code to allow its re-use from nf_tables, from Arturo. 24) Generalize the IPv6 masquerading code, also from Arturo. 25) Add the new masq expression to support IPv4/IPv6 masquerading from nf_tables, also from Arturo. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-10 12:46:32 -07:00
Willem de Bruijn	67cc0d4077	net-timestamp: optimize sock_tx_timestamp default path Few packets have timestamping enabled. Exit sock_tx_timestamp quickly in this common case. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 17:34:41 -07:00
Vincent Bernat	49a601589c	net/ipv4: bind ip_nonlocal_bind to current netns net.ipv4.ip_nonlocal_bind sysctl was global to all network namespaces. This patch allows to set a different value for each network namespace. Signed-off-by: Vincent Bernat <vincent@bernat.im> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-09 11:27:09 -07:00
Arturo Borrero	9ba1f726be	netfilter: nf_tables: add new nft_masq expression The nft_masq expression is intended to perform NAT in the masquerade flavour. We decided to have the masquerade functionality in a separated expression other than nft_nat. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:31:30 +02:00
Arturo Borrero	be6b635cd6	netfilter: nf_nat: generalize IPv6 masquerading support for nf_tables Let's refactor the code so we can reach the masquerade functionality from outside the xt context (ie. nftables). The patch includes the addition of an atomic counter to the masquerade notifier: the stuff to be done by the notifier is the same for xt and nftables. Therefore, only one notification handler is needed. This factorization only involves IPv6; a similar patch exists to handle IPv4. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:31:29 +02:00
Arturo Borrero	8dd33cc93e	netfilter: nf_nat: generalize IPv4 masquerading support for nf_tables Let's refactor the code so we can reach the masquerade functionality from outside the xt context (ie. nftables). The patch includes the addition of an atomic counter to the masquerade notifier: the stuff to be done by the notifier is the same for xt and nftables. Therefore, only one notification handler is needed. This factorization only involves IPv4; a similar patch follows to handle IPv6. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:31:29 +02:00
Pablo Neira Ayuso	2a5538e9aa	netfilter: nat: move specific NAT IPv6 to core Move the specific NAT IPv6 core functions that are called from the hooks from ip6table_nat.c to nf_nat_l3proto_ipv6.c. This prepares the ground to allow iptables and nft to use the same NAT engine code that comes in a follow up patch. This also renames nf_nat_ipv6_fn to nft_nat_ipv6_fn in net/ipv6/netfilter/nft_chain_nat_ipv6.c to avoid a compilation breakage. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-09-09 16:30:00 +02:00
Johan Hedberg	e1e930f591	Bluetooth: Fix mgmt pairing failure when authentication fails Whether through HCI with BR/EDR or SMP with LE when authentication fails we should also notify any pending Pair Device mgmt command. This patch updates the mgmt_auth_failed function to take the actual hci_conn object and makes sure that any pending pairing command is notified and cleaned up appropriately. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-09 03:12:15 +02:00
David S. Miller	5b4c314575	Merge tag 'master-2014-09-08' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== pull request: wireless-next 2014-09-08 Please pull this batch of updates intended for the 3.18 stream... For the mac80211 bits, Johannes says: "Not that much content this time. Some RCU cleanups, crypto performance improvements, and various patches all over, rather than listing them one might as well look into the git log instead." For the Bluetooth bits, Gustavo says: "The changes consists of: - Coding style fixes to HCI drivers - Corrupted ack value fix for the H5 HCI driver - A couple of Enhanced L2CAP fixes - Conversion of SMP code to use common L2CAP channel API - Page scan optimizations when using the kernel-side whitelist - Various mac802154 and and ieee802154 6lowpan cleanups - One new Atheros USB ID" For the iwlwifi bits, Emmanuel says: "We have a new big thing coming up which is called Dynamic Queue Allocation (or DQA). This is a completely new way to work with the Tx queues and it requires major refactoring. This is being done by Johannes and Avri. Besides this, Johannes disables U-APSD by default because of APs that would disable A-MPDU if the association supports U-ASPD. Luca contributed to the power area which he was cleaning up on the way while working on CSA. A few more random things here and there." For the Atheros bits, Kalle says: "For ath6kl we had two small fixes and a new SDIO device id. For ath10k the bigger changes are: * support for new firmware version 10.2 (Michal) * spectral scan support (Simon, Sven & Mathias) * export a firmware crash dump file (Ben & me) * cleaning up of pci.c (Michal) * print pci id in all messages, which causes most of the churn (Michal)" Beyond that, we have the usual collection of various updates to ath9k, b43, mwifiex, and wil6210, as well as a few other bits here and there. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-08 16:43:58 -07:00
Willem de Bruijn	a7f26b7e1e	inet: remove dead inetpeer sequence code inetpeer sequence numbers are no longer incremented, so no need to check and flush the tree. The function that increments the sequence number was already dead code and removed in in "ipv4: remove unused function" (`068a6e18`). Remove the code that checks for a change, too. Verifying that v4_seq and v6_seq are never incremented and thus that flush_check compares bp->flush_seq to 0 is trivial. The second part of the change removes flush_check completely even though bp->flush_seq is exactly !0 once, at initialization. This change is correct because the time this branch is true is when bp->root == peer_avl_empty_rcu, in which the branch and inetpeer_invalidate_tree are a NOOP. Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-08 16:42:42 -07:00
Johan Hedberg	fc75cc8684	Bluetooth: Fix locking of the SMP context Before the move the l2cap_chan the SMP context (smp_chan) didn't have any kind of proper locking. The best there existed was the HCI_CONN_LE_SMP_PEND flag which was used to enable mutual exclusion for potential multiple creators of the SMP context. Now that SMP has been converted to use the l2cap_chan infrastructure and since the SMP context is directly mapped to a corresponding l2cap_chan we get the SMP context locking essentially for free through the l2cap_chan lock. For all callbacks that l2cap_core.c makes for each channel implementation (smp.c in the case of SMP) the l2cap_chan lock is held through l2cap_chan_lock(chan). Since the calls from l2cap_core.c to smp.c are covered the only missing piece to have the locking implemented properly is to ensure that the lock is held for any other call path that may access the SMP context. This means user responses through mgmt.c, requests to elevate the security of a connection through hci_conn.c, as well as any deferred work through workqueues. This patch adds the necessary locking to all these other code paths that try to access the SMP context. Since mutual exclusion for the l2cap_chan access is now covered from all directions the patch also removes unnecessary HCI_CONN_LE_SMP_PEND flag (once we've acquired the chan lock we can simply check whether chan->smp is set to know if there's an SMP context). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:56 +02:00
Johan Hedberg	f3d82d0c8e	Bluetooth: Move identity address update behind a workqueue The identity address update of all channels for an l2cap_conn needs to take the lock for each channel, i.e. it's safest to do this by a separate workqueue callback. Previously this was partially solved by moving the entire SMP key distribution behind a workqueue. However, if we want SMP context locking to be correct and safe we should always use the l2cap_chan lock when accessing it, meaning even smp_distribute_keys needs to take that lock which would once again create a dead lock when updating the identity address. The simplest way to solve this is to have l2cap_conn manage the deferred work which is what this patch does. A subsequent patch will remove the now unnecessary SMP key distribution work struct. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:55 +02:00
Johan Hedberg	e3b679d56c	Bluetooth: Update hci_disconnect() to return an error value We'll soon use hci_disconnect() from places that are interested to know whether the hci_send_cmd() really succeeded or not. This patch updates hci_disconnect() to pass on any error returned from hci_send_cmd(). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:55 +02:00
Johan Hedberg	b04afa0c28	Bluetooth: Remove unused l2cap_conn_shutdown API Now that there are no more users of the l2cap_conn_shutdown API (since smp.c switched to using hci_disconnect) we can simply remove it along with all of it's l2cap_conn variables. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:54 +02:00
Johan Hedberg	f94b665dcf	Bluetooth: Ignore incoming data after initiating disconnection When hci_chan_del is called the disconnection routines get scheduled through a workqueue. If there's any incoming ACL data before the routines get executed there's a chance that a new hci_chan is created and the disconnection never happens. This patch adds a new hci_conn flag to indicate that we're in the process of driving the connection down. We set the flag in hci_chan_del and check for it in hci_chan_create so that no new channels are created for the same connection. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:53 +02:00
Johan Hedberg	eb78d7e53d	Bluetooth: Use zero timeout for immediate scheduling There's no point in passing a "small" timeout to queue_delayed_work() to try to get the callback faster scheduled. Passing 0 is perfectly valid and will cause a shortcut to a direct queue_work(). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:53 +02:00
Johan Hedberg	51bb8457dd	Bluetooth: Improve _get() functions to return the object type It's natural to have _get() functions that increment the reference count of an object to return the object type itself. This way it's simple to make a copy of the object pointer and increase the reference count in a single step. This patch updates two such get() functions, namely hci_conn_get() and l2cap_conn_get(), and updates the users to take advantage of the new API. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:52 +02:00
Johan Hedberg	c16900cf28	Bluetooth: Fix hci_conn reference counting for fixed channels Now that SMP has been converted to use fixed channels we've got a bit of a problem with the hci_conn reference counting. So far the L2CAP code has kept a reference for each L2CAP channel that was notified of the connection. With SMP however this would mean that the connection is never dropped even though there are no other users of it. Furthermore, SMP already does its own hci_conn reference counting internally, starting from a security or pairing request and ending with the key distribution. This patch makes L2CAP fixed channels default to the L2CAP core not keeping a hci_conn reference for them. A new FLAG_HOLD_HCI_CONN flag is added so that L2CAP users can declare an exception to this rule and hold a reference even for their fixed channels. One such exception is the L2CAP socket layer which does want a reference for each socket (e.g. an ATT socket which uses a fixed channel). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-09-08 19:07:52 +02:00
John W. Linville	61a3d4f9d5	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless	2014-09-08 11:14:56 -04:00
David S. Miller	eb84d6b604	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2014-09-07 21:41:53 -07:00
Tejun Heo	908c7f1949	percpu_counter: add @gfp to percpu_counter_init() Percpu allocator now supports allocation mask. Add @gfp to percpu_counter_init() so that !GFP_KERNEL allocation masks can be used with percpu_counters too. We could have left percpu_counter_init() alone and added percpu_counter_init_gfp(); however, the number of users isn't that high and introducing _gfp variants to all percpu data structures would be quite ugly, so let's just do the conversion. This is the one with the most users. Other percpu data structures are a lot easier to convert. This patch doesn't make any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Jan Kara <jack@suse.cz> Acked-by: "David S. Miller" <davem@davemloft.net> Cc: x86@kernel.org Cc: Jens Axboe <axboe@kernel.dk> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org>	2014-09-08 09:51:29 +09:00
David S. Miller	45ce829dd0	Merge tag 'master-2014-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-09-05 Please pull this batch of fixes intended for the 3.17 stream... For the mac80211 bits, Johannes says: "Here are a few fixes for mac80211. One has been discussed for a while and adds a terminating NUL-byte to the alpha2 sent to userspace, which shouldn't be necessary but since many places treat it as a string we couldn't move to just sending two bytes. In addition to that, we have two VLAN fixes from Felix, a mesh fix, a fix for the recently introduced RX aggregation offload, a revert for a broken patch (that luckily didn't really cause any harm) and a small fix for alignment in debugfs." For the iwlwifi bits, Emmanuel says: "I revert a patch that disabled CTS to self in dvm because users reported issues. The revert is CCed to stable since the offending patch was sent to stable too. I also bump the firmware API versions since a new firmware is coming up. On top of that, Marcel fixes a bug I introduced while fixing a bug in our Kconfig file." Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-07 16:11:10 -07:00
Eric Dumazet	7faee5c0d5	tcp: remove TCP_SKB_CB(skb)->when After commit `740b0f1841` ("tcp: switch rtt estimations to usec resolution"), we no longer need to maintain timestamps in two different fields. TCP_SKB_CB(skb)->when can be removed, as same information sits in skb_mstamp.stamp_jiffies Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:49:33 -07:00
Eric Dumazet	04317dafd1	tcp: introduce TCP_SKB_CB(skb)->tcp_tw_isn TCP_SKB_CB(skb)->when has different meaning in output and input paths. In output path, it contains a timestamp. In input path, it contains an ISN, chosen by tcp_timewait_state_process() Lets add a different name to ease code comprehension. Note that 'when' field will disappear in following patch, as skb_mstamp already contains timestamp, the anonymous union will promptly disappear as well. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:49:33 -07:00
Alexander Duyck	56193d1bce	net: Add function for parsing the header length out of linear ethernet frames This patch updates some of the flow_dissector api so that it can be used to parse the length of ethernet buffers stored in fragments. Most of the changes needed were to __skb_get_poff as it needed to be updated to support sending a linear buffer instead of a skb. I have split __skb_get_poff into two functions, the first is skb_get_poff and it retains the functionality of the original __skb_get_poff. The other function is __skb_get_poff which now works much like __skb_flow_dissect in relation to skb_flow_dissect in that it provides the same functionality but works with just a data buffer and hlen instead of needing an skb. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:47:02 -07:00
Alexander Duyck	82eabd9eb2	net: merge cases where sock_efree and sock_edemux are the same function Since sock_efree and sock_demux are essentially the same code for non-TCP sockets and the case where CONFIG_INET is not defined we can combine the code or replace the call to sock_edemux in several spots. As a result we can avoid a bit of unnecessary code or code duplication. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:43:45 -07:00
Alexander Duyck	62bccb8cdb	net-timestamp: Make the clone operation stand-alone from phy timestamping The phy timestamping takes a different path than the regular timestamping does in that it will create a clone first so that the packets needing to be timestamped can be placed in a queue, or the context block could be used. In order to support these use cases I am pulling the core of the code out so it can be used in other drivers beyond just phy devices. In addition I have added a destructor named sock_efree which is meant to provide a simple way for dropping the reference to skb exceptions that aren't part of either the receive or send windows for the socket, and I have removed some duplication in spots where this destructor could be used in place of sock_edemux. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:43:45 -07:00
Eric Dumazet	d546c62154	ipv4: harden fnhe_hashfun() Lets make this hash function a bit secure, as ICMP attacks are still in the wild. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:40:33 -07:00
Masanari Iida	e793c0f70e	net: treewide: Fix typo found in DocBook/networking.xml This patch fix spelling typo found in DocBook/networking.xml. It is because the neworking.xml is generated from comments in the source, I have to fix typo in comments within the source. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:35:28 -07:00
Eric Dumazet	caa415270c	ipv4: fix a race in update_or_create_fnhe() nh_exceptions is effectively used under rcu, but lacks proper barriers. Between kzalloc() and setting of nh->nh_exceptions(), we need a proper memory barrier. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `4895c771c7` ("ipv4: Add FIB nexthop exceptions.") Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 17:15:50 -07:00
Willem de Bruijn	c199105d15	net-timestamp: only report sw timestamp if reporting bit is set The timestamping API has separate bits for generating and reporting timestamps. A software timestamp should only be reported for a packet when the packet has the relevant generation flag (SKBTX_..) set and the socket has reporting bit SOF_TIMESTAMPING_SOFTWARE set. The second check was accidentally removed. Reinstitute the original behavior. Tested: Without this patch, Documentation/networking/txtimestamp reports timestamps regardless of whether SOF_TIMESTAMPING_SOFTWARE is set. After the patch, it only reports them when the flag is set. Fixes: `f24b9be595` ("net-timestamp: extend SCM_TIMESTAMPING ancillary data struct") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-09-05 15:02:43 -07:00
Lorenzo Bianconi	a4bcaf5556	mac80211: extend set_coverage_class signature Extend mac80211 set_coverage_class API in order to enable ACK timeout estimation algorithm (dynack) passing coverage class equals to -1 to lower drivers. Synchronize set_coverage_class routine signature with mac80211 function pointer for p54, ath9k, ath9k_htc and ath5k drivers. Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi83@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-05 13:54:07 +02:00
Lorenzo Bianconi	3057dbfdab	cfg80211: enable dynack through nl80211 Enable ACK timeout estimation algorithm (dynack) using mac80211 set_coverage_class API. Dynack is activated passing coverage class equals to -1 to lower drivers and it is automatically disabled setting valid value for coverage class. Define NL80211_ATTR_WIPHY_DYN_ACK flag attribute to enable dynack from userspace. In order to activate dynack NL80211_FEATURE_ACKTO_ESTIMATION feature flag must be set by lower drivers to indicate dynack capability. Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi83@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-09-05 13:54:03 +02:00

1 2 3 4 5 ...

7732 Commits