linux

Author	SHA1	Message	Date
Pablo Neira Ayuso	85d30e2416	netfilter: nft_log: request explicit logger when loading rules This includes the special handling for NFPROTO_INET. There is no real inet logger since we don't see packets of this family. However, rules are loaded using this special family type. So let's just request both IPV4 and IPV6 loggers. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-27 13:21:30 +02:00
Pablo Neira Ayuso	960649d192	netfilter: bridge: add generic packet logger This adds the generic plain text packet logger for bridged packets. It routes the logging message to the real protocol packet logger. I decided not to refactor the ebt_log code for two reasons: 1) The ebt_log output is not consistent with the IPv4 and IPv6 Netfilter packet loggers. The output is different for no good reason and it adds redundant code to handle packet logging. 2) To avoid breaking backward compatibility for applications out there that are parsing the specific ebt_log output, the ebt_log output has been left as is. So only nftables will use the new consistent logging format for logged bridged packets. More decisions coming in this patch: 1) This also removes ebt_log as default logger for bridged packets. Thus, nf_log_packet() routes packet to this new packet logger instead. This doesn't break backward compatibility since nf_log_packet() is not used to log packets in plain text format from anywhere in the ebtables/netfilter bridge code. 2) The new bridge packet logger also performs a lazy request to register the real IPv4, ARP and IPv6 netfilter packet loggers. If the real protocol logger is not available (not compiled or the module is not available in the system), no packet logging happens. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-27 13:20:47 +02:00
Pablo Neira Ayuso	35b9395104	netfilter: add generic ARP packet logger This adds the generic plain text packet logger for ARP packets. It is based on the ebt_log code. Nevertheless, the output has been modified to make it consistent with the original xt_LOG output. This is an example output: IN=wlan0 OUT= ARP HTYPE=1 PTYPE=0x0800 OPCODE=2 MACSRC=00:ab:12:34:55:63 IPSRC=192.168.10.1 MACDST=80:09:12:70:4f:50 IPDST=192.168.10.150 This patch enables packet logging from ARP chains, eg. nft add rule arp filter input log prefix "input: " Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-27 13:20:38 +02:00
Pablo Neira Ayuso	fab4085f4e	netfilter: log: nf_log_packet() as real unified interface Before this patch, the nf_loginfo parameter specified the logging configuration in case the specified default logger was loaded. This patch updates the semantics of the nf_loginfo parameter in nf_log_packet() which now indicates the logger that you explicitly want to use. Thus, nf_log_packet() is exposed as a unified interface which internally routes the log message to the corresponding logger type by family. The module dependencies are expressed by the new nf_logger_find_get() and nf_logger_put() functions which bump the logger module refcount. Thus, you cannot remove logger modules that are used by rules anymore. Another important effect of this change is that the family specific module is only loaded when required. Therefore, xt_LOG and nft_log will just trigger the autoload of the nf_log_{ip,ip6} modules according to the family. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-27 13:20:13 +02:00
Pablo Neira Ayuso	83e96d443b	netfilter: log: split family specific code to nf_log_{ip,ip6,common}.c files The plain text logging is currently embedded into the xt_LOG target. In order to be able to use the plain text logging from nft_log, as a first step, this patch moves the family specific code to the following files and Kconfig symbols: 1) net/ipv4/netfilter/nf_log_ip.c: CONFIG_NF_LOG_IPV4 2) net/ipv6/netfilter/nf_log_ip6.c: CONFIG_NF_LOG_IPV6 3) net/netfilter/nf_log_common.c: CONFIG_NF_LOG_COMMON These new modules will be required by xt_LOG and nft_log. This patch is based on original patch from Arturo Borrero Gonzalez. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-27 13:19:59 +02:00
Hangbin Liu	e940f5d6ba	ipv6: Fix MLD Query message check Based on RFC3810 6.2, we also need to check the hop limit and router alert option besides source address. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-27 00:21:50 -07:00
James M Leddy	3e215c8d1b	udp: Add MIB counters for rcvbuferrors Add MIB counters for rcvbuferrors in UDP to help diagnose problems. Signed-off-by: James M Leddy <james.leddy@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-27 00:20:55 -07:00
David S. Miller	9b8d90b963	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2014-06-25 22:40:43 -07:00
Linus Torvalds	f40ede392d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix crash in ipvs tot_stats estimator, from Julian Anastasov. 2) Fix OOPS in nf_nat on netns removal, from Florian Westphal. 3) Really really really fix locking issues in slip and slcan tty write wakeups, from Tyler Hall. 4) Fix checksum offloading in fec driver, from Fugang Duan. 5) Off by one in BPF instruction limit test, from Kees Cook. 6) Need to clear all TSO capability flags when doing software TSO in tg3 driver, from Prashant Sreedharan. 7) Fix memory leak in vlan_reorder_header() error path, from Li RongQing. 8) Fix various bugs in xen-netfront and xen-netback multiqueue support, from David Vrabel and Wei Liu. 9) Fix deadlock in cxgb4 driver, from Li RongQing. 10) Prevent double free of no-cache DST entries, from Eric Dumazet. 11) Bad csum_start handling in skb_segment() leads to crashes when forwarding, from Tom Herbert. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits) net: fix setting csum_start in skb_segment() ipv4: fix dst race in sk_dst_get() net: filter: Use kcalloc/kmalloc_array to allocate arrays trivial: net: filter: Change kerneldoc parameter order trivial: net: filter: Fix typo in comment net: allwinner: emac: Add missing free_irq cxgb4: use dev_port to identify ports xen-netback: bookkeep number of active queues in our own module tg3: Change nvram command timeout value to 50ms cxgb4: Not need to hold the adap_rcu_lock lock when read adap_rcu_list be2net: fix qnq mode detection on VFs of: mdio: fixup of_phy_register_fixed_link parsing of new bindings at86rf230: fix irq setup net: phy: at803x: fix coccinelle warnings net/mlx4_core: Fix the error flow when probing with invalid VF configuration tulip: Poll link status more frequently for Comet chips net: huawei_cdc_ncm: increase command buffer size drivers: net: cpsw: fix dual EMAC stall when connected to same switch xen-netfront: recreate queues correctly when reconnecting xen-netfront: fix oops when disconnected from backend ...	2014-06-25 21:08:24 -07:00
Tom Herbert	de843723f9	net: fix setting csum_start in skb_segment() Dave Jones reported that a crash is occurring in csum_partial tcp_gso_segment inet_gso_segment ? update_dl_migration skb_mac_gso_segment __skb_gso_segment dev_hard_start_xmit sch_direct_xmit __dev_queue_xmit ? dev_hard_start_xmit dev_queue_xmit ip_finish_output ? ip_output ip_output ip_forward_finish ip_forward ip_rcv_finish ip_rcv __netif_receive_skb_core ? __netif_receive_skb_core ? trace_hardirqs_on __netif_receive_skb netif_receive_skb_internal napi_gro_complete ? napi_gro_complete dev_gro_receive ? dev_gro_receive napi_gro_receive It looks like a likely culprit is that SKB_GSO_CB()->csum_start is not set correctly when doing non-scatter gather. We are using offset as opposed to doffset. Reported-by: Dave Jones <davej@redhat.com> Tested-by: Dave Jones <davej@redhat.com> Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `7e2b10c1e5` ("net: Support for multiple checksums with gso") Acked-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-25 20:45:54 -07:00
Eric Dumazet	f886497212	ipv4: fix dst race in sk_dst_get() When IP route cache had been removed in linux-3.6, we broke assumption that dst entries were all freed after rcu grace period. DST_NOCACHE dst were supposed to be freed from dst_release(). But it appears we want to keep such dst around, either in UDP sockets or tunnels. In sk_dst_get() we need to make sure dst refcount is not 0 before incrementing it, or else we might end up freeing a dst twice. DST_NOCACHE set on a dst does not mean this dst can not be attached to a socket or a tunnel. Then, before actual freeing, we need to observe a rcu grace period to make sure all other cpus can catch the fact the dst is no longer usable. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dormando <dormando@rydia.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-25 17:41:44 -07:00
Tobias Klauser	99e72a0fed	net: filter: Use kcalloc/kmalloc_array to allocate arrays Use kcalloc/kmalloc_array to make it clear we're allocating arrays. No integer overflow can actually happen here, since len/flen is guaranteed to be less than BPF_MAXINSNS (4096). However, this change makes sure we're not going to get one if BPF_MAXINSNS were ever increased. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-25 16:40:02 -07:00
Tobias Klauser	677a9fd3e6	trivial: net: filter: Change kerneldoc parameter order Change the order of the parameters to sk_unattached_filter_create() in the kerneldoc to reflect the order they appear in the actual function. This fix is only cosmetic, in the generated doc they still appear in the correct order without the fix. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-25 16:38:54 -07:00
Tobias Klauser	285276e72c	trivial: net: filter: Fix typo in comment Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-25 16:38:54 -07:00
Eric Dumazet	f6d8cb2eed	inet: reduce TLB pressure for listeners It seems overkill to use vmalloc() for typical listeners with less than 2048 hash buckets. Try kmalloc() and fallback to vmalloc() to reduce TLB pressure. Use kvfree() helper as it is now available. Use ilog2() instead of a loop. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-25 16:37:24 -07:00
Fabian Frederick	1f74714f1e	net/dsa/dsa.c: remove unnecessary null test before kfree Fix checkpatch warning: WARNING: kfree(NULL) is safe this check is probably not required Cc: "David S. Miller" <davem@davemloft.net> Cc: Grant Likely <grant.likely@linaro.org> Cc: netdev@vger.kernel.org Cc: Joe Perches <joe@perches.com> Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-25 16:15:16 -07:00
John W. Linville	8b87efba61	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth	2014-06-25 14:22:35 -04:00
John W. Linville	12307e4208	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2014-06-25 14:17:50 -04:00
Pablo Neira Ayuso	27fd8d90c9	netfilter: nf_log: move log buffering to core logging This patch moves Eric Dumazet's log buffer implementation from the xt_log.h header file to the core net/netfilter/nf_log.c. This also includes the renaming of the structure and functions to avoid possible undesired namespace clashes. This change allows us to use it from the arp and bridge packet logging implementation in follow up patches.	2014-06-25 19:28:43 +02:00
Pablo Neira Ayuso	5962815a6a	netfilter: nf_log: use an array of loggers instead of list Now that legacy ulog targets are not available anymore in the tree, we can have up to two possible loggers: 1) The plain text logging via kernel logging ring. 2) The nfnetlink_log infrastructure which delivers log messages to userspace. This patch replaces the list of loggers by an array of two pointers per family for each possible logger and it also introduces a new field to the nf_logger structure which indicates the position in the logger array (based on the logger type). This prepares a follow up patch that consolidates the nf_log_packet() interface by allowing to specify the logger as parameter. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-25 19:28:43 +02:00
Pablo Neira Ayuso	7200135bc1	netfilter: kill ulog targets This has been marked as deprecated for quite some time and the NFLOG target replacement has been also available since 2006. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-25 19:28:43 +02:00
Florian Westphal	9500507c61	netfilter: conntrack: remove timer from ecache extension This brings the (per-conntrack) ecache extension back to 24 bytes in size (was 152 byte on x86_64 with lockdep on). When event delivery fails, re-delivery is attempted via work queue. Redelivery is attempted at least every 0.1 seconds, but can happen more frequently if userspace is not congested. The nf_ct_release_dying_list() function is removed. With this patch, ownership of the to-be-redelivered conntracks (on-dying-list-with-DYING-bit not yet set) is with the work queue, which will release the references once event is out. Joint work with Pablo Neira Ayuso. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-25 19:15:38 +02:00
Michal Kazior	97dc94f1d9	cfg80211: remove channel_switch combination check Driver is now responsible for verifying if the switch is possible. Since this is inherently tricky, the driver may decide to disconnect an interface later with cfg80211_stop_iface(). This doesn't mean driver can accept everything. It should do its best to verify requests and reject them as soon as possible. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-25 18:06:20 +02:00
Michal Kazior	4c3ebc56d7	mac80211: use chanctx reservation for STA CSA Channel switch finalization is now 2-step. First step is when driver calls chswitch_done(), the other is when reservation is actually finalized (which can be deferred for in-place reservation). It is now safe to call ieee80211_chswitch_done() more than once. Also remove the ieee80211_vif_change_channel() because it is no longer used. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-25 18:06:20 +02:00
Michal Kazior	03078de4f9	mac80211: use chanctx reservation for AP CSA Channel switch finalization is now 2-step. First step is when driver calls csa_finish(), the other is when reservation is actually finalized (which can be deferred for in-place reservation). It is now safe to call ieee80211_csa_finish() more than once. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-25 18:06:20 +02:00
Michal Kazior	71e6195ed2	mac80211: make check_combinations() aware of chanctx reservation The ieee80211_check_combinations() computes radar_detect accordingly depending on chanctx reservation status. This makes it possible to use the function for channel_switch validation. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-25 18:06:20 +02:00
Michal Kazior	5bcae31d9c	mac80211: implement multi-vif in-place reservations Multi-vif in-place reservations happen when it is impossible to allocate more channel contexts as indicated by interface combinations. Such reservations are not finalized until all assigned interfaces are ready. This still doesn't handle all possible cases (i.e. degradation of number of channels) properly. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-25 18:06:20 +02:00
Eric Dumazet	f6b50824f7	netfilter: x_tables: xt_free_table_info() cleanup kvfree() helper can make xt_free_table_info() much cleaner. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-25 14:52:16 +02:00
Fabian Frederick	397304b52d	netfilter: ctnetlink: remove null test before kfree Fix checkpatch warning: WARNING: kfree(NULL) is safe this check is probably not required Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-25 14:49:38 +02:00
David Spinadel	633e271326	mac80211: split sched scan IEs Split sched scan IEs to band specific and not band specific blocks. Common IEs blocks may be sent to the FW once per command, instead of per band. This allows optimization of size of the command, which may be required by some drivers (eg. iwlmvm with newer firmware version). As this changes the mac80211 API, update all drivers to use the new version correctly, even if they don't (yet) make use of the split data. Signed-off-by: David Spinadel <david.spinadel@intel.com> Reviewed-by: Alexander Bondar <alexander.bondar@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-25 09:10:43 +02:00
David Spinadel	c56ef67250	mac80211: support more than one band in scan request Some drivers (such as iwlmvm) can handle multiple bands in a single HW scan request. Add a HW flag to indicate that the driver support this. To hold the required data, create a separate structure for HW scan request that holds cfg scan request and data about different parts of the scan IEs. As this changes the mac80211 API, update all drivers using it to use the correct new function type/argument. Signed-off-by: David Spinadel <david.spinadel@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-25 09:10:42 +02:00
Andy Adamson	66b0686049	NFSv4: test SECINFO RPC_AUTH_GSS pseudoflavors for support Fix nfs4_negotiate_security to create an rpc_clnt used to test each SECINFO returned pseudoflavor. Check credential creation (and gss_context creation) which is important for RPC_AUTH_GSS pseudoflavors which can fail for multiple reasons including mis-configuration. Don't call nfs4_negotiate in nfs4_submount as it was just called by nfs4_proc_lookup_mountpoint (nfs4_proc_lookup_common) Signed-off-by: Andy Adamson <andros@netapp.com> [Trond: fix corrupt return value from nfs_find_best_sec()] Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-06-24 18:46:58 -04:00
Johannes Berg	02df00eb00	nl80211: move set_qos_map command into split state The non-split wiphy state shouldn't be increased in size so move the new set_qos_map command into the split if statement. Cc: stable@vger.kernel.org (3.14+) Fixes: `fa9ffc7456` ("cfg80211: Add support for QoS mapping") Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-24 16:13:10 +02:00
Rasmus Villemoes	79631c89ed	trivial: net/irda/irlmp.c: Fix closing brace followed by if Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-23 15:04:33 -07:00
Govindarajulu Varadarajan	e0f31d8498	flow_keys: Record IP layer protocol in skb_flow_dissect() skb_flow_dissect() dissects only transport header type in ip_proto. It does not give any information about IPv4 or IPv6. This patch adds a new member, n_proto, to struct flow_keys, which records the IP layer type, i.e. IPv4 or IPv6. This can be used in netdev->ndo_rx_flow_steer driver function to dissect flow. Adding new member to flow_keys increases the struct size by around 4 bytes. This causes BUILD_BUG_ON(sizeof(qcb->data) < sz); to fail in qdisc_cb_private_validate() So increase data size by 4 Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-23 14:32:19 -07:00
Arik Nemtsov	ee10f2c779	mac80211: protect TDLS discovery session After sending a TDLS discovery-request, we expect a reply to arrive on the AP's channel. We must stay on the channel (no PSM, scan, etc.), since a TDLS setup-response is a direct packet not buffered by the AP. Add a new mac80211 driver callback to allow discovery session protection. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:28:19 +02:00
Arik Nemtsov	7adc3e4664	mac80211: make sure TDLS peer STA exists during setup Make sure userspace added a TDLS peer station before invoking the transmission of the first setup frame. This ensures packets to the peer won't go through the AP path. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:28:18 +02:00
Arik Nemtsov	c887f0d3a0	mac80211: add API to request TDLS operation from userspace Write a mac80211 to the cfg80211 API for requesting a userspace TDLS operation. Define TDLS specific reason codes that can be used here. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:28:17 +02:00
Arik Nemtsov	db67d661e8	mac80211: implement proper Tx path flushing for TDLS As the spec mandates, flush data in the AP path before transmitting the first setup frame. Data packets transmitted during setup are already dropped in the Tx path. For the teardown flow, flush all packets in the direct path before transmitting the teardown frame. Un-authorize the peer sta after teardown is sent, forcing all subsequent Tx to the peer through the AP. Make sure to flush the queues when disabling the link to get the teardown packet out. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> [adjust to Luca's new queue API and stop only vif queues] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:27:58 +02:00
Arik Nemtsov	191dd46905	mac80211: split tdls_mgmt function There are setup/teardown specific actions to be done that accompany the sending of a TDLS management packet. Split the main function to simplify future additions. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:24:55 +02:00
Arik Nemtsov	2fb6b9b8e5	mac80211: use TDLS initiator in tdls_mgmt operations The TDLS initiator is set once during link setup. It determines the address ordering in the link identifier IE. Use the value from userspace in order to have a correct teardown packet. With the current code, a teardown from the responder side fails the TDLS MIC check because of a bad link identifier IE. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:24:55 +02:00
Arik Nemtsov	31fa97c5de	cfg80211: pass TDLS initiator in tdls_mgmt operations The TDLS initiator is set once during link setup. It determines the address ordering in the link identifier IE. Fix dependent drivers - mwifiex and mac80211. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:24:55 +02:00
Arik Nemtsov	17e6a59a36	mac80211: cleanup TDLS state during failed setup When setting up a TDLS session, register a delayed work to remove the peer if setup times out. Prevent concurrent setups to support this capacity. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:24:55 +02:00
Arik Nemtsov	68885a54cd	mac80211: set auth flags after other station info For TDLS, the AUTHORIZED flag arrives with all other important station info (supported rates, HT/VHT caps, ...). Make sure to set the station state in the low-level driver after transferring this information to the mac80211 STA entry. This aligns the STA information during sta_state callbacks with the non-TDLS case. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:24:55 +02:00
Arik Nemtsov	9deba04d0f	mac80211: clarify TDLS Tx handling Rename the flags used in the Tx path and add an explanation for the reasons to drop, send directly or through the AP. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:24:54 +02:00
Luciano Coelho	a46992b441	mac80211: stop only the queues assigned to the vif during channel switch Instead of stopping all the hardware queues during channel switch, which is especially bad when we have large CSA counts, stop only the queues that are assigned to the vif that is performing the channel switch. Additionally, check for (sdata->csa_block_tx) instead of calling ieee80211_csa_needs_block_tx(), which can now be removed. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:22:29 +02:00
Luciano Coelho	26da23b695	mac80211: add functions to stop and wake all queues assigned to a vif In some cases we may want to stop the queues of a single vif (for instance during a channel-switch). Add a function that stops all the queues that are assigned to a vif. If a queue is assigned to more than one vif, the corresponding netdev subqueue of the other vif(s) will also be stopped. If the HW doesn't set the IEEE80211_HW_QUEUE_CONTROL flag, then all queues are stopped. Also add a corresponding function to wake the queues of a vif back. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:22:27 +02:00
Luciano Coelho	cca07b00a5	mac80211: introduce refcount for queue_stop_reasons Sometimes different vifs may be stopping the queues for the same reason (e.g. when several interfaces are performing a channel switch). Instead of using a bitmask for the reasons, use an integer that holds a refcount instead. In order to keep it backwards compatible, introduce a boolean in some functions that tell us whether the queue stopping should be refcounted or not. For now, use not refcounted for all calls to keep it functionally the same as before. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:22:25 +02:00
Luciano Coelho	59f48fe22f	mac80211: don't stop all queues when flushing There is no need to stop all queues when we want to flush specific queues, so stop only the queues that will be flushed. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:22:24 +02:00
Thomas Gleixner	8d7b70fb7b	net: Mac80211: Remove silly timespec dance Converting time from one format to another seems to give coders a warm and fuzzy feeling. Use the proper interfaces. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John W. Linville <linville@tuxdriver.com> [fix compile error] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:22:21 +02:00
Thomas Gleixner	181715203b	mac80211: Use ktime_get_ts() do_posix_clock_monotonic_gettime() is a leftover from the initial posix timer implementation which maps to ktime_get_ts(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:22:18 +02:00
Michal Kazior	10d78f2782	mac80211: use csa counter offsets instead of csa_active vif->csa_active is protected by mutexes only. This means it is unreliable to depend on it on codeflow in non-sleepable beacon and CSA code. There was no guarantee to have vif->csa_active update be visible before beacons are updated on SMP systems. Using csa counter offsets which are embedded in beacon struct (and thus are protected with single RCU assignment) is much safer. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:22:16 +02:00
Michal Kazior	af296bdb8d	mac80211: move csa counters from sdata to beacon/presp Having csa counters part of beacon and probe_resp structures makes it easier to get rid of possible races between setting a beacon and updating counters on SMP systems by guaranteeing counters are always consistent against given beacon struct. While at it relax WARN_ON into WARN_ON_ONCE to prevent spamming logs and racing. Signed-off-by: Michal Kazior <michal.kazior@tieto.com> [remove pointless array check] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 14:22:06 +02:00
Eliad Peller	0ce12026d6	cfg80211: fix elapsed_jiffies calculation MAX_JIFFY_OFFSET has no meaning when calculating the elapsed jiffies, as jiffies run out until ULONG_MAX. This miscalculation results in erroneous values in case of a wrap-around. Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 11:29:25 +02:00
Johannes Berg	e33e2241e2	Revert "cfg80211: Use 5MHz bandwidth by default when checking usable channels" This reverts commit `8eca1fb692`. Felix notes that this broke regulatory, leaving channel 12 open for AP operation in the US regulatory domain where it isn't permitted. Link: http://mid.gmane.org/53A6C0FF.9090104@openwrt.org Reported-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 11:06:28 +02:00
Janusz Dziedzic	b49328361b	mac80211: allow tx via monitor iface when DFS Allow sending frames using a monitor interface when the chandef is a DFS one and we have passed CAC (beaconing allowed). This fixes a problem when an old kernel and new backports are used; in such a case hostapd also creates/uses a monitor interface. Before this patch all frames hostapd sent using the monitor iface were dropped when the AP was configured on a DFS channel. Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 11:05:35 +02:00
Johannes Berg	b7ffbd7ef6	cfg80211: make ethtool the driver's responsibility Currently, cfg80211 tries to implement ethtool, but that doesn't really scale well, with all the different operations. Make the lower-level driver responsible for it, which currently only has an effect on mac80211. It will similarly not scale well at that level though, since mac80211 also has many drivers. To cleanly implement this in mac80211, introduce a new file and move some code to appropriate places. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 11:05:33 +02:00
Johannes Berg	ba9030c20a	mac80211: remove weak WEP IV accounting Since WEP is practically dead, there seems very little point in keeping WEP weak IV accounting. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 11:05:31 +02:00
Antonio Ospite	b314c66990	trivial: net/mac80211/mesh.c: fix typo s/Substract/Subtract/ Signed-off-by: Antonio Ospite <ao2@ao2.it> Cc: Luis Carlos Cobo <luisca@cozybit.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: linux-wireless@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 11:05:29 +02:00
Bob Copeland	2b470c39e8	mac80211: remove ignore_plink_timer flag The mesh_plink code is doing some interesting things with the ignore_plink_timer flag. It seems the original intent was to handle this race: cpu 0 cpu 1 ----- ----- start timer handler for state X acquire sta_lock change state from X to Y mod_timer() / del_timer() release sta_lock acquire sta_lock execute state Y timer too soon However, using the mod_timer()/del_timer() return values to detect these cases is broken. As a result, timers get ignored unnecessarily, and stations can get stuck in the peering state machine. Instead, we can detect the case by looking at the timer expiration. In the case of del_timer, just ignore the timers in the following (LISTEN/ESTAB) states since they won't have timers anyway. Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 11:05:27 +02:00
Johannes Berg	5ac2e35030	mac80211: fix station/driver powersave race It is currently possible to have a race due to the station PS unblock work like this: * station goes to sleep with frames buffered in the driver * driver blocks wakeup * station wakes up again * driver flushes/returns frames, and unblocks, which schedules the unblock work * unblock work starts to run, and checks that the station is awake (i.e. that the WLAN_STA_PS_STA flag isn't set) * we process a received frame with PM=1, setting the flag again * ieee80211_sta_ps_deliver_wakeup() runs, delivering all frames to the driver, and then clearing the WLAN_STA_PS_DRIVER and WLAN_STA_PS_STA flags In this scenario, mac80211 will think that the station is awake, while it really is asleep, and any TX'ed frames should be filtered by the device (it will know that the station is sleeping) but then passed to mac80211 again, which will not buffer it either as it thinks the station is awake, and eventually the packets will be dropped. Fix this by moving the clearing of the flags to exactly where we learn about the situation. This creates a problem of reordering, so introduce another flag indicating that delivery is being done, this new flag also queues frames and is cleared only while the spinlock is held (which the queuing code also holds) so that any concurrent delivery/TX is handled correctly. Reported-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 11:05:25 +02:00
John W. Linville	20edb50e59	mac80211: remove PID rate control Minstrel has long since proven its worth. Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 11:05:23 +02:00
Max Stepanov	744462a91e	mac80211: WEP extra head/tail room in ieee80211_send_auth After skb allocation and call to ieee80211_wep_encrypt in ieee80211_send_auth the flow fails with a warning in ieee80211_wep_add_iv on verification of available head/tailroom needed for WEP_IV and WEP_ICV. Signed-off-by: Max Stepanov <Max.Stepanov@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-06-23 10:45:14 +02:00
Duan Jiong	2b74e2caec	net: em_canid: remove useless statements from em_canid_change tcf_ematch is allocated by kzalloc in function tcf_em_tree_validate(), so cm_old is always NULL. Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-21 15:40:22 -07:00
Li RongQing	a3f5ee71cd	bridge: use list_for_each_entry_continue_reverse Use list_for_each_entry_continue_reverse to roll back in fdb_add_hw when adding an address fails. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-21 15:33:22 -07:00
Li RongQing	916c1689a0	8021q: fix a potential memory leak skb_cow, called in vlan_reorder_header, does not free the skb when it fails, and vlan_reorder_header returns NULL to reset the original skb when it is called from vlan_untag, which leads to a memory leak. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-21 15:12:13 -07:00
Lukasz Rymanowski	1d56dc4f5f	Bluetooth: Fix for ACL disconnect when pairing fails When pairing fails hci_conn refcnt drops below zero. This cause that ACL link is not disconnected when disconnect timeout fires. Probably this is because l2cap_conn_del calls l2cap_chan_del for each channel, and inside l2cap_chan_del conn is dropped. After that loop hci_chan_del is called which also drops conn. Anyway, as it is desrcibed in hci_core.h, it is known that refcnt drops below 0 sometimes and it should be fine. If so, let disconnect link when hci_conn_timeout fires and refcnt is 0 or below. This patch does it. This affects PTS test SM_TC_JW_BV_05_C Logs from scenario: [69713.706227] [6515] pair_device: [69713.706230] [6515] hci_conn_add: hci0 dst 00:1b:dc:06:06:22 [69713.706233] [6515] hci_dev_hold: hci0 orig refcnt 8 [69713.706235] [6515] hci_conn_init_sysfs: conn ffff88021f65a000 [69713.706239] [6515] hci_req_add_ev: hci0 opcode 0x200d plen 25 [69713.706242] [6515] hci_prepare_cmd: skb len 28 [69713.706243] [6515] hci_req_run: length 1 [69713.706248] [6515] hci_conn_hold: hcon ffff88021f65a000 orig refcnt 0 [69713.706251] [6515] hci_dev_put: hci0 orig refcnt 9 [69713.706281] [8909] hci_cmd_work: hci0 cmd_cnt 1 cmd queued 1 [69713.706288] [8909] hci_send_frame: hci0 type 1 len 28 [69713.706290] [8909] hci_send_to_monitor: hdev ffff88021f0c7000 len 28 [69713.706316] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000 [69713.706382] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000 [69713.711664] [8909] hci_rx_work: hci0 [69713.711668] [8909] hci_send_to_monitor: hdev ffff88021f0c7000 len 6 [69713.711680] [8909] hci_rx_work: hci0 Event packet [69713.711683] [8909] hci_cs_le_create_conn: hci0 status 0x00 [69713.711685] [8909] hci_sent_cmd_data: hci0 opcode 0x200d [69713.711688] [8909] hci_req_cmd_complete: opcode 0x200d status 0x00 [69713.711690] [8909] hci_sent_cmd_data: hci0 opcode 0x200d [69713.711695] [5949] hci_sock_recvmsg: sock 
ffff8800941a9680, sk ffff88012bf4d000 [69713.711744] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000 [69713.818875] [8909] hci_rx_work: hci0 [69713.818889] [8909] hci_send_to_monitor: hdev ffff88021f0c7000 len 21 [69713.818913] [8909] hci_rx_work: hci0 Event packet [69713.818917] [8909] hci_le_conn_complete_evt: hci0 status 0x00 [69713.818922] [8909] hci_send_to_control: len 19 [69713.818927] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000 [69713.818938] [8909] hci_conn_add_sysfs: conn ffff88021f65a000 [69713.818975] [6450] bt_sock_poll: sock ffff88005e758500, sk ffff88010323b800 [69713.818981] [6515] hci_sock_recvmsg: sock ffff88005e75a080, sk ffff88010323ac00 ... [69713.819021] [8909] hci_dev_hold: hci0 orig refcnt 10 [69713.819025] [8909] l2cap_connect_cfm: hcon ffff88021f65a000 bdaddr 00:1b:dc:06:06:22 status 0 [69713.819028] [8909] hci_chan_create: hci0 hcon ffff88021f65a000 [69713.819031] [8909] l2cap_conn_add: hcon ffff88021f65a000 conn ffff880221005c00 hchan ffff88020d60b1c0 [69713.819034] [8909] l2cap_conn_ready: conn ffff880221005c00 [69713.819036] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000 [69713.819037] [8909] smp_conn_security: conn ffff880221005c00 hcon ffff88021f65a000 level 0x02 [69713.819039] [8909] smp_chan_create: [69713.819041] [8909] hci_conn_hold: hcon ffff88021f65a000 orig refcnt 1 [69713.819043] [8909] smp_send_cmd: code 0x01 [69713.819045] [8909] hci_send_acl: hci0 chan ffff88020d60b1c0 flags 0x0000 [69713.819046] [5949] hci_sock_recvmsg: sock ffff8800941a9900, sk ffff88012bf4e800 [69713.819049] [8909] hci_queue_acl: hci0 nonfrag skb ffff88005157c100 len 15 [69713.819055] [5949] hci_sock_recvmsg: sock ffff8800941a9900, sk ffff88012bf4e800 [69713.819057] [8909] l2cap_le_conn_ready: [69713.819064] [8909] l2cap_chan_create: chan ffff88005ede2c00 [69713.819066] [8909] l2cap_chan_hold: chan ffff88005ede2c00 orig refcnt 1 [69713.819069] [8909] l2cap_sock_init: sk ffff88005ede5800 
[69713.819072] [8909] bt_accept_enqueue: parent ffff880160356000, sk ffff88005ede5800 [69713.819074] [8909] __l2cap_chan_add: conn ffff880221005c00, psm 0x00, dcid 0x0004 [69713.819076] [8909] l2cap_chan_hold: chan ffff88005ede2c00 orig refcnt 2 [69713.819078] [8909] hci_conn_hold: hcon ffff88021f65a000 orig refcnt 2 [69713.819080] [8909] smp_conn_security: conn ffff880221005c00 hcon ffff88021f65a000 level 0x01 [69713.819082] [8909] l2cap_sock_ready_cb: sk ffff88005ede5800, parent ffff880160356000 [69713.819086] [8909] le_pairing_complete_cb: status 0 [69713.819091] [8909] hci_tx_work: hci0 acl 10 sco 8 le 0 [69713.819093] [8909] hci_sched_acl: hci0 [69713.819094] [8909] hci_sched_sco: hci0 [69713.819096] [8909] hci_sched_esco: hci0 [69713.819098] [8909] hci_sched_le: hci0 [69713.819099] [8909] hci_chan_sent: hci0 [69713.819101] [8909] hci_chan_sent: chan ffff88020d60b1c0 quote 10 [69713.819104] [8909] hci_sched_le: chan ffff88020d60b1c0 skb ffff88005157c100 len 15 priority 7 [69713.819106] [8909] hci_send_frame: hci0 type 2 len 15 [69713.819108] [8909] hci_send_to_monitor: hdev ffff88021f0c7000 len 15 [69713.819119] [8909] hci_chan_sent: hci0 [69713.819121] [8909] hci_prio_recalculate: hci0 [69713.819123] [8909] process_pending_rx: [69713.819226] [6450] hci_sock_recvmsg: sock ffff88005e758780, sk ffff88010323d400 ... [69713.822022] [6450] l2cap_sock_accept: sk ffff880160356000 timeo 0 [69713.822024] [6450] bt_accept_dequeue: parent ffff880160356000 [69713.822026] [6450] bt_accept_unlink: sk ffff88005ede5800 state 1 [69713.822028] [6450] l2cap_sock_accept: new socket ffff88005ede5800 [69713.822368] [6450] l2cap_sock_getname: sock ffff8800941ab700, sk ffff88005ede5800 [69713.822375] [6450] l2cap_sock_getsockopt: sk ffff88005ede5800 [69713.822383] [6450] l2cap_sock_getname: sock ffff8800941ab700, sk ffff88005ede5800 [69713.822414] [6450] bt_sock_poll: sock ffff8800941ab700, sk ffff88005ede5800 ... 
[69713.823255] [6450] l2cap_sock_getname: sock ffff8800941ab700, sk ffff88005ede5800 [69713.823259] [6450] l2cap_sock_getsockopt: sk ffff88005ede5800 [69713.824322] [6450] l2cap_sock_getname: sock ffff8800941ab700, sk ffff88005ede5800 [69713.824330] [6450] l2cap_sock_getsockopt: sk ffff88005ede5800 [69713.825029] [6450] bt_sock_poll: sock ffff88005e758500, sk ffff88010323b800 ... [69713.825187] [6450] l2cap_sock_sendmsg: sock ffff8800941ab700, sk ffff88005ede5800 [69713.825189] [6450] bt_sock_wait_ready: sk ffff88005ede5800 [69713.825192] [6450] l2cap_create_basic_pdu: chan ffff88005ede2c00 len 3 [69713.825196] [6450] l2cap_do_send: chan ffff88005ede2c00, skb ffff880160b0b500 len 7 priority 0 [69713.825199] [6450] hci_send_acl: hci0 chan ffff88020d60b1c0 flags 0x0000 [69713.825201] [6450] hci_queue_acl: hci0 nonfrag skb ffff880160b0b500 len 11 [69713.825210] [8909] hci_tx_work: hci0 acl 9 sco 8 le 0 [69713.825213] [8909] hci_sched_acl: hci0 [69713.825214] [8909] hci_sched_sco: hci0 [69713.825216] [8909] hci_sched_esco: hci0 [69713.825217] [8909] hci_sched_le: hci0 [69713.825219] [8909] hci_chan_sent: hci0 [69713.825221] [8909] hci_chan_sent: chan ffff88020d60b1c0 quote 9 [69713.825223] [8909] hci_sched_le: chan ffff88020d60b1c0 skb ffff880160b0b500 len 11 priority 0 [69713.825225] [8909] hci_send_frame: hci0 type 2 len 11 [69713.825227] [8909] hci_send_to_monitor: hdev ffff88021f0c7000 len 11 [69713.825242] [8909] hci_chan_sent: hci0 [69713.825253] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000 [69713.825253] [8909] hci_prio_recalculate: hci0 [69713.825292] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000 [69713.825768] [6450] bt_sock_poll: sock ffff88005e758500, sk ffff88010323b800 ... 
[69713.866902] [8909] hci_rx_work: hci0 [69713.866921] [8909] hci_send_to_monitor: hdev ffff88021f0c7000 len 7 [69713.866928] [8909] hci_rx_work: hci0 Event packet [69713.866931] [8909] hci_num_comp_pkts_evt: hci0 num_hndl 1 [69713.866937] [8909] hci_tx_work: hci0 acl 9 sco 8 le 0 [69713.866939] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000 [69713.866940] [8909] hci_sched_acl: hci0 ... [69713.866944] [8909] hci_sched_le: hci0 [69713.866953] [8909] hci_chan_sent: hci0 [69713.866997] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000 [69713.867840] [28074] hci_rx_work: hci0 [69713.867844] [28074] hci_send_to_monitor: hdev ffff88021f0c7000 len 7 [69713.867850] [28074] hci_rx_work: hci0 Event packet [69713.867853] [28074] hci_num_comp_pkts_evt: hci0 num_hndl 1 [69713.867857] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000 [69713.867858] [28074] hci_tx_work: hci0 acl 10 sco 8 le 0 [69713.867860] [28074] hci_sched_acl: hci0 [69713.867861] [28074] hci_sched_sco: hci0 [69713.867862] [28074] hci_sched_esco: hci0 [69713.867863] [28074] hci_sched_le: hci0 [69713.867865] [28074] hci_chan_sent: hci0 [69713.867888] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000 [69714.145661] [8909] hci_rx_work: hci0 [69714.145666] [8909] hci_send_to_monitor: hdev ffff88021f0c7000 len 10 [69714.145676] [8909] hci_rx_work: hci0 ACL data packet [69714.145679] [8909] hci_acldata_packet: hci0 len 6 handle 0x002d flags 0x0002 [69714.145681] [8909] hci_conn_enter_active_mode: hcon ffff88021f65a000 mode 0 [69714.145683] [8909] l2cap_recv_acldata: conn ffff880221005c00 len 6 flags 0x2 [69714.145693] [8909] l2cap_recv_frame: len 2, cid 0x0006 [69714.145696] [8909] hci_send_to_control: len 14 [69714.145710] [8909] smp_chan_destroy: [69714.145713] [8909] pairing_complete: status 3 [69714.145714] [8909] cmd_complete: sock ffff88010323ac00 [69714.145717] [8909] hci_conn_drop: hcon ffff88021f65a000 orig refcnt 3 [69714.145719] 
[5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000 [69714.145720] [6450] bt_sock_poll: sock ffff88005e758500, sk ffff88010323b800 [69714.145722] [6515] hci_sock_recvmsg: sock ffff88005e75a080, sk ffff88010323ac00 [69714.145724] [6450] bt_sock_poll: sock ffff8801db6b4f00, sk ffff880160351c00 ... [69714.145735] [6515] hci_sock_recvmsg: sock ffff88005e75a080, sk ffff88010323ac00 [69714.145737] [8909] hci_conn_drop: hcon ffff88021f65a000 orig refcnt 2 [69714.145739] [8909] l2cap_conn_del: hcon ffff88021f65a000 conn ffff880221005c00, err 13 [69714.145740] [6450] bt_sock_poll: sock ffff8801db6b5400, sk ffff88021e775000 [69714.145743] [6450] bt_sock_poll: sock ffff8801db6b5e00, sk ffff880160356000 [69714.145744] [8909] l2cap_chan_hold: chan ffff88005ede2c00 orig refcnt 3 [69714.145746] [6450] bt_sock_poll: sock ffff8800941ab700, sk ffff88005ede5800 [69714.145748] [8909] l2cap_chan_del: chan ffff88005ede2c00, conn ffff880221005c00, err 13 [69714.145749] [8909] l2cap_chan_put: chan ffff88005ede2c00 orig refcnt 4 [69714.145751] [8909] hci_conn_drop: hcon ffff88021f65a000 orig refcnt 1 [69714.145754] [6450] bt_sock_poll: sock ffff8800941ab700, sk ffff88005ede5800 [69714.145756] [8909] l2cap_chan_put: chan ffff88005ede2c00 orig refcnt 3 [69714.145759] [8909] hci_chan_del: hci0 hcon ffff88021f65a000 chan ffff88020d60b1c0 [69714.145766] [5949] hci_sock_recvmsg: sock ffff8800941a9680, sk ffff88012bf4d000 [69714.145787] [6515] hci_sock_release: sock ffff88005e75a080 sk ffff88010323ac00 [69714.146002] [6450] hci_sock_recvmsg: sock ffff88005e758780, sk ffff88010323d400 [69714.150795] [6450] l2cap_sock_release: sock ffff8800941ab700, sk ffff88005ede5800 [69714.150799] [6450] l2cap_sock_shutdown: sock ffff8800941ab700, sk ffff88005ede5800 [69714.150802] [6450] l2cap_chan_close: chan ffff88005ede2c00 state BT_CLOSED [69714.150805] [6450] l2cap_sock_kill: sk ffff88005ede5800 state BT_CLOSED [69714.150806] [6450] l2cap_chan_put: chan ffff88005ede2c00 orig refcnt 2 
[69714.150808] [6450] l2cap_sock_destruct: sk ffff88005ede5800 [69714.150809] [6450] l2cap_chan_put: chan ffff88005ede2c00 orig refcnt 1 [69714.150811] [6450] l2cap_chan_destroy: chan ffff88005ede2c00 [69714.150970] [6450] bt_sock_poll: sock ffff88005e758500, sk ffff88010323b800 ... [69714.151991] [8909] hci_conn_drop: hcon ffff88021f65a000 orig refcnt 0 [69716.150339] [8909] hci_conn_timeout: hcon ffff88021f65a000 state BT_CONNECTED, refcnt -1 Signed-off-by: Lukasz Rymanowski <lukasz.rymanowski@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-06-20 13:53:54 +02:00
Johan Hedberg	2ed8f65ca2	Bluetooth: Fix rejecting pairing in case of insufficient capabilities If we need an MITM protected connection but the local and remote IO capabilities cannot provide it we should reject the pairing attempt in the appropriate way. This patch adds the missing checks for such a situation to the smp_cmd_pairing_req() and smp_cmd_pairing_rsp() functions. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-06-20 13:53:48 +02:00
Johan Hedberg	581370cc74	Bluetooth: Refactor authentication method lookup into its own function We'll need to do authentication method lookups from more than one place, so refactor the lookup into its own function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-06-20 13:53:42 +02:00
Johan Hedberg	c7262e711a	Bluetooth: Fix overriding higher security level in SMP When we receive a pairing request or an internal request to start pairing we shouldn't blindly overwrite the existing pending_sec_level value as that may actually be higher than the new one. This patch fixes the SMP code to only overwrite the value in case the new one is higher than the old. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-06-20 13:53:38 +02:00
David S. Miller	1b0608fd9b	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-06-18 Please pull this batch of fixes intended for the 3.16 stream! For the Bluetooth bits, Gustavo says: "This is our first batch of fixes for 3.16. Be aware that two patches here are not exactly bugfixes: * 71f28af57066 Bluetooth: Add clarifying comment for conn->auth_type This commit just adds some important security comments to the code; we found it important enough to include it here for 3.16 since it is security related. * 9f7ec8871132 Bluetooth: Refactor discovery stopping into its own function This commit is just a refactor in preparation for a fix in the next commit (`f8680f128b`). All the other patches are fixes for deadlocks and for the Bluetooth protocols, most of them related to authentication and encryption." On top of that... Chin-Ran Lo fixes a problem with overlapping DMA areas in mwifiex. Michael Braun corrects a couple of issues in order to enable a new device in rt2800usb. Rafał Miłecki reverts a b43 patch that caused a regression, fixes a Kconfig typo, and corrects a frequency reporting error with the G-PHY. Stanislaw Gruszka fixes an rfkill regression for rt2500pci, and avoids a rt2x00 scheduling while atomic BUG. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-19 21:32:27 -07:00
Daniel Borkmann	24599e61b7	net: sctp: check proc_dointvec result in proc_sctp_do_auth When writing to the sysctl field net.sctp.auth_enable, it can well be that the user buffer we handed over to proc_dointvec() via proc_sctp_do_auth() handler contains something other than integers. In that case, we would set an uninitialized 4-byte value from the stack to net->sctp.auth_enable that can be leaked back when reading the sysctl variable, and it can unintentionally turn auth_enable on/off based on the stack content since auth_enable is interpreted as a boolean. Fix it up by making sure proc_dointvec() returned successfully. Fixes: `b14878ccb7` ("net: sctp: cache auth_enable per endpoint") Reported-by: Florian Westphal <fwestpha@redhat.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-19 21:30:19 -07:00
Neal Cardwell	2cd0d743b0	tcp: fix tcp_match_skb_to_sack() for unaligned SACK at end of an skb If there is an MSS change (or misbehaving receiver) that causes a SACK to arrive that covers the end of an skb but is less than one MSS, then tcp_match_skb_to_sack() was rounding up pkt_len to the full length of the skb ("Round if necessary..."), then chopping all bytes off the skb and creating a zero-byte skb in the write queue. This was visible now because the recently simplified TLP logic in `bef1909ee3` ("tcp: fixing TLP's FIN recovery") could find that 0-byte skb at the end of the write queue, and now that we do not check that skb's length we could send it as a TLP probe. Consider the following example scenario: mss: 1000 skb: seq: 0 end_seq: 4000 len: 4000 SACK: start_seq: 3999 end_seq: 4000 The tcp_match_skb_to_sack() code will compute: in_sack = false pkt_len = start_seq - TCP_SKB_CB(skb)->seq = 3999 - 0 = 3999 new_len = (pkt_len / mss) * mss = (3999/1000)*1000 = 3000 new_len += mss = 4000 Previously we would find the new_len > skb->len check failing, so we would fall through and set pkt_len = new_len = 4000 and chop off pkt_len of 4000 from the 4000-byte skb, leaving a 0-byte segment afterward in the write queue. With this new commit, we notice that the new new_len >= skb->len check succeeds, so that we return without trying to fragment. Fixes: `adb92db857` ("tcp: Make SACK code to split only at mss boundaries") Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Ilpo Jarvinen <ilpo.jarvinen@helsinki.fi> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-19 20:50:49 -07:00
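The arithmetic in the tcp_match_skb_to_sack() entry above can be replayed with a small standalone model. This is only a sketch under stated assumptions, not the kernel function: `split_len()` and its `fixed` flag are hypothetical names, `in_sack` is assumed to be false, and only the rounding and the length check described in the commit message are modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the rounding described in the commit message.
 * Returns how many bytes would be split off the head of the skb,
 * or 0 when the function gives up without fragmenting.
 * fixed == false models the old "new_len > skb_len" check,
 * fixed == true models the new "new_len >= skb_len" check.
 */
uint32_t split_len(uint32_t skb_len, uint32_t pkt_len, uint32_t mss, bool fixed)
{
	assert(mss > 0);

	if (pkt_len > mss) {
		/* Round down to an MSS boundary first. */
		uint32_t new_len = (pkt_len / mss) * mss;

		if (new_len < pkt_len) {
			/* "Round if necessary...": go up to the next boundary. */
			new_len += mss;
			if (fixed ? new_len >= skb_len : new_len > skb_len)
				return 0;
		}
		pkt_len = new_len;
	}
	return pkt_len;
}
```

With the values from the message (skb length 4000, pkt_len 3999, mss 1000), the old check lets the model return 4000, so the whole skb is chopped off and a 0-byte segment remains; with the new check it returns 0 and nothing is fragmented.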
David S. Miller	8e4946ccdc	Revert "net: return actual error on register_queue_kobjects" This reverts commit `d36a4f4b47`. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-19 18:12:15 -07:00
Kees Cook	6f9a093b66	net: filter: fix upper BPF instruction limit The original checks (via sk_chk_filter) for instruction count use ">", not ">=", so changing this in sk_convert_filter has the potential to break existing seccomp filters that used exactly BPF_MAXINSNS many instructions. Fixes: `bd4cf0ed33` ("net: filter: rework/optimize internal BPF interpreter's instruction set") Signed-off-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org # v3.15+ Acked-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-18 17:04:15 -07:00
Daniel Borkmann	ff5e92c1af	net: sctp: propagate sysctl errors from proc_do* properly sysctl handler proc_sctp_do_hmac_alg(), proc_sctp_do_rto_min() and proc_sctp_do_rto_max() do not properly reflect some error cases when writing values via sysctl from internal proc functions such as proc_dointvec() and proc_dostring(). In all these cases we pass the test for write != 0 and partially do additional work just to notice that additional sanity checks fail and we return with hard-coded -EINVAL while proc_do* functions might also return different errors. So fix this up by simply testing a successful return of proc_do* right after calling it. This also allows its return value to be propagated onwards to the user. While touching this, also fix up some minor style issues. Fixes: `4f3fdf3bc5` ("sctp: add check rto_min and rto_max in sysctl") Fixes: `3c68198e75` ("sctp: Make hmac algorithm selection for cookie generation dynamic") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-18 17:03:07 -07:00
Jie Liu	d36a4f4b47	net: return actual error on register_queue_kobjects Return the actual error code if the call to kset_create_and_add() fails. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-18 16:58:40 -07:00
David S. Miller	3a3ec1b2ba	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== netfilter fixes for net The following patchset contains netfilter updates for your net tree, they are: 1) Fix refcount leak when dumping the dying/unconfirmed conntrack lists, from Florian Westphal. 2) Fix crash in NAT when removing a network namespace, also from Florian. 3) Fix a crash in IPVS when trying to remove an estimator out of the sysctl scope, from Julian Anastasov. 4) Add zone attribute to the routing to calculate the message size in ctnetlink events, from Ken-ichirou MATSUZAWA. 5) Another fix for the dying/unconfirmed list which prevented dumping more than one memory page of entries (~17 entries in x86_64). 6) Fix missing RCU-safe list insertion in the rule replacement code in nf_tables. 7) Since the new transaction infrastructure is in place, we have to upgrade the chain use counter from u16 to u32 to avoid overflow after more than 2^16 rules are added. 8) Fix refcount leak when replacing rule in nf_tables. This problem was also introduced in new transaction. 9) Call the ->destroy() callback when releasing nft-xt rules to fix module refcount leaks. 10) Set the family in the netlink messages that contain set elements in nf_tables to make it consistent with other object types. 11) Don't dump NAT port information if it is unset in nft_nat. 12) Update the MAINTAINERS file, I have merged the ebtables entry into netfilter. While at it, also removed the netfilter users mailing list, the development list should be enough. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-18 16:08:40 -07:00
John W. Linville	2ee3f63d39	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2014-06-18 14:39:25 -04:00
Octavian Purdila	e0f802fbca	tcp: move ir_mark initialization to tcp_openreq_init ir_mark initialization is done for both TCP v4 and v6, so move it into the common tcp_openreq_init function. Signed-off-by: Octavian Purdila <octavian.purdila@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-17 15:30:54 -07:00
Peter Pan(潘卫平)	d215d10f2d	net: delete duplicate dev_set_rx_mode() call In __dev_open(), dev_set_rx_mode() is already called, and dev_set_rx_mode() has no effect for a net device which does not have the IFF_UP flag set. So the call to dev_set_rx_mode() in __dev_change_flags() is a duplicate. Signed-off-by: Weiping Pan <panweiping3@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-17 15:30:54 -07:00
John W. Linville	7f4dbaa3ae	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth	2014-06-17 14:08:47 -04:00
Dave Jones	17846376f2	tcp: remove unnecessary tcp_sk assignment. This variable is overwritten by the child socket assignment before it ever gets used. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-16 21:35:00 -07:00
Florian Westphal	945b2b2d25	netfilter: nf_nat: fix oops on netns removal Quoting Samu Kallio: Basically what's happening is, during netns cleanup, nf_nat_net_exit gets called before ipv4_net_exit. As I understand it, nf_nat_net_exit is supposed to kill any conntrack entries which have NAT context (through nf_ct_iterate_cleanup), but for some reason this doesn't happen (perhaps something else is still holding refs to those entries?). When ipv4_net_exit is called, conntrack entries (including those with NAT context) are cleaned up, but the nat_bysource hashtable is long gone - freed in nf_nat_net_exit. The bug happens when attempting to free a conntrack entry whose NAT hash 'prev' field points to a slot in the freed hash table (head for that bin). We ignore conntracks with null nat bindings. But this is wrong, as these are in bysource hash table as well. Restore nat-cleaning for the netns-is-being-removed case. bug: https://bugzilla.kernel.org/show_bug.cgi?id=65191 Fixes: `c2d421e171` ('netfilter: nf_nat: fix race when unloading protocol modules') Reported-by: Samu Kallio <samu.kallio@aberdeencloud.com> Debugged-by: Samu Kallio <samu.kallio@aberdeencloud.com> Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Samu Kallio <samu.kallio@aberdeencloud.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-16 13:58:54 +02:00
Ken-ichirou MATSUZAWA	4a001068d7	netfilter: ctnetlink: add zone size to length Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-16 13:53:03 +02:00
Pablo Neira Ayuso	98ca74f4d5	Merge branch 'ipvs' Simon Horman says: ==================== Fix for panic due to use of tot_stats estimator outside of CONFIG_SYSCTL. It has been present since v3.6.39. ==================== Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-16 13:22:33 +02:00
Pablo Neira Ayuso	915136065b	netfilter: nft_nat: don't dump port information if unset Don't include port information attributes if they are unset. Reported-by: Ana Rey <anarey@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-16 13:08:14 +02:00
Pablo Neira Ayuso	6403d96254	netfilter: nf_tables: indicate family when dumping set elements Set the nfnetlink header that indicates the family of this element. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-16 13:08:09 +02:00
Pablo Neira Ayuso	3d9b142131	netfilter: nft_compat: call {target, match}->destroy() to cleanup entry Otherwise, the references to external objects (e.g. modules) are not released when the rules are removed. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-16 13:08:04 +02:00
Pablo Neira Ayuso	ac904ac835	netfilter: nf_tables: fix wrong type in transaction when replacing rules In `b380e5c` ("netfilter: nf_tables: add message type to transactions"), I used the wrong message type in the rule replacement case. The rule that is replaced needs to be handled as a deleted rule. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-16 13:07:58 +02:00
Pablo Neira Ayuso	ac34b86197	netfilter: nf_tables: decrement chain use counter when replacing rules Thus, the chain use counter remains with the same value after the rule replacement. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-16 13:07:50 +02:00
Pablo Neira Ayuso	a0a7379e16	netfilter: nf_tables: use u32 for chain use counter Since `4fefee5` ("netfilter: nf_tables: allow to delete several objects from a batch"), every new rule bumps the chain use counter. However, this is limited to 16 bits, which means that it will overrun after 2^16 rules. Use a u32 chain counter and check for overflows (just like we do for table objects). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-16 13:07:44 +02:00
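The overrun described in the chain use counter entry above can be illustrated in isolation. This is a minimal sketch, not the nf_tables code: `use16_after()` and `chain_use_inc()` are hypothetical helpers, and returning `-EOVERFLOW` on saturation is an assumption about how such an overflow check might look.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Value a 16-bit use counter would hold after n increments from zero. */
uint16_t use16_after(uint32_t n)
{
	/* Conversion to uint16_t is modulo 2^16, so 65536 rules read as 0. */
	return (uint16_t)n;
}

/* 32-bit counter with an explicit overflow check (hypothetical helper). */
int chain_use_inc(uint32_t *use)
{
	assert(use != NULL);

	if (*use == UINT32_MAX)
		return -EOVERFLOW;	/* refuse instead of wrapping to 0 */
	(*use)++;
	return 0;
}
```

After exactly 2^16 increments the 16-bit counter reads 0 again, which is the overrun the message refers to; the 32-bit variant keeps counting and reports an error at its own limit rather than wrapping.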
Pablo Neira Ayuso	5bc5c30765	netfilter: nf_tables: use RCU-safe list insertion when replacing rules The patch `5e94846` ("netfilter: nf_tables: add insert operation") did not include RCU-safe list insertion when replacing rules. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-16 13:07:29 +02:00
Florian Westphal	cd5f336f17	netfilter: ctnetlink: fix refcnt leak in dying/unconfirmed list dumper 'last' keeps track of the ct that had its refcnt bumped during previous dump cycle. Thus it must not be overwritten until end-of-function. Another (unrelated, theoretical) issue: Don't attempt to bump refcnt of a conntrack whose reference count is already 0. Such conntrack is being destroyed right now, its memory is freed once we release the percpu dying spinlock. Fixes: `b7779d06` ('netfilter: conntrack: spinlock per cpu to protect special lists.') Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-16 12:51:36 +02:00
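The second point in the entry above (never bump the refcount of an object whose count is already 0) is a general pattern. The following is a userspace sketch with C11 atomics, not the conntrack code: `ref_get_unless_zero()` is a hypothetical name, and a compare-and-swap loop is assumed as the mechanism.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Take a reference only if the object is still alive.
 * Returns false when the count is already 0, i.e. the object
 * is being destroyed and must not be resurrected.
 */
bool ref_get_unless_zero(atomic_int *ref)
{
	assert(ref != NULL);

	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}
```

A dumper built on this pattern would simply skip any entry for which the helper returns false, instead of touching memory that is about to be freed.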
Pablo Neira Ayuso	266155b2de	netfilter: ctnetlink: fix dumping of dying/unconfirmed conntracks The dumping stops prematurely: it seems the callback argument that indicates that all entries have been dumped is set after iterating on the first cpu list. The dumping may also stop before the entire per-cpu list content is dumped. With this patch, conntrack -L dying now shows the dying list content again. Fixes: `b7779d06` ("netfilter: conntrack: spinlock per cpu to protect special lists.") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-16 12:51:35 +02:00
Linus Torvalds	a9be22425e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix checksumming regressions, from Tom Herbert. 2) Undo unintentional permissions changes for SCTP rto_alpha and rto_beta sysfs knobs, from Daniel Borkmann. 3) VXLAN, like other IP tunnels, should advertise its encapsulation size using dev->needed_headroom instead of dev->hard_header_len. From Cong Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: net: sctp: fix permissions for rto_alpha and rto_beta knobs vxlan: Checksum fixes net: add skb_pop_rcv_encapsulation udp: call __skb_checksum_complete when doing full checksum net: Fix save software checksum complete net: Fix GSO constants to match NETIF flags udp: ipv4: do not waste time in __udp4_lib_mcast_demux_lookup vxlan: use dev->needed_headroom instead of dev->hard_header_len MAINTAINERS: update cxgb4 maintainer	2014-06-15 16:37:03 -10:00
Daniel Borkmann	b58537a1f5	net: sctp: fix permissions for rto_alpha and rto_beta knobs Commit `3fd091e73b` ("[SCTP]: Remove multiple levels of msecs to jiffies conversions.") has silently changed permissions for rto_alpha and rto_beta knobs from 0644 to 0444. The purpose of this was to discourage users from tweaking rto_alpha and rto_beta knobs in production environments since they are key to correctly compute rtt/srtt. RFC4960 under section 6.3.1. RTO Calculation says regarding rto_alpha and rto_beta under rule C3 and C4: [...] C3) When a new RTT measurement R' is made, set RTTVAR <- (1 - RTO.Beta) * RTTVAR + RTO.Beta * |SRTT - R'| and SRTT <- (1 - RTO.Alpha) * SRTT + RTO.Alpha * R' Note: The value of SRTT used in the update to RTTVAR is its value before updating SRTT itself using the second assignment. After the computation, update RTO <- SRTT + 4 * RTTVAR. C4) When data is in flight and when allowed by rule C5 below, a new RTT measurement MUST be made each round trip. Furthermore, new RTT measurements SHOULD be made no more than once per round trip for a given destination transport address. There are two reasons for this recommendation: First, it appears that measuring more frequently often does not in practice yield any significant benefit [ALLMAN99]; second, if measurements are made more often, then the values of RTO.Alpha and RTO.Beta in rule C3 above should be adjusted so that SRTT and RTTVAR still adjust to changes at roughly the same rate (in terms of how many round trips it takes them to reflect new values) as they would if making only one measurement per round-trip and using RTO.Alpha and RTO.Beta as given in rule C3. However, the exact nature of these adjustments remains a research issue. [...] While it is discouraged to adjust rto_alpha and rto_beta and not further specified how to adjust them, the RFC also doesn't explicitly forbid it, but rather gives a RECOMMENDED default value (rto_alpha=3, rto_beta=2). 
We have a couple of users relying on the old permissions before they got changed. That said, if someone really has the urge to adjust them, we could allow it with a warning in the log. Fixes: `3fd091e73b` ("[SCTP]: Remove multiple levels of msecs to jiffies conversions.") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-15 01:17:32 -07:00
Tom Herbert	46fb51eb96	net: Fix save software checksum complete Geert reported issues regarding checksum complete and UDP. The logic introduced in commit `7e3cead517` ("net: Save software checksum complete") is not correct. This patch: 1) Restores code in __skb_checksum_complete_header except for setting CHECKSUM_UNNECESSARY. This function may be calculating checksum on something less than skb->len. 2) Adds saving checksum to __skb_checksum_complete. The full packet checksum 0..skb->len is calculated without adding in pseudo header. This value is saved in skb->csum and then the pseudo header is added to that to derive the checksum for validation. 3) In both __skb_checksum_complete_header and __skb_checksum_complete, set skb->csum_valid to whether checksum of zero was computed. This allows skb_csum_unnecessary to return true without changing to CHECKSUM_UNNECESSARY which was done previously. 4) Copy new csum related bits in __copy_skb_header. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-15 01:00:49 -07:00
Eric Dumazet	63c6f81cdd	udp: ipv4: do not waste time in __udp4_lib_mcast_demux_lookup It's too easy to add thousands of UDP sockets to a particular bucket, and slow down an innocent multicast receiver. Early demux is supposed to be an optimization, so we should avoid spending too much time in it. It is interesting to note that __udp4_lib_demux_lookup() only tries to match the first socket in the chain. 10 is the threshold we already have in __udp4_lib_lookup() to switch to the secondary hash. Fixes: `421b3885bf` ("udp: ipv4: Add udp early demux") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: David Held <drheld@google.com> Cc: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-13 15:39:24 -07:00
Marcin Kraglak	92d1372e1a	Bluetooth: Allow change security level on ATT_CID in slave role Kernel supports SMP Security Request so don't block increasing security when we are slave. Signed-off-by: Marcin Kraglak <marcin.kraglak@tieto.com> Acked-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-06-13 14:36:39 +02:00
Johan Hedberg	c73f94b8c0	Bluetooth: Fix locking of hdev when calling into SMP code The SMP code expects hdev to be unlocked since e.g. crypto functions will try to (re)lock it. Therefore, we need to release the lock before calling into smp.c from mgmt.c. Without this we risk a deadlock whenever the smp_user_confirm_reply() function is called. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Tested-by: Lukasz Rymanowski <lukasz.rymanowski@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-06-13 13:32:29 +02:00
Jukka Taimisto	7ab56c3a6e	Bluetooth: Fix deadlock in l2cap_conn_del() A deadlock occurs when a PDU containing an invalid SMP opcode is received on the Security Manager Channel over an LE link and the conn->pending_rx_work worker has not run yet. When an LE link is created, l2cap_conn_ready() is called, and before returning it schedules the conn->pending_rx_work worker on hdev->workqueue. Incoming data to the SMP fixed channel is handled by l2cap_recv_frame(), which calls smp_sig_channel() to handle the SMP PDU. If smp_sig_channel() indicates failure, l2cap_conn_del() is called to delete the connection. When deleting the connection, l2cap_conn_del() purges the pending_rx queue and calls flush_work() to wait for the pending_rx_work worker to complete. Since incoming data is handled by a worker running from the same workqueue that pending_rx_work is scheduled on, we will deadlock waiting for pending_rx_work to complete. This patch fixes the deadlock by calling cancel_work_sync() instead of flush_work(). Signed-off-by: Jukka Taimisto <jtt@codenomicon.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-06-13 13:32:26 +02:00
Johan Hedberg	f8680f128b	Bluetooth: Reuse hci_stop_discovery function when cleaning up HCI state When cleaning up the HCI state as part of the power-off procedure we can reuse the hci_stop_discovery() function instead of explicitly sending HCI command related to discovery. The added benefit of this is that it takes care of canceling name resolving and inquiry which were not previously covered by the code. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-06-13 13:32:23 +02:00
Johan Hedberg	21a60d307d	Bluetooth: Refactor discovery stopping into its own function We'll need to reuse the same logic for stopping discovery also when cleaning up HCI state when powering off. This patch refactors the code out to its own function that can later (in a subsequent patch) be used also for the power off case. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-06-13 13:32:20 +02:00
Johan Hedberg	50143a433b	Bluetooth: Fix indicating discovery state when canceling inquiry When inquiry is canceled through the HCI_Inquiry_Cancel command there is no Inquiry Complete event generated. Instead, all we get is the command complete for the HCI_Inquiry_Cancel command. This means that we must call the hci_discovery_set_state() function from the respective command complete handler in order to ensure that user space knows the correct discovery state. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-06-13 13:32:16 +02:00
Johan Hedberg	fff3490f47	Bluetooth: Fix setting correct authentication information for SMP STK When we store the STK in slave role we should set the correct authentication information for it. If the pairing is producing a HIGH security level the STK is considered authenticated, and otherwise it's considered unauthenticated. This patch fixes the value passed to the hci_add_ltk() function when adding the STK on the slave side. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Tested-by: Marcin Kraglak <marcin.kraglak@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-06-13 13:30:48 +02:00
Johan Hedberg	4ad51a75c7	Bluetooth: Add clarifying comment for conn->auth_type When responding to an IO capability request when we're the initiators of the pairing we will not yet have the remote IO capability information. Since the conn->auth_type variable is treated as an "absolute" requirement instead of a hint of what's needed later in the user confirmation request handler it's important that it doesn't have the MITM bit set if there's any chance that the remote device doesn't have the necessary IO capabilities. This patch adds a clarifying comment so that conn->auth_type is left untouched in this scenario. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-06-13 13:30:45 +02:00
Johan Hedberg	ba15a58b17	Bluetooth: Fix SSP acceptor just-works confirmation without MITM From the Bluetooth Core Specification 4.1 page 1958: "if both devices have set the Authentication_Requirements parameter to one of the MITM Protection Not Required options, authentication stage 1 shall function as if both devices set their IO capabilities to DisplayOnly (e.g., Numeric comparison with automatic confirmation on both devices)" So far our implementation has done user confirmation for all just-works cases regardless of the MITM requirements, however following the specification to the word means that we should not be doing confirmation when neither side has the MITM flag set. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Tested-by: Szymon Janc <szymon.janc@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-06-13 13:30:42 +02:00
Johan Hedberg	e694788d73	Bluetooth: Fix check for connection encryption The conn->link_key variable tracks the type of link key in use. It is set whenever we respond to a link key request as well as when we get a link key notification event. These two events do not however always guarantee that encryption is enabled: getting a link key request and responding to it may only mean that the remote side has requested authentication but not encryption. On the other hand, the encrypt change event is a certain guarantee that encryption is enabled. The real encryption state is already tracked in the conn->link_mode variable through the HCI_LM_ENCRYPT bit. This patch fixes a check for encryption in the hci_conn_auth function to use the proper conn->link_mode value and thereby eliminates the chance of a false positive result. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-06-13 13:30:39 +02:00
Johan Hedberg	b62b65055b	Bluetooth: Fix incorrectly overriding conn->src_type The src_type member of struct hci_conn should always reflect the address type of the src member. It should never be overridden. There is already code in place in the command status handler of HCI_LE_Create_Connection to copy the right initiator address into conn->init_addr_type. Without this patch, if privacy is enabled, we will send the wrong address type in the SMP identity address information PDU (it'll e.g. contain our public address but a random address type). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-06-13 13:30:37 +02:00
Julian Anastasov	9802d21e7a	ipvs: stop tot_stats estimator only under CONFIG_SYSCTL The tot_stats estimator is started only when CONFIG_SYSCTL is defined. But it is stopped without checking CONFIG_SYSCTL. Fix the crash by moving ip_vs_stop_estimator into ip_vs_control_net_cleanup_sysctl. The change is needed after commit `14e405461e` ("IPVS: Add __ip_vs_control_{init,cleanup}_sysctl()") from 2.6.39. Reported-by: Jet Chen <jet.chen@intel.com> Tested-by: Jet Chen <jet.chen@intel.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-06-13 16:22:25 +09:00
Linus Torvalds	6d87c225f5	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph updates from Sage Weil: "This has a mix of bug fixes and cleanups. Alex's patch fixes a rare race in RBD. Ilya's patches fix an ENOENT check when a second rbd image is mapped and a couple memory leaks. Zheng fixes several issues with fragmented directories and multiple MDSs. Josh fixes a spin/sleep issue, and Josh and Guangliang's patches fix setting and unsetting RBD images read-only. Naturally there are several other cleanups mixed in for good measure" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (23 commits) rbd: only set disk to read-only once rbd: move calls that may sleep out of spin lock range rbd: add ioctl for rbd ceph: use truncate_pagecache() instead of truncate_inode_pages() ceph: include time stamp in every MDS request rbd: fix ida/idr memory leak rbd: use reference counts for image requests rbd: fix osd_request memory leak in __rbd_dev_header_watch_sync() rbd: make sure we have latest osdmap on 'rbd map' libceph: add ceph_monc_wait_osdmap() libceph: mon_get_version request infrastructure libceph: recognize poolop requests in debugfs ceph: refactor readpage_nounlock() to make the logic clearer mds: check cap ID when handling cap export message ceph: remember subtree root dirfrag's auth MDS ceph: introduce ceph_fill_fragtree() ceph: handle cap import atomically ceph: pre-allocate ceph_cap struct for ceph_add_cap() ceph: update inode fields according to issued caps rbd: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO ...	2014-06-12 23:06:23 -07:00
Linus Torvalds	f9da455b93	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov. 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J Benniston. 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn Mork. 4) BPF now has a "random" opcode, from Chema Gonzalez. 5) Add more BPF documentation and improve test framework, from Daniel Borkmann. 6) Support TCP fastopen over ipv6, from Daniel Lee. 7) Add software TSO helper functions and use them to support software TSO in mvneta and mv643xx_eth drivers. From Ezequiel Garcia. 8) Support software TSO in fec driver too, from Nimrod Andy. 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli. 10) Handle broadcasts more gracefully over macvlan when there are large numbers of interfaces configured, from Herbert Xu. 11) Allow more control over fwmark used for non-socket based responses, from Lorenzo Colitti. 12) Do TCP congestion window limiting based upon measurements, from Neal Cardwell. 13) Support busy polling in SCTP, from Neil Horman. 14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru. 15) Bridge promisc mode handling improvements from Vlad Yasevich. 16) Don't use inetpeer entries to implement ID generation any more, it performs poorly, from Eric Dumazet. 
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits) rtnetlink: fix userspace API breakage for iproute2 < v3.9.0 tcp: fixing TLP's FIN recovery net: fec: Add software TSO support net: fec: Add Scatter/gather support net: fec: Increase buffer descriptor entry number net: fec: Factorize feature setting net: fec: Enable IP header hardware checksum net: fec: Factorize the .xmit transmit function bridge: fix compile error when compiling without IPv6 support bridge: fix smatch warning / potential null pointer dereference via-rhine: fix full-duplex with autoneg disable bnx2x: Enlarge the dorq threshold for VFs bnx2x: Check for UNDI in uncommon branch bnx2x: Fix 1G-baseT link bnx2x: Fix link for KR with swapped polarity lane sctp: Fix sk_ack_backlog wrap-around problem net/core: Add VF link state control policy net/fsl: xgmac_mdio is dependent on OF_MDIO net/fsl: Make xgmac_mdio read error message useful net_sched: drr: warn when qdisc is not work conserving ...	2014-06-12 14:27:40 -07:00
Michal Schmidt	e5eca6d41f	rtnetlink: fix userspace API breakage for iproute2 < v3.9.0 When running RHEL6 userspace on a current upstream kernel, "ip link" fails to show VF information. The reason is a kernel<->userspace API change introduced by commit `88c5b5ce5c` ("rtnetlink: Call nlmsg_parse() with correct header length"), after which the kernel does not see iproute2's IFLA_EXT_MASK attribute in the netlink request. iproute2 adjusted for the API change in its commit 63338dca4513 ("libnetlink: Use ifinfomsg instead of rtgenmsg in rtnl_wilddump_req_filter"). The problem has been noticed before: http://marc.info/?l=linux-netdev&m=136692296022182&w=2 (Subject: Re: getting VF link info seems to be broken in 3.9-rc8) We can do better than tell those with old userspace to upgrade. We can recognize the old iproute2 in the kernel by checking the netlink message length. Even when including the IFLA_EXT_MASK attribute, its netlink message is shorter than struct ifinfomsg. With this patch "ip link" shows VF information in both old and new iproute2 versions. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-12 11:07:42 -07:00
Per Hurtig	bef1909ee3	tcp: fixing TLP's FIN recovery Fix to a problem observed when losing a FIN segment that does not contain data. In such situations, TLP is unable to recover from any tail loss and instead adds at least PTO ms to the retransmission process, i.e., RTO = RTO + PTO. Signed-off-by: Per Hurtig <per.hurtig@kau.se> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-12 11:05:51 -07:00
Linus Lüssing	3993c4e159	bridge: fix compile error when compiling without IPv6 support Some fields in "struct net_bridge" aren't available when compiling the kernel without IPv6 support. Therefore, add a check/macro to skip the code sections the compiler complains about in that case. Introduced by `2cd4143192` ("bridge: memorize and export selected IGMP/MLD querier port") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-12 11:00:24 -07:00
Linus Lüssing	6c03ee8bda	bridge: fix smatch warning / potential null pointer dereference "New smatch warnings: net/bridge/br_multicast.c:1368 br_ip6_multicast_query() error: we previously assumed 'group' could be null (see line 1349)" In the rare (sort of broken) case of a query having a Maximum Response Delay of zero, we could create a potential null pointer dereference. Fix this by skipping the multicast-specific MLD Query parsing again if no multicast group address is available. Introduced by `dc4eb53a99` ("bridge: adhere to querier election mechanism specified by RFCs") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-12 11:00:24 -07:00
Linus Torvalds	16b9057804	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "This is the bunch that sat in -next + lock_parent() fix. This is the minimal set; there's more pending stuff. In particular, I really hope to get acct.c fixes merged this cycle - we need that to deal sanely with delayed-mntput stuff. In the next pile, hopefully - that series is fairly short and localized (kernel/acct.c, fs/super.c and fs/namespace.c). In this pile: more iov_iter work. Most of prereqs for ->splice_write with sane locking order are there and Kent's dio rewrite would also fit nicely on top of this pile" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (70 commits) lock_parent: don't step on stale ->d_parent of all-but-freed one kill generic_file_splice_write() ceph: switch to iter_file_splice_write() shmem: switch to iter_file_splice_write() nfs: switch to iter_splice_write_file() fs/splice.c: remove unneeded exports ocfs2: switch to iter_file_splice_write() ->splice_write() via ->write_iter() bio_vec-backed iov_iter optimize copy_page_{to,from}_iter() bury generic_file_aio_{read,write} lustre: get rid of messing with iovecs ceph: switch to ->write_iter() ceph_sync_direct_write: stop poking into iov_iter guts ceph_sync_read: stop poking into iov_iter guts new helper: copy_page_from_iter() fuse: switch to ->write_iter() btrfs: switch to ->write_iter() ocfs2: switch to ->write_iter() xfs: switch to ->write_iter() ...	2014-06-12 10:30:18 -07:00
Xufeng Zhang	d3217b15a1	sctp: Fix sk_ack_backlog wrap-around problem Consider the scenario: for a TCP-style socket, while processing the COOKIE_ECHO chunk in sctp_sf_do_5_1D_ce(), after it has passed a series of sanity checks, a new association is created in sctp_unpack_cookie(). If some later processing fails, sctp_association_free() is called to free the previously allocated association, and in sctp_association_free() the sk_ack_backlog value is decremented for this socket. Since the initial value of sk_ack_backlog is 0, the decrement wraps it around to 65535. If we then try to establish new associations on the same socket, an ABORT is triggered because sctp deems the accept queue full. Fix this issue by only decrementing sk_ack_backlog for associations in the endpoint's list. Fix-suggested-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-12 10:27:14 -07:00
Al Viro	9c1d5284c7	Merge commit '9f12600fe425bc28f0ccba034a77783c09c15af4' into for-linus Backmerge of dcache.c changes from mainline. It's that, or complete rebase... Conflicts: fs/splice.c Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-06-12 00:28:09 -04:00
David S. Miller	902455e007	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/core/rtnetlink.c net/core/skbuff.c Both conflicts were very simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 16:02:55 -07:00
Doug Ledford	c5b4616087	net/core: Add VF link state control policy Commit `1d8faf48c7` (net/core: Add VF link state control) added VF link state control to the netlink VF nested structure, but failed to add a proper entry for the new structure into the VF policy table. Add the missing entry so the table and the actual data copied into the netlink nested struct are in sync. Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 15:51:37 -07:00
Florian Westphal	6e765a009a	net_sched: drr: warn when qdisc is not work conserving The DRR scheduler requires that items on the active list are work conserving, i.e. do not hold on to skbs for throttling purposes, etc. Attaching e.g. tbf renders DRR useless because all other classes on the active list are delayed as well. So, warn users that this configuration won't work as expected; we already do this in a couple of other qdiscs, see e.g. commit `b00355db3f` ('pkt_sched: sch_hfsc: sch_htb: Add non-work-conserving warning handler') The 'const' change is needed to avoid a compiler warning ("discards 'const' qualifier from pointer target type"). Tested with: drr_hier() { parent=$1 classes=$2 for i in $(seq 1 $classes); do classid=$parent$(printf %x $i) tc class add dev eth0 parent $parent classid $classid drr tc qdisc add dev eth0 parent $classid tbf rate 64kbit burst 256kbit limit 64kbit done } tc qdisc add dev eth0 root handle 1: drr drr_hier 1: 32 tc filter add dev eth0 protocol all pref 1 parent 1: handle 1 flow hash keys dst perturb 1 divisor 32 Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 15:50:59 -07:00
Tom Herbert	6bae1d4cc3	net: Add skb_gro_postpull_rcsum to udp and vxlan Need to call skb_gro_postpull_rcsum for GRO to work with checksum complete. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 15:46:13 -07:00
Tom Herbert	7e3cead517	net: Save software checksum complete In skb_checksum complete, if we need to compute the checksum for the packet (via skb_checksum) save the result as CHECKSUM_COMPLETE. Subsequent checksum verification can use this. Also add a csum_complete_sw flag to distinguish between software and hardware generated checksum complete; we should always be able to trust the software computation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 15:46:13 -07:00
stephen hemminger	f647944995	ceph: remove bogus extern Sparse complained about this bogus extern on definition of a function. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 15:39:19 -07:00
Eric Dumazet	9709674e68	ipv4: fix a race in ip4_datagram_release_cb() Alexey gave an AddressSanitizer[1] report that finally gave a good hint at the origin of various problems already reported by Dormando in the past [2]. The problem comes from the fact that UDP can have a lockless TX path, and concurrent threads can manipulate sk_dst_cache while another thread is holding the socket lock and calls __sk_dst_set() in ip4_datagram_release_cb() (this was added in linux-3.8). It seems that all we need to do is to use sk_dst_check() and sk_dst_set() so that all the writers hold the same spinlock (sk->sk_dst_lock) to prevent corruption. The TCP stack does not need this protection, as all sk_dst_cache writers hold the socket lock. [1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel AddressSanitizer: heap-use-after-free in ipv4_dst_check Read of size 2 by thread T15453: [<ffffffff817daa3a>] ipv4_dst_check+0x1a/0x90 ./net/ipv4/route.c:1116 [<ffffffff8175b789>] __sk_dst_check+0x89/0xe0 ./net/core/sock.c:531 [<ffffffff81830a36>] ip4_datagram_release_cb+0x46/0x390 ??:0 [<ffffffff8175eaea>] release_sock+0x17a/0x230 ./net/core/sock.c:2413 [<ffffffff81830882>] ip4_datagram_connect+0x462/0x5d0 ??:0 [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534 [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701 [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682 [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b ./arch/x86/kernel/entry_64.S:629 Freed by thread T15455: [<ffffffff8178d9b8>] dst_destroy+0xa8/0x160 ./net/core/dst.c:251 [<ffffffff8178de25>] dst_release+0x45/0x80 ./net/core/dst.c:280 [<ffffffff818304c1>] ip4_datagram_connect+0xa1/0x5d0 ??:0 [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534 [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701 [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682 [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b 
./arch/x86/kernel/entry_64.S:629 Allocated by thread T15453: [<ffffffff8178d291>] dst_alloc+0x81/0x2b0 ./net/core/dst.c:171 [<ffffffff817db3b7>] rt_dst_alloc+0x47/0x50 ./net/ipv4/route.c:1406 [< inlined >] __ip_route_output_key+0x3e8/0xf70 __mkroute_output ./net/ipv4/route.c:1939 [<ffffffff817dde08>] __ip_route_output_key+0x3e8/0xf70 ./net/ipv4/route.c:2161 [<ffffffff817deb34>] ip_route_output_flow+0x14/0x30 ./net/ipv4/route.c:2249 [<ffffffff81830737>] ip4_datagram_connect+0x317/0x5d0 ??:0 [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534 [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701 [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682 [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b ./arch/x86/kernel/entry_64.S:629 [2] <4>[196727.311203] general protection fault: 0000 [#1] SMP <4>[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP macvlan bridge coretemp crc32_pclmul ghash_clmulni_intel gpio_ich microcode ipmi_watchdog ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm tpm_bios ipmi_si ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp pps_core mdio <4>[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted 3.10.26 #1 <4>[196727.311344] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013 <4>[196727.311364] task: ffff885e6f069700 ti: ffff885e6f072000 task.ti: ffff885e6f072000 <4>[196727.311377] RIP: 0010:[<ffffffff815f8c7f>] [<ffffffff815f8c7f>] ipv4_dst_destroy+0x4f/0x80 <4>[196727.311399] RSP: 0018:ffff885effd23a70 EFLAGS: 00010282 <4>[196727.311409] RAX: dead000000200200 RBX: ffff8854c398ecc0 RCX: 0000000000000040 <4>[196727.311423] RDX: dead000000100100 RSI: dead000000100100 RDI: dead000000200200 <4>[196727.311437] RBP: ffff885effd23a80 R08: ffffffff815fd9e0 R09: ffff885d5a590800 <4>[196727.311451] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 <4>[196727.311464] R13: ffffffff81c8c280 R14: 0000000000000000 R15: 
ffff880e85ee16ce <4>[196727.311510] FS: 0000000000000000(0000) GS:ffff885effd20000(0000) knlGS:0000000000000000 <4>[196727.311554] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4>[196727.311581] CR2: 00007a46751eb000 CR3: 0000005e65688000 CR4: 00000000000407e0 <4>[196727.311625] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 <4>[196727.311669] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 <4>[196727.311713] Stack: <4>[196727.311733] ffff8854c398ecc0 ffff8854c398ecc0 ffff885effd23ab0 ffffffff815b7f42 <4>[196727.311784] ffff88be6595bc00 ffff8854c398ecc0 0000000000000000 ffff8854c398ecc0 <4>[196727.311834] ffff885effd23ad0 ffffffff815b86c6 ffff885d5a590800 ffff8816827821c0 <4>[196727.311885] Call Trace: <4>[196727.311907] <IRQ> <4>[196727.311912] [<ffffffff815b7f42>] dst_destroy+0x32/0xe0 <4>[196727.311959] [<ffffffff815b86c6>] dst_release+0x56/0x80 <4>[196727.311986] [<ffffffff81620bd5>] tcp_v4_do_rcv+0x2a5/0x4a0 <4>[196727.312013] [<ffffffff81622b5a>] tcp_v4_rcv+0x7da/0x820 <4>[196727.312041] [<ffffffff815fd9e0>] ? ip_rcv_finish+0x360/0x360 <4>[196727.312070] [<ffffffff815de02d>] ? nf_hook_slow+0x7d/0x150 <4>[196727.312097] [<ffffffff815fd9e0>] ? ip_rcv_finish+0x360/0x360 <4>[196727.312125] [<ffffffff815fda92>] ip_local_deliver_finish+0xb2/0x230 <4>[196727.312154] [<ffffffff815fdd9a>] ip_local_deliver+0x4a/0x90 <4>[196727.312183] [<ffffffff815fd799>] ip_rcv_finish+0x119/0x360 <4>[196727.312212] [<ffffffff815fe00b>] ip_rcv+0x22b/0x340 <4>[196727.312242] [<ffffffffa0339680>] ? macvlan_broadcast+0x160/0x160 [macvlan] <4>[196727.312275] [<ffffffff815b0c62>] __netif_receive_skb_core+0x512/0x640 <4>[196727.312308] [<ffffffff811427fb>] ? 
kmem_cache_alloc+0x13b/0x150 <4>[196727.312338] [<ffffffff815b0db1>] __netif_receive_skb+0x21/0x70 <4>[196727.312368] [<ffffffff815b0fa1>] netif_receive_skb+0x31/0xa0 <4>[196727.312397] [<ffffffff815b1ae8>] napi_gro_receive+0xe8/0x140 <4>[196727.312433] [<ffffffffa00274f1>] ixgbe_poll+0x551/0x11f0 [ixgbe] <4>[196727.312463] [<ffffffff815fe00b>] ? ip_rcv+0x22b/0x340 <4>[196727.312491] [<ffffffff815b1691>] net_rx_action+0x111/0x210 <4>[196727.312521] [<ffffffff815b0db1>] ? __netif_receive_skb+0x21/0x70 <4>[196727.312552] [<ffffffff810519d0>] __do_softirq+0xd0/0x270 <4>[196727.312583] [<ffffffff816cef3c>] call_softirq+0x1c/0x30 <4>[196727.312613] [<ffffffff81004205>] do_softirq+0x55/0x90 <4>[196727.312640] [<ffffffff81051c85>] irq_exit+0x55/0x60 <4>[196727.312668] [<ffffffff816cf5c3>] do_IRQ+0x63/0xe0 <4>[196727.312696] [<ffffffff816c5aaa>] common_interrupt+0x6a/0x6a <4>[196727.312722] <EOI> <1>[196727.313071] RIP [<ffffffff815f8c7f>] ipv4_dst_destroy+0x4f/0x80 <4>[196727.313100] RSP <ffff885effd23a70> <4>[196727.313377] ---[ end trace 64b3f14fae0f2e29 ]--- <0>[196727.380908] Kernel panic - not syncing: Fatal exception in interrupt Reported-by: Alexey Preobrazhensky <preobr@google.com> Reported-by: dormando <dormando@rydia.ne> Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: `8141ed9fce` ("ipv4: Add a socket release callback for datagram sockets") Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 15:39:18 -07:00
Octavian Purdila	bad93e9d4e	net: add __pskb_copy_fclone and pskb_copy_for_clone There are several instances where a pskb_copy or __pskb_copy is immediately followed by an skb_clone. Add a couple of new functions to allow the copy skb to be allocated from the fclone cache and thus speed up subsequent skb_clone calls. Cc: Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: Marek Lindner <mareklindner@neomailbox.ch> Cc: Simon Wunderlich <sw@simonwunderlich.de> Cc: Antonio Quartulli <antonio@meshcoding.com> Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Cc: Arvid Brodin <arvid.brodin@alten.se> Cc: Patrick McHardy <kaber@trash.net> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Cc: Lauro Ramos Venancio <lauro.venancio@openbossa.org> Cc: Aloisio Almeida Jr <aloisio.almeida@openbossa.org> Cc: Samuel Ortiz <sameo@linux.intel.com> Cc: Jon Maloy <jon.maloy@ericsson.com> Cc: Allan Stephens <allan.stephens@windriver.com> Cc: Andrew Hendry <andrew.hendry@gmail.com> Cc: Eric Dumazet <edumazet@google.com> Reviewed-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: Octavian Purdila <octavian.purdila@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 15:38:02 -07:00
Toshiaki Makita	204177f3f3	bridge: Support 802.1ad vlan filtering This enables us to change the vlan protocol for vlan filtering. This makes it possible to filter frames on the basis of 802.1ad vlan tags through a bridge. This also changes br->group_addr if it has not been set by the user. This is needed for an 802.1ad bridge. (See IEEE 802.1Q-2011 8.13.5.) Furthermore, this sets br->group_fwd_mask_required so that an 802.1ad bridge can forward the Nearest Customer Bridge group addresses except for br->group_addr, which should be passed to the higher layer. To change the vlan protocol, write the protocol to sysfs: # echo 0x88a8 > /sys/class/net/br0/bridge/vlan_protocol Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 15:22:53 -07:00
Toshiaki Makita	f2808d226f	bridge: Prepare for forwarding another bridge group addresses If a bridge is an 802.1ad bridge, it must forward other bridge group addresses (the Nearest Customer Bridge group addresses). (For details, see IEEE 802.1Q-2011 8.6.3.) As the user might not want group_fwd_mask to be modified by enabling 802.1ad, introduce a new mask, group_fwd_mask_required, which indicates the addresses the bridge wants to forward. This will be set by enabling 802.1ad. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 15:22:53 -07:00
Toshiaki Makita	8580e2117c	bridge: Prepare for 802.1ad vlan filtering support This enables a bridge to have vlan protocol information and allows vlan tag manipulation (retrieve, insert and remove tags) according to the vlan protocol. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 15:22:53 -07:00
Toshiaki Makita	1c5abb6c77	bridge: Add 802.1ad tx vlan acceleration Bridge device doesn't need to embed S-tag into skb->data. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 15:22:53 -07:00
Alexei Starovoitov	61f83d0d57	net: filter: fix warning on 32-bit arch fix compiler warning on 32-bit architectures: net/core/filter.c: In function '__sk_run_filter': net/core/filter.c:540:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] net/core/filter.c:550:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] net/core/filter.c:560:22: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 15:12:27 -07:00
Jon Paul Maloy	02c00c2ab0	tipc: fix potential bug in function tipc_backlog_rcv In commit `4f4482dcd9` ("tipc: compensate for double accounting in socket rcv buffer") we access 'truesize' of a received buffer after it might have been released by the function filter_rcv(). In this commit we correct this by reading the value of 'truesize' to the stack before delivering the buffer to filter_rcv(). Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 15:01:30 -07:00
Daniel Borkmann	9b87d46510	net: sctp: fix incorrect type in gfp initializer This fixes the following sparse warning: net/sctp/associola.c:1556:29: warning: incorrect type in initializer (different base types) net/sctp/associola.c:1556:29: expected bool [unsigned] [usertype] preload net/sctp/associola.c:1556:29: got restricted gfp_t Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 12:23:17 -07:00
Daniel Borkmann	a7288c4dd5	net: sctp: improve sctp_select_active_and_retran_path selection In function sctp_select_active_and_retran_path(), we walk the transport list in order to look for the two most recently used ACTIVE transports (trans_pri, trans_sec). In case we didn't find anything ACTIVE, we currently just camp on a possibly PF or INACTIVE transport that is primary path; this behavior actually dates back to linux-history tree of the very early days of lksctp, and can yield a behavior that chooses suboptimal transport paths. Instead, be a bit more clever by reusing and extending the recently introduced sctp_trans_elect_best() handler. In case both transports are evaluated to have the same score resulting from their states, break the tie by looking at: 1) transport patch error count 2) last_time_heard value from each transport. This is analogous to Nishida's Quick Failover draft [1], section 5.1, 3: The sender SHOULD avoid data transmission to PF destinations. When all destinations are in either PF or Inactive state, the sender MAY either move the destination from PF to active state (and transmit data to the active destination) or the sender MAY transmit data to a PF destination. In the former scenario, (i) the sender MUST NOT notify the ULP about the state transition, and (ii) MUST NOT clear the destination's error counter. It is recommended that the sender picks the PF destination with least error count (fewest consecutive timeouts) for data transmission. In case of a tie (multiple PF destinations with same error count), the sender MAY choose the last active destination. Thus for sctp_select_active_and_retran_path(), we keep track of the best, if any, transport that is in PF state and in case no ACTIVE transport has been found (hence trans_{pri,sec} is NULL), we select the best out of the three: current primary_path and retran_path as well as a possible PF transport. The secondary may still camp on the original primary_path as before. 
The change in sctp_trans_elect_best() with a more fine grained tie selection also improves at the same time path selection for sctp_assoc_update_retran_path() in case of non-ACTIVE states. [1] http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05 Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 12:23:17 -07:00
Daniel Borkmann	e575235fc6	net: sctp: migrate most recently used transport to ktime Be more precise in transport path selection and use ktime helpers instead of jiffies to compare and pick the better primary and secondary recently used transports. This also avoids any side-effects during a possible roll-over, and could lead to better path decision-making. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 12:23:17 -07:00
Daniel Borkmann	b82e8f31ac	net: sctp: refactor active path selection This patch just refactors and moves the code for the active path selection into its own helper function outside of sctp_assoc_control_transport() which is already big enough. No functional changes here. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 12:23:17 -07:00
Daniel Borkmann	67cb9366ff	ktime: add ktime_after and ktime_before helper Add two minimal helper functions analogous to time_before() and time_after() that will later on both be needed by SCTP code. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 12:23:17 -07:00
Phoebe Buckheister	2d3b5b0a90	mac802154: don't deliver packets to devices that are down Only one WPAN device can be active at any given time, so only deliver packets to the one interface that is actually up. Multiple monitors may be up at any given time, but we don't have to deliver to monitors that are down either. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 12:10:19 -07:00
Phoebe Buckheister	a374eeb5e5	mac802154: properly free incoming skbs on decryption failure mac802154 RX did not free skbs on decryption failure, assuming that the caller would when the local rx handler returned _DROP. This was false. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 12:10:18 -07:00
Wei-Chun Chao	5882a07c72	net: fix UDP tunnel GSO of frag_list GRO packets This patch fixes a kernel BUG_ON in skb_segment. It is hit when testing two VMs on openvswitch with one VM acting as VXLAN gateway. During VXLAN packet GSO, skb_segment is called with skb->data pointing to inner TCP payload. skb_segment calls skb_network_protocol to retrieve the inner protocol. skb_network_protocol actually expects skb->data to point to MAC and it calls pskb_may_pull with ETH_HLEN. This ends up pulling in ETH_HLEN data from header tail. As a result, pskb_trim logic is skipped and BUG_ON is hit later. Move skb_push in front of skb_network_protocol so that skb->data lines up properly. kernel BUG at net/core/skbuff.c:2999! Call Trace: [<ffffffff816ac412>] tcp_gso_segment+0x122/0x410 [<ffffffff816bc74c>] inet_gso_segment+0x13c/0x390 [<ffffffff8164b39b>] skb_mac_gso_segment+0x9b/0x170 [<ffffffff816b3658>] skb_udp_tunnel_segment+0xd8/0x390 [<ffffffff816b3c00>] udp4_ufo_fragment+0x120/0x140 [<ffffffff816bc74c>] inet_gso_segment+0x13c/0x390 [<ffffffff8109d742>] ? default_wake_function+0x12/0x20 [<ffffffff8164b39b>] skb_mac_gso_segment+0x9b/0x170 [<ffffffff8164b4d0>] __skb_gso_segment+0x60/0xc0 [<ffffffff8164b6b3>] dev_hard_start_xmit+0x183/0x550 [<ffffffff8166c91e>] sch_direct_xmit+0xfe/0x1d0 [<ffffffff8164bc94>] __dev_queue_xmit+0x214/0x4f0 [<ffffffff8164bf90>] dev_queue_xmit+0x10/0x20 [<ffffffff81687edb>] ip_finish_output+0x66b/0x890 [<ffffffff81688a58>] ip_output+0x58/0x90 [<ffffffff816c628f>] ? fib_table_lookup+0x29f/0x350 [<ffffffff816881c9>] ip_local_out_sk+0x39/0x50 [<ffffffff816cbfad>] iptunnel_xmit+0x10d/0x130 [<ffffffffa0212200>] vxlan_xmit_skb+0x1d0/0x330 [vxlan] [<ffffffffa02a3919>] vxlan_tnl_send+0x129/0x1a0 [openvswitch] [<ffffffffa02a2cd6>] ovs_vport_send+0x26/0xa0 [openvswitch] [<ffffffffa029931e>] do_output+0x2e/0x50 [openvswitch] Signed-off-by: Wei-Chun Chao <weichunc@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 00:48:47 -07:00
huizhang	f6c20c596f	net: ipv6: Fixed up ipsec packet be re-routing issue Bug report on https://bugzilla.kernel.org/show_bug.cgi?id=75781 When a locally generated ipsec packet matches a mangle table rule and has its mark value set, the packet is routed again in route_me_harder -> _session_decoder6. In this case, the nhoff in the CB of the skb still has the default value 0, so the protocol match cannot succeed, the packet cannot match the correct SA rule, and the packet is sent out in plaintext. To fix the issue, CB->nhoff must be set. Signed-off-by: Hui Zhang <huizhang@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 00:47:31 -07:00
Dmitry Popov	5ce54af1fc	ip_tunnel: fix i_key matching in ip_tunnel_find Some tunnels (though only vti as for now) can use i_key just for internal use: for example vti uses it for fwmark'ing incoming packets. So raw i_key value shouldn't be treated as a distinguisher for them. ip_tunnel_key_match exists for cases when we want to compare two ip_tunnel_parms' i_keys. Example bug: ip link add type vti ikey 1 local 1.0.0.1 remote 2.0.0.2 ip link add type vti ikey 2 local 1.0.0.1 remote 2.0.0.2 spawned two tunnels, although it doesn't make sense. Signed-off-by: Dmitry Popov <ixaphire@qrator.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 00:43:37 -07:00
Dmitry Popov	7c8e6b9c28	ip_vti: Fix 'ip tunnel add' with 'key' parameters ip tunnel add remote 10.2.2.1 local 10.2.2.2 mode vti ikey 1 okey 2 translates to p->iflags = VTI_ISVTI|GRE_KEY and p->i_key = 1, but GRE_KEY != TUNNEL_KEY, so ip_tunnel_ioctl would set i_key to 0 (same story with o_key) making us unable to create vti tunnels with [io]key via ip tunnel. We cannot simply translate GRE_KEY to TUNNEL_KEY (as GRE module does) because vti_tunnels with same local/remote addresses but different ikeys will be treated as different then. So, imo the best option here is to move p->i_flags & *_KEY check for vti tunnels from ip_tunnel.c to ip_vti.c and to think about [io]_mark field for ip_tunnel_parm in the future. Signed-off-by: Dmitry Popov <ixaphire@qrator.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 00:30:52 -07:00
Alexei Starovoitov	e430f34ee5	net: filter: cleanup A/X name usage The macro 'A' used in internal BPF interpreter: #define A regs[insn->a_reg] was easily confused with the name of classic BPF register 'A', since 'A' would mean two different things depending on context. This patch is trying to clean up the naming and clarify its usage in the following way: - A and X are names of two classic BPF registers - BPF_REG_A denotes internal BPF register R0 used to map classic register A in internal BPF programs generated from classic - BPF_REG_X denotes internal BPF register R7 used to map classic register X in internal BPF programs generated from classic - internal BPF instruction format: struct sock_filter_int { __u8 code; /* opcode */ __u8 dst_reg:4; /* dest register */ __u8 src_reg:4; /* source register */ __s16 off; /* signed offset */ __s32 imm; /* signed immediate constant */ }; - BPF_X/BPF_K is 1 bit used to encode source operand of instruction In classic: BPF_X - means use register X as source operand BPF_K - means use 32-bit immediate as source operand In internal: BPF_X - means use 'src_reg' register as source operand BPF_K - means use 32-bit immediate as source operand Suggested-by: Chema Gonzalez <chema@google.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Chema Gonzalez <chema@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 00:13:16 -07:00
Manuel Schölling	84a7c0b1db	dns_resolver: assure that dns_query() result is null-terminated dns_query() credulously assumes that keys are null-terminated and returns a copy of a memory block that is off by one. Signed-off-by: Manuel Schölling <manuel.schoelling@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 00:12:04 -07:00
Linus Lüssing	2cd4143192	bridge: memorize and export selected IGMP/MLD querier port Adding bridge support to the batman-adv multicast optimization requires batman-adv knowing about the existence of bridged-in IGMP/MLD queriers to be able to reliably serve any multicast listener behind this same bridge. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-10 23:50:47 -07:00
Linus Lüssing	07f8ac4a1e	bridge: add export of multicast database adjacent to net_dev With this new, exported function br_multicast_list_adjacent(net_dev) a list of IPv4/6 addresses is returned. This list contains all multicast addresses sensed by the bridge multicast snooping feature on all bridge ports of the bridge interface of net_dev, excluding addresses from the specified net_device itself. Adding bridge support to the batman-adv multicast optimization requires batman-adv knowing about the existence of bridged-in multicast listeners to be able to reliably serve them with multicast packets. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-10 23:50:47 -07:00
Linus Lüssing	dc4eb53a99	bridge: adhere to querier election mechanism specified by RFCs MLDv1 (RFC2710 section 6), MLDv2 (RFC3810 section 7.6.2), IGMPv2 (RFC2236 section 3) and IGMPv3 (RFC3376 section 6.6.2) specify that the querier with lowest source address shall become the selected querier. So far the bridge stopped its querier as soon as it heard another querier regardless of its source address. This results in the "wrong" querier potentially becoming the active querier or a potential, unnecessary querying delay. With this patch the bridge memorizes the source address of the currently selected querier and ignores queries from queriers with a higher source address than the currently selected one. This slight optimization is supposed to make it more RFC compliant (but is rather uncritical and therefore probably not necessary to be queued for stable kernels). Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-10 23:50:47 -07:00
Linus Lüssing	90010b36eb	bridge: rename struct bridge_mcast_query/querier The current naming of these two structs is very random, in that reversing their naming would not make any semantical difference. This patch tries to make the naming less confusing by giving them a more specific, distinguishable naming. This is also useful for the upcoming patches reintroducing the "struct bridge_mcast_querier" but for storing information about the selected querier (no matter if our own or a foreign querier). Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-10 23:50:46 -07:00
Dmitry Popov	2346829e64	ipip, sit: fix ipv4_{update_pmtu,redirect} calls ipv4_{update_pmtu,redirect} were called with the tunnel's ifindex (t->dev is a tunnel netdevice). This caused a wrong route lookup and failure of the pmtu update or redirect. We should use the same ifindex that we use in ip_route_output_* in the *tunnel_xmit code, which is t->parms.link. Signed-off-by: Dmitry Popov <ixaphire@qrator.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-10 23:35:52 -07:00
stephen hemminger	f8c1b7ce00	gre: allow changing mac address when device is up There is no need to force the device down on an Ethernet GRE (gretap) tunnel to change the MAC address. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-10 22:46:42 -07:00
Octavian Purdila	6cc55e096f	tcp: add gfp parameter to tcp_fragment tcp_fragment can be called from process context (from tso_fragment). Add a new gfp parameter to allow it to preserve atomic memory if possible. Signed-off-by: Octavian Purdila <octavian.purdila@intel.com> Reviewed-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-10 22:30:58 -07:00
Linus Torvalds	d1e1cda862	NFS client updates for Linux 3.16 Highlights include: - Massive cleanup of the NFS read/write code by Anna and Dros - Support multiple NFS read/write requests per page in order to deal with non-page aligned pNFS striping. Also cleans up the r/wsize < page size code nicely. - stable fix for ensuring inode is declared uptodate only after all the attributes have been checked. - stable fix for a kernel Oops when remounting - NFS over RDMA client fixes - move the pNFS files layout driver into its own subdirectory -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJTl3pmAAoJEGcL54qWCgDyraIP/08ZbbDowVTP9572bxl+VR2i zNbrflBtl1R05D4Imi/IEySK0w6xj1CLsncNpXAT2bxTlyKPW70tpiiPlRKMPuO8 JW+iPiepR2t0mol6MEd46yuV8btXVk8I+7IYjPXANiMJG8O5dJzNQ8NiCQOERBNt FQ7rzTCFO0ESGXnT6vYrT4I0bwqYVklBiJRTT4PQVzhhhDq9qUdq21BlQjQJFXP4 9aBLurxKptlHBvE6A2Quja6ObEC0s31CxcijqHIJ+Ue4GbKcFbMG1tgjY7ESE/AD rqzDeF0jvWHT+frmvFEUUXWqzF1ReZ4x9pfDoOgeG6T9/K6DT91O0yMOgG8jvlbF 8DSATNYGDX5sSjpvaG5JokGG+cGCk9srVDx+itn7HlwzalRwn0PjKtIYwOJ7TJIr o/j20nOsPrRGF0OqLf9phyocgRrlbMKOzj1IXldHHfAbNkRcISTK08lxvsz96Ddn zRyDmbsbY6QFXdB3AVSeQmg5R0OOLtzNIcsFPmNdvy5eiy67qU0lsGg8UGNnoz8k PHN1pcGejkctLhQ32ee3w/W6zkrgpJZcNC9JSoG8Dc3SeXus0c3IgumRknFCmiep ssN+1jEITAGeS5a2aBxwLQLVI2JAr2lxs5e+R4D5EsQlFkCl6Mrgtzh/aToWTuFl Qt7l2zI3r3VieKT9u7Bh =OyXR -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates from Trond Myklebust: "Highlights include: - massive cleanup of the NFS read/write code by Anna and Dros - support multiple NFS read/write requests per page in order to deal with non-page aligned pNFS striping. Also cleans up the r/wsize < page size code nicely. - stable fix for ensuring inode is declared uptodate only after all the attributes have been checked. 
- stable fix for a kernel Oops when remounting - NFS over RDMA client fixes - move the pNFS files layout driver into its own subdirectory" * tag 'nfs-for-3.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (79 commits) NFS: populate ->net in mount data when remounting pnfs: fix lockup caused by pnfs_generic_pg_test NFSv4.1: Fix typo in dprintk NFSv4.1: Comment is now wrong and redundant to code NFS: Use raw_write_seqcount_begin/end int nfs4_reclaim_open_state xprtrdma: Disconnect on registration failure xprtrdma: Remove BUG_ON() call sites xprtrdma: Avoid deadlock when credit window is reset SUNRPC: Move congestion window constants to header file xprtrdma: Reset connection timeout after successful reconnect xprtrdma: Use macros for reconnection timeout constants xprtrdma: Allocate missing pagelist xprtrdma: Remove Tavor MTU setting xprtrdma: Ensure ia->ri_id->qp is not NULL when reconnecting xprtrdma: Reduce the number of hardway buffer allocations xprtrdma: Limit work done by completion handler xprtrmda: Reduce calls to ib_poll_cq() in completion handlers xprtrmda: Reduce lock contention in completion handlers xprtrdma: Split the completion queue xprtrdma: Make rpcrdma_ep_destroy() return void ...	2014-06-10 15:02:42 -07:00
Linus Torvalds	5b174fd647	Merge branch 'for-3.16' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: "The largest piece is a long-overdue rewrite of the xdr code to remove some annoying limitations: for example, there was no way to return ACLs larger than 4K, and readdir results were returned only in 4k chunks, limiting performance on large directories. Also: - part of Neil Brown's work to make NFS work reliably over the loopback interface (so client and server can run on the same machine without deadlocks). The rest of it is coming through other trees. - cleanup and bugfixes for some of the server RDMA code, from Steve Wise. - Various cleanup of NFSv4 state code in preparation for an overhaul of the locking, from Jeff, Trond, and Benny. - smaller bugfixes and cleanup from Christoph Hellwig and Kinglong Mee. Thanks to everyone! This summer looks likely to be busier than usual for knfsd. Hopefully we won't break it too badly; testing definitely welcomed" * 'for-3.16' of git://linux-nfs.org/~bfields/linux: (100 commits) nfsd4: fix FREE_STATEID lockowner leak svcrdma: Fence LOCAL_INV work requests svcrdma: refactor marshalling logic nfsd: don't halt scanning the DRC LRU list when there's an RC_INPROG entry nfs4: remove unused CHANGE_SECURITY_LABEL nfsd4: kill READ64 nfsd4: kill READ32 nfsd4: simplify server xdr->next_page use nfsd4: hash deleg stateid only on successful nfs4_set_delegation nfsd4: rename recall_lock to state_lock nfsd: remove unneeded zeroing of fields in nfsd4_proc_compound nfsd: fix setting of NFS4_OO_CONFIRMED in nfsd4_open nfsd4: use recall_lock for delegation hashing nfsd: fix laundromat next-run-time calculation nfsd: make nfsd4_encode_fattr static SUNRPC/NFSD: Remove using of dprintk with KERN_WARNING nfsd: remove unused function nfsd_read_file nfsd: getattr for FATTR4_WORD0_FILES_AVAIL needs the statfs buffer NFSD: Error out when getting more than one fsloc/secinfo/uuid NFSD: Using type of uint32_t for ex_nflavors 
instead of int ...	2014-06-10 11:50:57 -07:00
Linus Torvalds	14208b0ec5	Merge branch 'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "A lot of activities on cgroup side. Heavy restructuring including locking simplification took place to improve the code base and enable implementation of the unified hierarchy, which currently exists behind a __DEVEL__ mount option. The core support is mostly complete but individual controllers need further work. To explain the design and rationales of the unified hierarchy Documentation/cgroups/unified-hierarchy.txt is added. Another notable change is css (cgroup_subsys_state - what each controller uses to identify and interact with a cgroup) iteration update. This is part of continuing updates on css object lifetime and visibility. cgroup started with reference count draining on removal way back and is now reaching a point where csses behave and are iterated like normal refcnted objects albeit with some complexities to allow distinguishing the state where they're being deleted. 
The css iteration update isn't taken advantage of yet but is planned to be used to simplify memcg significantly" * 'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (77 commits) cgroup: disallow disabled controllers on the default hierarchy cgroup: don't destroy the default root cgroup: disallow debug controller on the default hierarchy cgroup: clean up MAINTAINERS entries cgroup: implement css_tryget() device_cgroup: use css_has_online_children() instead of has_children() cgroup: convert cgroup_has_live_children() into css_has_online_children() cgroup: use CSS_ONLINE instead of CGRP_DEAD cgroup: iterate cgroup_subsys_states directly cgroup: introduce CSS_RELEASED and reduce css iteration fallback window cgroup: move cgroup->serial_nr into cgroup_subsys_state cgroup: link all cgroup_subsys_states in their sibling lists cgroup: move cgroup->sibling and ->children into cgroup_subsys_state cgroup: remove cgroup->parent device_cgroup: remove direct access to cgroup->children memcg: update memcg_has_children() to use css_next_child() memcg: remove tasks/children test from mem_cgroup_force_empty() cgroup: remove css_parent() cgroup: skip refcnting on normal root csses and cgrp_dfl_root self css cgroup: use cgroup->self.refcnt for cgroup refcnting ...	2014-06-09 15:03:33 -07:00
David S. Miller	b78370c021	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== pull request: wireless-next 2014-06-06 Please accept this batch of fixes intended for the 3.16 stream. For the bluetooth bits, Gustavo says: "Here some more patches for 3.16. We know that Linus already opened the merge window, but this is fix only pull request, and most of the patches here are also tagged for stable." Along with that, Andrea Merello provides a fix for the broken scanning in the venerable at76c50x driver... ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-08 14:17:39 -07:00
Eric Dumazet	87757a917b	net: force a list_del() in unregister_netdevice_many() unregister_netdevice_many() API is error prone and we had too many bugs because of dangling LIST_HEAD on stacks. See commit `f87e6f4793` ("net: dont leave active on stack LIST_HEAD") In fact, instead of making sure no caller leaves an active list_head, just force a list_del() in the callee. No one seems to need to access the list after unregister_netdevice_many() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-08 14:15:14 -07:00
Linus Torvalds	b20dcab9d4	LLVMLinux patches for v3.16 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iEYEABECAAYFAlOTY+wACgkQuseO5dulBZXrIgCdFZyXRojufLLKikWEvHjZ3/k5 KsQAnimtcge+62/IX7YwDjWS+xg9Wt3m =yPrI -----END PGP SIGNATURE----- Merge tag 'llvmlinux-for-v3.16' of git://git.linuxfoundation.org/llvmlinux/kernel Pull LLVM patches from Behan Webster: "Next set of patches to support compiling the kernel with clang. They've been soaking in linux-next since the last merge window. More still in the works for the next merge window..." * tag 'llvmlinux-for-v3.16' of git://git.linuxfoundation.org/llvmlinux/kernel: arm, unwind, LLVMLinux: Enable clang to be used for unwinding the stack ARM: LLVMLinux: Change "extern inline" to "static inline" in glue-cache.h all: LLVMLinux: Change DWARF flag to support gcc and clang net: netfilter: LLVMLinux: vlais-netfilter crypto: LLVMLinux: aligned-attribute.patch	2014-06-08 12:27:44 -07:00
Linus Torvalds	3f17ea6dea	Merge branch 'next' (accumulated 3.16 merge window patches) into master Now that 3.15 is released, this merges the 'next' branch into 'master', bringing us to the normal situation where my 'master' branch is the merge window. * accumulated work in next: (6809 commits) ufs: sb mutex merge + mutex_destroy powerpc: update comments for generic idle conversion cris: update comments for generic idle conversion idle: remove cpu_idle() forward declarations nbd: zero from and len fields in NBD_CMD_DISCONNECT. mm: convert some level-less printks to pr_* MAINTAINERS: adi-buildroot-devel is moderated MAINTAINERS: add linux-api for review of API/ABI changes mm/kmemleak-test.c: use pr_fmt for logging fs/dlm/debug_fs.c: replace seq_printf by seq_puts fs/dlm/lockspace.c: convert simple_str to kstr fs/dlm/config.c: convert simple_str to kstr mm: mark remap_file_pages() syscall as deprecated mm: memcontrol: remove unnecessary memcg argument from soft limit functions mm: memcontrol: clean up memcg zoneinfo lookup mm/memblock.c: call kmemleak directly from memblock_(alloc|free) mm/mempool.c: update the kmemleak stack trace for mempool allocations lib/radix-tree.c: update the kmemleak stack trace for radix tree allocations mm: introduce kmemleak_update_trace() mm/kmemleak.c: use %u to print ->checksum ...	2014-06-08 11:31:16 -07:00
Mark Charlebois	066c6807f7	net: netfilter: LLVMLinux: vlais-netfilter Replaced non-standard C use of Variable Length Arrays In Structs (VLAIS) in xt_repldata.h with a C99 compliant flexible array member and then calculated offsets to the other struct members. These other members aren't referenced by name in this code, however this patch maintains the same memory layout and padding as was previously accomplished using VLAIS. Had the original structure been ordered differently, with the entries VLA at the end, then it could have been a flexible member, and this patch would have been a lot simpler. However since the data stored in this structure is ultimately exported to userspace, the order of this structure can't be changed. This patch makes no attempt to change the existing behavior, merely the way in which the current layout is accomplished using standard C99 constructs. As such the code can now be compiled with either gcc or clang. This version of the patch removes the trailing alignment that the VLAIS structure would allocate in order to simplify the patch. Author: Mark Charlebois <charlebm@gmail.com> Signed-off-by: Mark Charlebois <charlebm@gmail.com> Signed-off-by: Behan Webster <behanw@converseincode.com> Signed-off-by: Vinícius Tinti <viniciustinti@gmail.com>	2014-06-07 11:44:39 -07:00
Phoebe Buckheister	fff1f59b17	mac802154: llsec: add forgotten list_del_rcu in key removal During key removal, the key object is freed, but not taken out of the llsec key list properly. Fix that. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-06 16:25:37 -07:00
Steve Wise	83710fc753	svcrdma: Fence LOCAL_INV work requests Fencing forces the invalidate to only happen after all prior send work requests have been completed. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reported by : Devesh Sharma <Devesh.Sharma@Emulex.Com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-06-06 19:22:51 -04:00
Steve Wise	0bf4828983	svcrdma: refactor marshalling logic This patch refactors the NFSRDMA server marshalling logic to remove the intermediary map structures. It also fixes an existing bug where the NFSRDMA server was not minding the device fast register page list length limitations. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com>	2014-06-06 19:22:50 -04:00
J. Bruce Fields	05638dc73a	nfsd4: simplify server xdr->next_page use The rpc code makes available to the NFS server an array of pages to encode into. The server represents its reply as an xdr buf, with the head pointing into the first page in that array, the pages ** array starting just after that, and the tail (if any) sharing any leftover space in the page used by the head. While encoding, we use xdr_stream->page_ptr to keep track of which page we're currently using. Currently we set xdr_stream->page_ptr to buf->pages, which makes the head a weird exception to the rule that page_ptr always points to the page we're currently encoding into. So, instead set it to buf->pages - 1 (the page actually containing the head), and remove the need for a little unintuitive logic in xdr_get_next_encode_buffer() and xdr_truncate_encode. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-06-06 19:22:46 -04:00
John W. Linville	c6ac68a612	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2014-06-06 11:59:11 -04:00
Dmitry Popov	586d5fc867	ip_tunnel: fix possible rtable leak ip_rt_put(rt) is always called in "error" branches above, but was missed in skb_cow_head branch. As rt is not yet bound to skb here we have to release it by hand. Signed-off-by: Dmitry Popov <ixaphire@qrator.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-05 18:44:44 -07:00
Ilya Dryomov	6044cde6f2	libceph: add ceph_monc_wait_osdmap() Add ceph_monc_wait_osdmap(), which will block until the osdmap with the specified epoch is received or timeout occurs. Export both of these as they are going to be needed by rbd. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2014-06-06 09:29:57 +08:00
Ilya Dryomov	513a8243d6	libceph: mon_get_version request infrastructure Add support for mon_get_version requests to libceph. This reuses much of the ceph_mon_generic_request infrastructure, with one exception. Older OSDs don't set mon_get_version reply hdr->tid even if the original request had a non-zero tid, which makes it impossible to lookup ceph_mon_generic_request contexts by tid in get_generic_reply() for such replies. As a workaround, we allocate a reply message on the reply path. This can probably interfere with revoke, but I don't see a better way. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2014-06-06 09:29:57 +08:00
Ilya Dryomov	002b36ba5e	libceph: recognize poolop requests in debugfs Recognize poolop requests in debugfs monc dump, fix printk format specifiers - tid is unsigned. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2014-06-06 09:29:56 +08:00
Sven Wegener	9e89fd8b7d	ipv6: Shrink udp_v6_mcast_next() to one socket variable To avoid the confusion of having two variables, shrink the function to only use the parameter variable for looping. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-05 16:23:08 -07:00
David S. Miller	f666f87b94	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/xen-netback/netback.c net/core/filter.c A filter bug fix overlapped some cleanups and a conversion over to some new insn generation macros. A xen-netback bug fix overlapped the addition of multi-queue support. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-05 16:22:02 -07:00
Alexei Starovoitov	0dcceabb0c	net: filter: fix SKF_AD_PKTTYPE extension on big-endian BPF classic->internal converter broke SKF_AD_PKTTYPE extension, since pkt_type_offset() was failing to find skb->pkt_type field which is defined as: __u8 pkt_type:3, fclone:2, ipvs_property:1, peeked:1, nf_trace:1; Fix it by searching for 3 most significant bits and shift them by 5 at run-time Fixes: `bd4cf0ed33` ("net: filter: rework/optimize internal BPF interpreter's instruction set") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Tested-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-05 15:40:38 -07:00
David S. Miller	6934e79ed1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter/nf_tables fixes for net-next This patchset contains fixes for recent updates available in your net-next, they are: 1) Fix double memory allocation for accounting objects that results in a leak, this slipped through with the new quota extension, patch from Mathieu Poirier. 2) Fix broken ordering when adding set element transactions. 3) Make sure that objects are released in reverse order in the abort path, to avoid possible use-after-free when accessing dependencies. 4) Allow to delete several objects (as long as dependencies are fulfilled) by using one batch. This includes changes in the use counter semantics of the nf_tables objects. 5) Fix illegal sleeping allocation from rcu callback. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-05 15:35:04 -07:00
Toshiaki Makita	e0a47d1f78	bridge: Fix incorrect judgment of promisc br_manage_promisc() incorrectly expects br_auto_port() to return only 0 or 1, while it actually returns flags, i.e., a subset of BR_AUTO_MASK. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-05 15:20:31 -07:00
Simon Horman	3b392ddba2	MPLS: Use mpls_features to activate software MPLS GSO segmentation If an MPLS packet requires segmentation then use mpls_features to determine if the software implementation should be used. As no driver advertises MPLS GSO segmentation this will always be the case. I had not noticed that this was necessary before as software MPLS GSO segmentation was already being used in my test environment. I believe that the reason for that is the skbs in question always had fragments and the driver I used does not advertise NETIF_F_FRAGLIST (which seems to be the case for most drivers). Thus software segmentation was activated by skb_gso_ok(). This introduces the overhead of an extra call to skb_network_protocol() in the case where CONFIG_NET_MPLS_GSO is set and skb->ip_summed == CHECKSUM_NONE. Thanks to Jesse Gross for prompting me to investigate this. Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-05 15:05:09 -07:00
John W. Linville	67be1e4f4b	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next	2014-06-05 14:10:07 -04:00
WANG Cong	ebbe495f19	ipv4: use skb frags api in udp4_hwcsum() Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-05 00:51:47 -07:00
WANG Cong	4cb28970a2	net: use the new API kvfree() It is available since v3.15-rc5. Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-05 00:49:51 -07:00
Manuel Schölling	9638f6713f	dns_resolver: Do not accept domain names longer than 255 chars According to RFC1035 "[...] the total length of a domain name (i.e., label octets and label length octets) is restricted to 255 octets or less." Signed-off-by: Manuel Schölling <manuel.schoelling@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-05 00:05:53 -07:00
Tom Herbert	359a0ea987	vxlan: Add support for UDP checksums (v4 sending, v6 zero csums) Added VXLAN link configuration for sending UDP checksums, and allowing TX and RX of UDP6 checksums. Also, call common iptunnel_handle_offloads and added GSO support for checksums. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-04 22:46:39 -07:00
Tom Herbert	4749c09c37	gre: Call gso_make_checksum Call gso_make_checksum. This should have the benefit of using a checksum that may have been previously computed for the packet. This also adds NETIF_F_GSO_GRE_CSUM to differentiate devices that offload GRE GSO with and without the GRE checksum offloaded. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-04 22:46:38 -07:00
Tom Herbert	0f4f4ffa7b	net: Add GSO support for UDP tunnels with checksum Added a new netif feature for GSO_UDP_TUNNEL_CSUM. This indicates that a device is capable of computing the UDP checksum in the encapsulating header of a UDP tunnel. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-04 22:46:38 -07:00
Tom Herbert	e9c3a24b3a	tcp: Call gso_make_checksum Call common gso_make_checksum when calculating checksum for a TCP GSO segment. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-04 22:46:38 -07:00
Tom Herbert	7e2b10c1e5	net: Support for multiple checksums with gso When creating a GSO packet segment we may need to set more than one checksum in the packet (for instance a TCP checksum and UDP checksum for VXLAN encapsulation). To be efficient, we want to do checksum calculation for any part of the packet at most once. This patch adds csum_start offset to skb_gso_cb. This tracks the starting offset for skb->csum which is initially set in skb_segment. When a protocol needs to compute a transport checksum it calls gso_make_checksum which computes the checksum value from the start of transport header to csum_start and then adds in skb->csum to get the full checksum. skb->csum and csum_start are then updated to reflect the checksum of the resultant packet starting from the transport header. This patch also adds a flag to skbuff, encap_hdr_csum, which is set in *gso_segment functions to indicate that a tunnel protocol needs checksum calculation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-04 22:46:38 -07:00
Tom Herbert	77157e1973	l2tp: call udp{6}_set_csum Call common functions to set checksum for UDP tunnel. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-04 22:46:38 -07:00
Tom Herbert	af5fcba7f3	udp: Generic functions to set checksum Added udp_set_csum and udp6_set_csum functions to set UDP checksums in packets. These are for simple UDP packets such as those that might be created in UDP tunnels. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-04 22:46:38 -07:00
Sven Wegener	3bfdc59a6c	ipv6: Fix regression caused by `efe4208` in udp_v6_mcast_next() Commit `efe4208` ("ipv6: make lookups simpler and faster") introduced a regression in udp_v6_mcast_next(), resulting in multicast packets not reaching the destination sockets under certain conditions. The packet's IPv6 addresses are wrongly compared to the IPv6 addresses from the function's socket argument, which indicates the starting point for looping, instead of the loop variable. If the addresses from the first socket do not match the packet's addresses, no socket in the list will match. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-04 15:42:01 -07:00
Sasha Levin	f830b0223c	net: Revert "fib_trie: use seq_file_net rather than seq->private" This reverts commit `30f38d2fdd`. fib_triestat is surrounded by a big lie: while it claims that it's a seq_file (fib_triestat_seq_open, fib_triestat_seq_show), it isn't: static const struct file_operations fib_triestat_fops = { .owner = THIS_MODULE, .open = fib_triestat_seq_open, .read = seq_read, .llseek = seq_lseek, .release = single_release_net, }; Yes, fib_triestat is just a regular file. A small detail (assuming CONFIG_NET_NS=y) is that while for seq_files you could do seq_file_net() to get the net ptr, doing so for a regular file would be wrong and would dereference an invalid pointer. The fib_triestat lie claimed a victim, and trying to show the file would be bad for the kernel. This patch just reverts the issue and fixes fib_triestat, which still needs a rewrite to either be a seq_file or stop claiming it is. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-04 15:11:41 -07:00
Chuck Lever	c93c62231c	xprtrdma: Disconnect on registration failure If rpcrdma_register_external() fails during request marshaling, the current RPC request is killed. Instead, this RPC should be retried after reconnecting the transport instance. The most likely reason for registration failure with FRMR is a failed post_send, which would be due to a remote transport disconnect or memory exhaustion. These issues can be recovered by a retry. Problems encountered in the marshaling logic itself will not be corrected by trying again, so these should still kill a request. Now that we've added a clean exit for marshaling errors, take the opportunity to defang some BUG_ON's. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:53 -04:00
Chuck Lever	c977dea227	xprtrdma: Remove BUG_ON() call sites If an error occurs in the marshaling logic, fail the RPC request being processed, but leave the client running. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:53 -04:00
Chuck Lever	e7ce710a88	xprtrdma: Avoid deadlock when credit window is reset Update the cwnd while processing the server's reply. Otherwise the next task on the xprt_sending queue is still subject to the old credit window. Currently, no task is awoken if the old congestion window is still exceeded, even if the new window is larger, and a deadlock results. This is an issue during a transport reconnect. Servers don't normally shrink the credit window, but the client does reset it to 1 when reconnecting so the server can safely grow it again. As a minor optimization, remove the hack of grabbing the initial cwnd size (which happens to be RPC_CWNDSCALE) and using that value as the congestion scaling factor. The scaling value is invariant, and we are better off without the multiplication operation. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:52 -04:00
Chuck Lever	4f4cf5ad6f	SUNRPC: Move congestion window constants to header file I would like to use one of the RPC client's congestion algorithm constants in transport-specific code. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:51 -04:00
Chuck Lever	18906972aa	xprtrdma: Reset connection timeout after successful reconnect If the new connection is able to make forward progress, reset the re-establish timeout. Otherwise it keeps growing even if disconnect events are rare. The same behavior as TCP is adopted: reconnect immediately if the transport instance has been able to make some forward progress. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:51 -04:00
Chuck Lever	bfaee096de	xprtrdma: Use macros for reconnection timeout constants Clean up: Ensure the same max and min constant values are used everywhere when setting reconnect timeouts. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:50 -04:00
Shirley Ma	196c69989d	xprtrdma: Allocate missing pagelist GETACL relies on transport layer to alloc memory for reply buffer. However xprtrdma assumes that the reply buffer (pagelist) has been pre-allocated in upper layer. This problem was reported by IOL OFA lab test on PPC. Signed-off-by: Shirley Ma <shirley.ma@oracle.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Edward Mossman <emossman@iol.unh.edu> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:49 -04:00
Chuck Lever	5bc4bc7292	xprtrdma: Remove Tavor MTU setting Clean up. Remove HCA-specific clutter in xprtrdma, which is supposed to be device-independent. Hal Rosenstock <hal@dev.mellanox.co.il> observes: > Note that there is OpenSM option (enable_quirks) to return 1K MTU > in SA PathRecord responses for Tavor so that can be used for this. > The default setting for enable_quirks is FALSE so that would need > changing. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:48 -04:00
Chuck Lever	ec62f40d35	xprtrdma: Ensure ia->ri_id->qp is not NULL when reconnecting Devesh Sharma <Devesh.Sharma@Emulex.Com> reports that after a disconnect, his HCA is failing to create a fresh QP, leaving ia_ri->ri_id->qp set to NULL. But xprtrdma still allows RPCs to wake up and post LOCAL_INV as they exit, causing an oops. rpcrdma_ep_connect() is allowing the wake-up by leaking the QP creation error code (-EPERM in this case) to the RPC client's generic layer. xprt_connect_status() does not recognize -EPERM, so it kills pending RPC tasks immediately rather than retrying the connect. Re-arrange the QP creation logic so that when it fails on reconnect, it leaves ->qp with the old QP rather than NULL. If pending RPC tasks wake and exit, LOCAL_INV work requests will flush rather than oops. On initial connect, leaving ->qp == NULL is OK, since there are no pending RPCs that might use ->qp. But be sure not to try to destroy a NULL QP when rpcrdma_ep_connect() is retried. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:47 -04:00
Chuck Lever	65866f8259	xprtrdma: Reduce the number of hardway buffer allocations While marshaling an RPC/RDMA request, the inline_{rsize,wsize} settings determine whether an inline request is used, or whether read or write chunks lists are built. The current default value of these settings is 1024. Any RPC request smaller than 1024 bytes is sent to the NFS server completely inline. rpcrdma_buffer_create() allocates and pre-registers a set of RPC buffers for each transport instance, also based on the inline rsize and wsize settings. RPC/RDMA requests and replies are built in these buffers. However, if an RPC/RDMA request is expected to be larger than 1024, a buffer has to be allocated and registered for that RPC, and deregistered and released when the RPC is complete. This is known as a "hardway allocation." Since the introduction of NFSv4, the size of RPC requests has become larger, and hardway allocations are thus more frequent. Hardway allocations are significant overhead, and they waste the existing RPC buffers pre-allocated by rpcrdma_buffer_create(). We'd like fewer hardway allocations. Increasing the size of the pre-registered buffers is the most direct way to do this. However, a blanket increase of the inline thresholds has interoperability consequences. On my 64-bit system, rpcrdma_buffer_create() requests roughly 7000 bytes for each RPC request buffer, using kmalloc(). Due to internal fragmentation, this wastes nearly 1200 bytes because kmalloc() already returns an 8192-byte piece of memory for a 7000-byte allocation request, though the extra space remains unused. So let's round up the size of the pre-allocated buffers, and make use of the unused space in the kmalloc'd memory. This change reduces the amount of hardway allocated memory for an NFSv4 general connectathon run from 1322092 to 9472 bytes (99%). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:46 -04:00
Chuck Lever	8301a2c047	xprtrdma: Limit work done by completion handler Sagi Grimberg <sagig@dev.mellanox.co.il> points out that a steady stream of CQ events could starve other work because of the boundless loop polling in rpcrdma_{send,recv}_poll(). Instead of a (potentially infinite) while loop, return after collecting a budgeted number of completions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Sagi Grimberg <sagig@dev.mellanox.co.il> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:45 -04:00
Chuck Lever	1c00dd0776	xprtrdma: Reduce calls to ib_poll_cq() in completion handlers Change the completion handlers to grab up to 16 items per ib_poll_cq() call. No extra ib_poll_cq() is needed if fewer than 16 items are returned. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:44 -04:00
Chuck Lever	7f23f6f6e3	xprtrdma: Reduce lock contention in completion handlers Skip the ib_poll_cq() after re-arming, if the provider knows there are no additional items waiting. (Have a look at commit `ed23a727` for more details). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:43 -04:00
Chuck Lever	fc66448549	xprtrdma: Split the completion queue The current CQ handler uses the ib_wc.opcode field to distinguish between event types. However, the contents of that field are not reliable if the completion status is not IB_WC_SUCCESS. When an error completion occurs on a send event, the CQ handler schedules a tasklet with something that is not a struct rpcrdma_rep. This is never correct behavior, and sometimes it results in a panic. To resolve this issue, split the completion queue into a send CQ and a receive CQ. The send CQ handler now handles only struct rpcrdma_mw wr_id's, and the receive CQ handler now handles only struct rpcrdma_rep wr_id's. Fix suggested by Shirley Ma <shirley.ma@oracle.com> Reported-by: Rafael Reiter <rafael.reiter@ims.co.at> Fixes: `5c635e09ce` BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=73211 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Klemens Senn <klemens.senn@ims.co.at> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:42 -04:00
Chuck Lever	7f1d54191e	xprtrdma: Make rpcrdma_ep_destroy() return void Clean up: rpcrdma_ep_destroy() returns a value that is used only to print a debugging message. rpcrdma_ep_destroy() already prints debugging messages in all error cases. Make rpcrdma_ep_destroy() return void instead. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:41 -04:00
Chuck Lever	13c9ff8f67	xprtrdma: Simplify rpcrdma_deregister_external() synopsis Clean up: All remaining callers of rpcrdma_deregister_external() pass NULL as the last argument, so remove that argument. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:40 -04:00
Chuck Lever	cdd9ade711	xprtrdma: mount reports "Invalid mount option" if memreg mode not supported If the selected memory registration mode is not supported by the underlying provider/HCA, the NFS mount command reports that there was an invalid mount option, and fails. This is misleading. Reporting a problem allocating memory is a lot closer to the truth. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:39 -04:00
Chuck Lever	f10eafd3a6	xprtrdma: Fall back to MTHCAFMR when FRMR is not supported An audit of in-kernel RDMA providers that do not support the FRMR memory registration shows that several of them support MTHCAFMR. Prefer MTHCAFMR when FRMR is not supported. If MTHCAFMR is not supported, only then choose ALLPHYSICAL. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:39 -04:00
Chuck Lever	0ac531c183	xprtrdma: Remove REGISTER memory registration mode All kernel RDMA providers except amso1100 support either MTHCAFMR or FRMR, both of which are faster than REGISTER. amso1100 can continue to use ALLPHYSICAL. The only other ULP consumer in the kernel that uses the reg_phys_mr verb is Lustre. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:38 -04:00
Chuck Lever	b45ccfd25d	xprtrdma: Remove MEMWINDOWS registration modes The MEMWINDOWS and MEMWINDOWS_ASYNC memory registration modes were intended as stop-gap modes before the introduction of FRMR. They are now considered obsolete. MEMWINDOWS_ASYNC is also considered unsafe because it can leave client memory registered and exposed for an indeterminate time after each I/O. At this point, the MEMWINDOWS modes add needless complexity, so remove them. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:37 -04:00
Chuck Lever	03ff8821eb	xprtrdma: Remove BOUNCEBUFFERS memory registration mode Clean up: This memory registration mode is slow and was never meant for use in production environments. Remove it to reduce implementation complexity. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:37 -04:00
Chuck Lever	254f91e2fa	xprtrdma: RPC/RDMA must invoke xprt_wake_pending_tasks() in process context An IB provider can invoke rpcrdma_conn_func() in an IRQ context, thus rpcrdma_conn_func() cannot be allowed to directly invoke generic RPC functions like xprt_wake_pending_tasks(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:36 -04:00
Allen Andrews	4034ba0423	nfs-rdma: Fix for FMR leaks Two memory region leaks were found during testing: 1. rpcrdma_buffer_create: While allocating RPCRDMA_FRMR's ib_alloc_fast_reg_mr is called and then ib_alloc_fast_reg_page_list is called. If ib_alloc_fast_reg_page_list returns an error it bails out of the routine dropping the last ib_alloc_fast_reg_mr frmr region creating a memory leak. Added code to dereg the last frmr if ib_alloc_fast_reg_page_list fails. 2. rpcrdma_buffer_destroy: While cleaning up, the routine will only free the MR's on the rb_mws list if there are rb_send_bufs present. However, in rpcrdma_buffer_create while the rb_mws list is being built if one of the MR allocation requests fail after some MR's have been allocated on the rb_mws list the routine never gets to create any rb_send_bufs but instead jumps to the rpcrdma_buffer_destroy routine which will never free the MR's on rb_mws list because the rb_send_bufs were never created. This leaks all the MR's on the rb_mws list that were created prior to one of the MR allocations failing. Issue(2) was seen during testing. Our adapter had a finite number of MR's available and we created enough connections to where we saw an MR allocation failure on our Nth NFS connection request. After the kernel cleaned up the resources it had allocated for the Nth connection we noticed that FMR's had been leaked due to the coding error described above. Issue(1) was seen during a code review while debugging issue(2). Signed-off-by: Allen Andrews <allen.andrews@emulex.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:35 -04:00
Steve Wise	0fc6c4e7bb	xprtrdma: mind the device's max fast register page list depth Some rdma devices don't support a fast register page list depth of at least RPCRDMA_MAX_DATA_SEGS. So xprtrdma needs to chunk its fast register regions according to the minimum of the device max supported depth or RPCRDMA_MAX_DATA_SEGS. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:33 -04:00
David S. Miller	c99f7abf0e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: include/net/inetpeer.h net/ipv6/output_core.c Changes in net were fixing bugs in code removed in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-03 23:32:12 -07:00
WANG Cong	92ff71b8fe	net: remove some useless free on failure in alloc_netdev_mqs() When we jump to free_pcpu on failure in alloc_netdev_mqs() rx and tx queues are not yet allocated, so no need to free them. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-03 19:18:58 -07:00
Cong Wang	e51fb15231	rtnetlink: fix a memory leak when ->newlink fails It is possible that ->newlink() fails before registering the device, in this case we should just free it, it's safe to call free_netdev(). Fixes: commit `0e0eee2465` (net: correct error path in rtnl_newlink()) Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-03 19:16:10 -07:00
Michal Kubecek	21ee543edc	xfrm: fix race between netns cleanup and state expire notification The xfrm_user module registers its pernet init/exit after xfrm itself so that its net exit function xfrm_user_net_exit() is executed before xfrm_net_exit() which calls xfrm_state_fini() to cleanup the SA's (xfrm states). This opens a window between zeroing net->xfrm.nlsk pointer and deleting all xfrm_state instances which may access it (via the timer). If an xfrm state expires in this window, xfrm_exp_state_notify() will pass null pointer as socket to nlmsg_multicast(). As the notifications are called inside rcu_read_lock() block, it is sufficient to retrieve the nlsk socket with rcu_dereference() and check it for null. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-03 16:07:44 -07:00
Linus Torvalds	776edb5931	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next Pull core locking updates from Ingo Molnar: "The main changes in this cycle were: - reduced/streamlined smp_mb__() interface that allows more usecases and makes the existing ones less buggy, especially in rarer architectures - add rwsem implementation comments - bump up lockdep limits" 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits) rwsem: Add comments to explain the meaning of the rwsem's count field lockdep: Increase static allocations arch: Mass conversion of smp_mb__() arch,doc: Convert smp_mb__() arch,xtensa: Convert smp_mb__() arch,x86: Convert smp_mb__() arch,tile: Convert smp_mb__() arch,sparc: Convert smp_mb__() arch,sh: Convert smp_mb__() arch,score: Convert smp_mb__() arch,s390: Convert smp_mb__() arch,powerpc: Convert smp_mb__() arch,parisc: Convert smp_mb__() arch,openrisc: Convert smp_mb__() arch,mn10300: Convert smp_mb__() arch,mips: Convert smp_mb__() arch,metag: Convert smp_mb__() arch,m68k: Convert smp_mb__() arch,m32r: Convert smp_mb__() arch,ia64: Convert smp_mb__() ...	2014-06-03 12:57:53 -07:00
David S. Miller	014b20133b	Merge branch 'ethtool-rssh-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/net-next Ben Hutchings says: ==================== Pull request: Fixes for new ethtool RSS commands This addresses several problems I previously identified with the new ETHTOOL_{G,S}RSSH commands: 1. Missing validation of reserved parameters 2. Vague documentation 3. Use of unnamed magic number 4. No consolidation with existing driver operations I don't currently have access to suitable network hardware, but have tested these changes with a dummy driver that can support various combinations of operations and sizes, together with (a) Debian's ethtool 3.13 (b) ethtool 3.14 with the submitted patch to use ETHTOOL_{G,S}RSSH and minor adjustment for fixes 1 and 3. v2: Update RSS operations in vmxnet3 too ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 23:07:02 -07:00
Ben Hutchings	f062a38448	ethtool: Check that reserved fields of struct ethtool_rxfh are 0 We should fail rather than silently ignoring use of these extensions. Signed-off-by: Ben Hutchings <ben@decadent.org.uk>	2014-06-03 02:43:16 +01:00
Ben Hutchings	fe62d00137	ethtool: Replace ethtool_ops::{get,set}_rxfh_indir() with {get,set}_rxfh() ETHTOOL_{G,S}RXFHINDIR and ETHTOOL_{G,S}RSSH should work for drivers regardless of whether they expose the hash key, unless you try to set a hash key for a driver that doesn't expose it. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2014-06-03 02:42:44 +01:00
Roopa Prabhu	41c389d72c	bridge: Add bridge ifindex to bridge fdb notify msgs (This patch was previously posted as RFC at http://patchwork.ozlabs.org/patch/352677/) This patch adds the NDA_MASTER attribute to the neighbour attributes enum for the bridge/master ifindex, and adds NDA_MASTER to bridge fdb notify msgs. Today bridge fdb notifications don't contain bridge information. Userspace can derive it from the port information in the fdb notification. However, this is tricky in some scenarios. For example, a bridge port delete notification comes before the bridge fdb delete notifications. We have seen problems in userspace when using libnl, where the bridge fdb delete notification handling code does not understand which bridge this fdb entry is part of because the bridge and port association has already been deleted. These notifications (port membership and fdb) are also generated on separate rtnl groups. Fixing the order of notifications could possibly solve the problem for some cases (I can submit a separate patch for that). This patch chooses to add NDA_MASTER to bridge fdb notify msgs because it not only solves the problem described above, but also helps userspace avoid another lookup into link msgs to derive the master index. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 17:58:55 -07:00
Leon Yu	418c96ac15	net: filter: fix possible memory leak in __sk_prepare_filter() __sk_prepare_filter() was reworked in commit `bd4cf0ed3` (net: filter: rework/optimize internal BPF interpreter's instruction set) so that it should have uncharged memory once things went wrong. However that work isn't complete. Error is handled only in __sk_migrate_filter() while memory can still leak in the error path right after sk_chk_filter(). Fixes: `bd4cf0ed33` ("net: filter: rework/optimize internal BPF interpreter's instruction set") Signed-off-by: Leon Yu <chianglungyu@gmail.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Tested-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 17:49:45 -07:00
Yuchung Cheng	0cfa5c07d6	tcp: fix cwnd undo on DSACK in F-RTO This bug was discovered through a recent F-RTO issue on the tcpm list https://www.ietf.org/mail-archive/web/tcpm/current/msg08794.html The bug is that currently F-RTO does not use DSACK to undo cwnd in certain cases: upon receiving an ACK after the RTO retransmission in F-RTO, and the ACK has DSACK indicating the retransmission is spurious, the sender only calls tcp_try_undo_loss() if some never retransmitted data is sacked (FLAG_ORIG_DATA_SACKED). The correct behavior is to unconditionally call tcp_try_undo_loss() so the DSACK information is used properly to undo the cwnd reduction. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 16:50:49 -07:00
David Ahern	30f38d2fdd	fib_trie: use seq_file_net rather than seq->private Make fib_triestat_seq_show consistent with other /proc/net files and use seq_file_net. Signed-off-by: David Ahern <dsahern@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 16:41:38 -07:00
Eric W. Biederman	2d7a85f4b0	netlink: Only check file credentials for implicit destinations It was possible to get a setuid root or setcap executable to write to its stdout or stderr (which has been made a netlink socket) and inadvertently reconfigure the networking stack. To prevent this we check that both the creator of the socket and the current application have permission to reconfigure the network stack. Unfortunately this breaks Zebra, which always uses sendto/sendmsg and creates its socket without any privileges. To keep Zebra working, don't bother checking if the creator of the socket has privilege when a destination address is specified. Instead rely exclusively on the privileges of the sender of the socket. Note from Andy: This is exactly Eric's code except for some comment clarifications and formatting fixes. Neither I nor, I think, anyone else is thrilled with this approach, but I'm hesitant to wait on a better fix since 3.15 is almost here. Note to stable maintainers: This is a mess. An earlier series of patches in 3.15 fixes a rather serious security issue (CVE-2014-0181), but they did so in a way that breaks Zebra. The offending series includes: commit `aa4cf9452f` Author: Eric W. Biederman <ebiederm@xmission.com> Date: Wed Apr 23 14:28:03 2014 -0700 net: Add variants of capable for use on netlink messages If a given kernel version is missing that series of fixes, it's probably worth backporting it and this patch. If that series is present, then this fix is critical if you care about Zebra. Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 16:34:09 -07:00
Eric Dumazet	39c36094d7	net: fix inet_getid() and ipv6_select_ident() bugs I noticed we were sending wrong IPv4 ID in TCP flows when MTU discovery is disabled. Note how GSO/TSO packets do not have monotonically incrementing ID. 06:37:41.575531 IP (id 14227, proto: TCP (6), length: 4396) 06:37:41.575534 IP (id 14272, proto: TCP (6), length: 65212) 06:37:41.575544 IP (id 14312, proto: TCP (6), length: 57972) 06:37:41.575678 IP (id 14317, proto: TCP (6), length: 7292) 06:37:41.575683 IP (id 14361, proto: TCP (6), length: 63764) It appears I introduced this bug in linux-3.1. inet_getid() must return the old value of peer->ip_id_count, not the new one. Lets revert this part, and remove the prevention of a null identification field in IPv6 Fragment Extension Header, which is dubious and not even done properly. Fixes: `87c48fa3b4` ("ipv6: make fragment identifications less predictable") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 14:09:28 -07:00
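The "old value, not the new one" requirement in the commit above is the usual fetch-and-add contract. The following is a minimal userspace sketch of that idea using C11 atomics, not the kernel code: `id_counter`, `reserve_ids()` and `reserve_ids_buggy()` are hypothetical names introduced here for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-peer ID counter. */
typedef struct {
    atomic_uint ip_id_count;
} id_counter;

/*
 * Reserve `segs` consecutive IDs and return the first one.
 * atomic_fetch_add() yields the value held before the addition,
 * so back-to-back callers receive non-overlapping ID ranges.
 */
static uint16_t reserve_ids(id_counter *c, unsigned int segs)
{
    assert(c != NULL);
    return (uint16_t)atomic_fetch_add(&c->ip_id_count, segs);
}

/*
 * Illustration of the bug class: returning the value after the
 * addition hands out the first ID of the *next* caller's range.
 */
static uint16_t reserve_ids_buggy(id_counter *c, unsigned int segs)
{
    assert(c != NULL);
    return (uint16_t)(atomic_fetch_add(&c->ip_id_count, segs) + segs);
}
```

With a counter starting at 100, `reserve_ids(&c, 45)` returns 100 and leaves the counter at 145, whereas the buggy variant would return 145 for the same call.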
Toshiaki Makita	e0d7968ab6	bridge: Prevent insertion of FDB entry with disallowed vlan br_handle_local_finish() is allowing us to insert an FDB entry with disallowed vlan. For example, when port 1 and 2 are communicating in vlan 10, and even if vlan 10 is disallowed on port 3, port 3 can interfere with their communication by spoofed src mac address with vlan id 10. Note: Even if it is judged that a frame should not be learned, it should not be dropped because it is destined for not forwarding layer but higher layer. See IEEE 802.1Q-2011 8.13.10. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 13:38:23 -07:00
David S. Miller	31595de219	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== pull request: wireless-next 2014-06-02 Please pull this remaining batch of updates intended for the 3.16 stream... For the mac80211 bits, Johannes says: "The remainder for -next right now is mostly fixes, and a handful of small new things like some CSA infrastructure, the regdb script mW/dBm conversion change and sending wiphy notifications." For the bluetooth bits, Gustavo says: "Some more patches for 3.16. There is nothing really special here, just a bunch of clean ups, fixes plus some small improvements. Please pull." For the nfc bits, Samuel says: "We have: - Felica (Type3) tags support for trf7970a - Type 4b tags support for port100 - st21nfca DTS typo fix - A few sparse warning fixes" For the atheros bits, Kalle says: "Ben added support for setting antenna configurations. Michal improved warm reset so that we would not need to fall back to cold reset that often, an issue where ath10k stripped protected flag while in monitor mode and made module initialisation asynchronous to fix the problems with firmware loading when the driver is linked to the kernel. Luca removed unused channel_switch_beacon callbacks both from ath9k and ath10k. Marek fixed Protected Management Frames (PMF) when using Action Frames. Also we had other small fixes everywhere in the driver." Along with that, there are a handful of updates to a variety of drivers. This includes updates to at76c50x-usb, ath9k, b43, brcmfmac, mwifiex, rsi, rtlwifi, and wil6210. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 11:17:35 -07:00
Eric Dumazet	73f156a6e8	inetpeer: get rid of ip_id_count Ideally, we would need to generate IP ID using a per destination IP generator. linux kernels used inet_peer cache for this purpose, but this had a huge cost on servers disabling MTU discovery. 1) each inet_peer struct consumes 192 bytes 2) inetpeer cache uses a binary tree of inet_peer structs, with a nominal size of ~66000 elements under load. 3) lookups in this tree are hitting a lot of cache lines, as tree depth is about 20. 4) If server deals with many tcp flows, we have a high probability of not finding the inet_peer, allocating a fresh one, inserting it in the tree with same initial ip_id_count, (cf secure_ip_id()) 5) We garbage collect inet_peer aggressively. IP ID generation do not have to be 'perfect' Goal is trying to avoid duplicates in a short period of time, so that reassembly units have a chance to complete reassembly of fragments belonging to one message before receiving other fragments with a recycled ID. We simply use an array of generators, and a Jenkin hash using the dst IP as a key. ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it belongs (it is only used from this file) secure_ip_id() and secure_ipv6_id() no longer are needed. Rename ip_select_ident_more() to ip_select_ident_segs() to avoid unnecessary decrement/increment of the number of segments. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 11:00:41 -07:00
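The "array of generators" keyed by a hash of the destination address can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel implementation: the table size `ID_GENERATORS`, the function `pick_ip_id()`, and the choice of Bob Jenkins' one-at-a-time hash are all assumptions made for this sketch (the kernel uses its own hash helpers and seeding).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define ID_GENERATORS 2048u /* hypothetical table size; must be a power of two */

/* One counter per hash slot; static storage starts at zero. */
static atomic_uint id_generators[ID_GENERATORS];

/* Bob Jenkins' one-at-a-time hash over a byte string. */
static uint32_t one_at_a_time(const uint8_t *key, size_t len)
{
    uint32_t h = 0;

    for (size_t i = 0; i < len; i++) {
        h += key[i];
        h += h << 10;
        h ^= h >> 6;
    }
    h += h << 3;
    h ^= h >> 11;
    h += h << 15;
    return h;
}

/*
 * Pick the generator for destination `daddr`, reserve `segs` IDs
 * from it, and return the first reserved ID.
 */
static uint16_t pick_ip_id(uint32_t daddr, unsigned int segs)
{
    uint8_t key[4] = {
        (uint8_t)(daddr >> 24), (uint8_t)(daddr >> 16),
        (uint8_t)(daddr >> 8),  (uint8_t)daddr
    };
    uint32_t slot = one_at_a_time(key, sizeof key) & (ID_GENERATORS - 1);

    assert(segs > 0);
    return (uint16_t)atomic_fetch_add(&id_generators[slot], segs);
}
```

Different destinations may share a slot. That matches the stated goal in the commit message: IDs do not have to be perfect, they only need to avoid duplicates within a short period.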
Alexander Duyck	670e5b8eaf	net: Add support for device specific address syncing This change provides a function to be used in order to break the ndo_set_rx_mode call into a set of address add and remove calls. The code is based on the implementation of dev_uc_sync/dev_mc_sync. Since they essentially do the same thing but with only one dev I simply named my functions __dev_uc_sync/__dev_mc_sync. I also implemented an unsync version of the functions as well to allow for cleanup on close. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 10:40:54 -07:00
Alexander Aring	eb06481d69	6lowpan_rtnl: fix off by one while fragmentation This patch fixes an off-by-one error during fragmentation. If the frag_cap value is equal to the skb_unprocessed value, we need to stop the fragmentation loop because the last fragment, which has a size of skb_unprocessed, fits into the frag capability size. This issue was introduced by commit `d4b2816d67` ("6lowpan: fix fragmentation"). Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 10:39:42 -07:00
Alexander Aring	51263fffad	6lowpan_rtnl: fix fragmentation with two fragments This patch fixes the 6LoWPAN fragmentation for the case where we have exactly two fragments. The problem is that the (skb_unprocessed >= frag_cap) condition is always false on the second fragment after sending the first fragment. A fragmentation with only one fragment doesn't make any sense. The solution is to use a do while loop here, which ensures we always send a minimum of two fragments if we need fragmentation. This issue was introduced by commit `d4b2816d67` ("6lowpan: fix fragmentation"). Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 10:39:42 -07:00
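The two 6LoWPAN fixes above concern the loop shape (a do-while, so a second fragment is always produced) and the loop condition (strictly greater than, so a final fragment of exactly frag_cap bytes ends the loop). The sketch below models only that control flow. It is a simplified assumption-laden model, not the driver code: `fragment_lengths()` is a hypothetical helper, and it assumes every fragment, including the first, can carry `frag_cap` payload bytes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Split `total` payload bytes into fragments of at most `frag_cap`
 * bytes and store each fragment length in `out`. Returns the number
 * of fragments, or 0 if `out` is too small.
 * Precondition: total > frag_cap, i.e. fragmentation is really needed.
 */
static size_t fragment_lengths(size_t total, size_t frag_cap,
                               size_t *out, size_t out_max)
{
    size_t unprocessed = total;
    size_t frag_len = frag_cap; /* the first fragment is always full */
    size_t n = 0;

    assert(frag_cap > 0 && total > frag_cap);
    if (out_max == 0)
        return 0;
    out[n++] = frag_len;

    /* do-while: at least one more fragment follows the first one. */
    do {
        unprocessed -= frag_len;
        frag_len = unprocessed < frag_cap ? unprocessed : frag_cap;
        if (n == out_max)
            return 0;
        out[n++] = frag_len;
        /* '>' rather than '>=': a last fragment of exactly frag_cap fits. */
    } while (unprocessed > frag_cap);

    return n;
}
```

For `total = 200` and `frag_cap = 100` the sketch yields two fragments of 100 bytes each. With `>=` in the loop condition it would go around once more and emit an empty third fragment.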
Denis ChengRq	2f91abd451	genetlink: remove superfluous assignment the local variable ops and n_ops were just read out from family, and not changed, hence no need to assign back. Validation functions should operate on const parameters and not change anything. Signed-off-by: Cheng Renquan <crquan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 10:36:18 -07:00
John W. Linville	fcb2c0d6cf	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2014-06-02 11:20:17 -04:00
Jukka Taimisto	8a96f3cd22	Bluetooth: Fix L2CAP deadlock -[0x01 Introduction We have found a programming error causing a deadlock in the Bluetooth subsystem of the Linux kernel. The problem is caused by a missing release_sock() call when L2CAP connection creation fails due to a full accept queue. The issue can be reproduced with the 3.15-rc5 kernel and is also present in earlier kernels. -[0x02 Details The problem occurs when multiple L2CAP connections are created to a PSM which contains a listening socket (like SDP) and left pending, for example, configuration (the underlying ACL link is not disconnected between connections). When an L2CAP connection request is received and a listening socket is found, the l2cap_sock_new_connection_cb() function (net/bluetooth/l2cap_sock.c) is called. This function locks the 'parent' socket and then checks if the accept queue is full. lock_sock(parent); /* Check for backlog size */ if (sk_acceptq_is_full(parent)) { BT_DBG("backlog full %d", parent->sk_ack_backlog); return NULL; } In case the accept queue is full, NULL is returned, but the 'parent' socket is not released. Thus when the next L2CAP connection request is received the code blocks on lock_sock() since the parent is still locked. Also note that for connections already established and waiting for configuration to complete a timeout will occur and l2cap_chan_timeout() (net/bluetooth/l2cap_core.c) will be called. All threads calling this function will also be blocked waiting for the channel mutex since the thread which is waiting on lock_sock() already holds the channel mutex. We were able to reproduce this by continuously sending an L2CAP connection request followed by a disconnection request containing an invalid CID. This left the created connections pending configuration. After the deadlock occurs it is impossible to kill bluetoothd, btmon will not get any more data etc., requiring a reboot to recover. 
-[0x03 Fix Releasing the 'parent' socket when l2cap_sock_new_connection_cb() returns NULL seems to fix the issue. Signed-off-by: Jukka Taimisto <jtt@codenomicon.com> Reported-by: Tommi Mäkilä <tmakila@codenomicon.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Cc: stable@vger.kernel.org	2014-06-02 13:38:19 +03:00
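The deadlock described above comes from an early return that skips the unlock. A common way to avoid that class of bug is a single exit path that always releases the lock. The following userspace model illustrates the pattern; `struct sock_model`, `lock_sock_model()`, `release_sock_model()` and `new_connection()` are hypothetical names for this sketch, and the "queue is full" test is a simplification rather than the kernel's sk_acceptq_is_full().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a listening socket with an accept backlog. */
struct sock_model {
    bool locked;
    unsigned int ack_backlog;
    unsigned int max_ack_backlog;
};

static void lock_sock_model(struct sock_model *sk)
{
    /* A real lock would block forever here; the model asserts instead. */
    assert(!sk->locked);
    sk->locked = true;
}

static void release_sock_model(struct sock_model *sk)
{
    assert(sk->locked);
    sk->locked = false;
}

/* Returns true if the connection was queued, false if the backlog is full. */
static bool new_connection(struct sock_model *parent)
{
    bool accepted = false;

    lock_sock_model(parent);
    if (parent->ack_backlog >= parent->max_ack_backlog)
        goto out; /* backlog full: still falls through to the unlock */
    parent->ack_backlog++;
    accepted = true;
out:
    release_sock_model(parent);
    return accepted;
}
```

If the full-backlog branch returned directly instead of jumping to `out`, the next call to `new_connection()` would trip the assertion in `lock_sock_model()`, which stands in for the blocked lock_sock() in the report.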
Pablo Neira Ayuso	31f8441c32	netfilter: nf_tables: atomic allocation in set notifications from rcu callback Use GFP_ATOMIC allocations when sending removal notifications of anonymous sets from rcu callback context. Sleeping in that context is illegal. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-02 10:54:38 +02:00
Pablo Neira Ayuso	4fefee570d	netfilter: nf_tables: allow to delete several objects from a batch Three changes to allow the deletion of several objects with dependencies in one transaction, they are: 1) Introduce speculative counter increment/decrement that is undone in the abort path if required, thus we avoid hitting -EBUSY when deleting the chain. The counter updates are reverted in the abort path. 2) Increment/decrement table/chain use counter for each set/rule. We need this to fully rely on the use counters instead of the list content, eg. !list_empty(&chain->rules) which evaluate true in the middle of the transaction. 3) Decrement table use counter when an anonymous set is bound to the rule in the commit path. This avoids hitting -EBUSY when deleting the table that contains anonymous sets. The anonymous sets are released in the nf_tables_rule_destroy path. This should not be a problem since the rule already bumped the use counter of the chain, so the bound anonymous set reflects dependencies through the rule object, which already increases the chain use counter. So the general assumption after this patch is that the use counters are bumped by direct object dependencies. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-02 10:54:35 +02:00
Pablo Neira Ayuso	7632667d26	netfilter: nft_rbtree: introduce locking There's no rbtree rcu version yet, so let's fall back on the spinlock to protect the concurrent access of this structure both from user (to update the set content) and kernel-space (in the packet path). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-02 10:54:31 +02:00
Pablo Neira Ayuso	a1cee076f4	netfilter: nf_tables: release objects in reverse order in the abort path The patch `c7c32e7` ("netfilter: nf_tables: defer all object release via rcu") indicates that we always release deleted objects in the reverse order, but that is only needed in the abort path. These are the two possible scenarios when releasing objects: 1) Deletion scenario in the commit path: no need to release objects in the reverse order since userspace already ensures that dependencies are fulfilled), ie. userspace tells us to delete rule -> ... -> rule -> chain -> table. In this case, we have to release the objects in the same order as userspace provided. 2) Deletion scenario in the abort path: we have to iterate in the reverse order to undo what it cannot be added, ie. userspace sent us a batch that includes: table -> chain -> rule -> ... -> rule, and that needs to be partially undone. In this case, we have to release objects in the reverse order to ensure that the set and chain objects point to valid rule and table objects. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-02 10:54:28 +02:00
Pablo Neira Ayuso	46bbafceb2	netfilter: nf_tables: fix wrong transaction ordering in set elements The transaction needs to be placed at the end of the commit list, otherwise event notifications are reordered and we may crash when releasing object via call_rcu. This problem was introduced in `60319eb` ("netfilter: nf_tables: use new transaction infrastructure to handle elements"). Reported-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-02 10:54:25 +02:00
Mathieu Poirier	4c552a64df	netfilter: nfnetlink_acct: Fix memory leak Memory needs to be allocated only once, after the proper checks on the NFACCT_FLAGS have been done. Otherwise the code can return without freeing already allocated memory. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-06-02 10:46:52 +02:00
Johan Hedberg	f3fb0b58c8	Bluetooth: Fix missing check for FIPS security level When checking whether a legacy link key provides at least HIGH security level we also need to check for FIPS level which is one step above HIGH. This patch fixes a missing check in the hci_link_key_request_evt() function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-06-02 00:34:36 -07:00
Daniel Borkmann	f8f6d679aa	net: filter: improve filter block macros Commit `9739eef13c` ("net: filter: make BPF conversion more readable") started to introduce helper macros similar to BPF_STMT()/BPF_JUMP() macros from classic BPF. However, quite some statements in the filter conversion functions remained in the old style which gives a mixture of block macros and non block macros in the code. This patch makes the block macros itself more readable by using explicit member initialization, and converts the remaining ones where possible to remain in a more consistent state. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-01 22:16:58 -07:00
Daniel Borkmann	3480593131	net: filter: get rid of BPF_S_* enum This patch finally allows us to get rid of the BPF_S_* enum. Currently, the code performs unnecessary encode and decode workarounds in seccomp and filter migration itself when a filter is being attached in order to overcome BPF_S_* encoding which is not used anymore by the new interpreter resp. JIT compilers. Keeping it around would mean that also in future we would need to extend and maintain this enum and related encoders/decoders. We can get rid of all that and save us these operations during filter attaching. Naturally, also JIT compilers need to be updated by this. Before JIT conversion is being done, each compiler checks if A is being loaded at startup to obtain information if it needs to emit instructions to clear A first. Since BPF extensions are a subset of BPF_LD | BPF_{W,H,B} | BPF_ABS variants, case statements for extensions can be removed at that point. To ease and minimize code changes in the classic JITs, we have introduced bpf_anc_helper(). Tested with test_bpf on x86_64 (JIT, int), s390x (JIT, int), arm (JIT, int), i386 (int), ppc64 (JIT, int); for sparc we unfortunately didn't have access, but changes are analogous to the rest. Joint work with Alexei Starovoitov. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Mircea Gherzan <mgherzan@gmail.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: Chema Gonzalez <chemag@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-01 22:16:58 -07:00
Jon Maxwell	c65c7a3066	bridge: notify user space after fdb update There have been a number of incidents recently where customers running KVM have reported that VM hosts on different hypervisors are unreachable. Based on pcap traces we found that the bridge was broadcasting the ARP request out onto the network. However, some NICs have an inbuilt switch which on occasion was broadcasting the VM's ARP request back through the physical NIC on the hypervisor. This resulted in the bridge changing ports and incorrectly learning that the VM's MAC address was external. As a result the ARP reply was directed back onto the external network and the VM never updated its ARP cache. This patch notifies the bridge command after an fdb entry has been updated, so that such port toggling can be identified. Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-01 22:14:50 -07:00
wangweidong	019ee792d7	bridge: fix the unbalanced promiscuous count when add_if failed As commit `2796d0c648` ("bridge: Automatically manage port promiscuous mode."), make the add_if use dev_set_allmulti instead of dev_set_promiscuous, so when add_if failed, we should do dev_set_allmulti(dev, -1). Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Reviewed-by: Amos Kong <akong@redhat.com> Acked-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-01 22:05:16 -07:00
Nikolay Aleksandrov	4b9b1cdf83	net: fix wrong mac_len calculation for vlans After `1e785f48d2` ("net: Start with correct mac_len in skb_network_protocol") skb->mac_len is used as a start of the calculation in skb_network_protocol() but that is not always correct. If skb->protocol == 8021Q/AD, usually the vlan header is already inserted in the skb (i.e. vlan reorder hdr == 0). Usually when the packet enters dev_hard_xmit it has mac_len == 0 so we take 2 bytes from the destination mac address (skb->data + VLAN_HLEN) as a type in skb_network_protocol() and return vlan_depth == 4. In the case where TSO is off, then the mac_len is set but it's == 18 (ETH_HLEN + VLAN_HLEN), so skb_network_protocol() returns a type from inside the packet and offset == 22. Also make vlan_depth unsigned as suggested before. As suggested by Eric Dumazet, move the while() loop in the if() so we can avoid additional testing in fast path. Here are few netperf tests + debug printk's to illustrate: cat netperf.tso-on.reorder-on.bugged - Vlan -> device (reorder on, default, this case is okay) MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.3.1 () port 0 AF_INET Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 10.00 7111.54 [ 81.605435] skb->len 65226 skb->gso_size 1448 skb->proto 0x800 skb->mac_len 0 vlan_depth 0 type 0x800 - Vlan -> device (reorder off, bad) cat netperf.tso-on.reorder-off.bugged MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.3.1 () port 0 AF_INET Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 10.00 241.35 [ 204.578332] skb->len 1518 skb->gso_size 0 skb->proto 0x8100 skb->mac_len 0 vlan_depth 4 type 0x5301 0x5301 are the last two bytes of the destination mac. 
And if we stop TSO, we may get even the following: [ 83.343156] skb->len 2966 skb->gso_size 1448 skb->proto 0x8100 skb->mac_len 18 vlan_depth 22 type 0xb84 Because mac_len already accounts for VLAN_HLEN. After the fix: cat netperf.tso-on.reorder-off.fixed MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.3.1 () port 0 AF_INET Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 10.01 5001.46 [ 81.888489] skb->len 65230 skb->gso_size 1448 skb->proto 0x8100 skb->mac_len 0 vlan_depth 18 type 0x800 CC: Vlad Yasevich <vyasevic@redhat.com> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Daniel Borkman <dborkman@redhat.com> CC: David S. Miller <davem@davemloft.net> Fixes:1e785f48d29a ("net: Start with correct mac_len in skb_network_protocol") Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-01 19:39:13 -07:00
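The debug output above reports `vlan_depth 18 type 0x800` for a single-tagged frame after the fix: 14 bytes of Ethernet header plus one 4-byte VLAN header. The sketch below shows how such a depth can be computed by walking the tags from the start of the frame. It is a standalone illustration, not skb_network_protocol(): `network_protocol()` and the enum names are hypothetical, and it operates on a plain byte buffer rather than an skb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    ETH_HDR_LEN   = 14,     /* dst MAC (6) + src MAC (6) + EtherType (2) */
    VLAN_HDR_LEN  = 4,      /* TCI (2) + encapsulated EtherType (2) */
    TYPE_8021Q    = 0x8100,
    TYPE_8021AD   = 0x88A8
};

/* Read a big-endian 16-bit value. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Return the EtherType of the network-layer payload and store the
 * offset of that payload in *depth. Returns 0 if the frame is too short.
 */
static uint16_t network_protocol(const uint8_t *frame, size_t len,
                                 size_t *depth)
{
    size_t off = ETH_HDR_LEN;
    uint16_t type;

    assert(frame != NULL && depth != NULL);
    if (len < ETH_HDR_LEN)
        return 0;

    type = read_be16(frame + 12);
    /* Skip every VLAN tag; each one pushes the payload 4 bytes further. */
    while (type == TYPE_8021Q || type == TYPE_8021AD) {
        if (len < off + VLAN_HDR_LEN)
            return 0;
        type = read_be16(frame + off + 2);
        off += VLAN_HDR_LEN;
    }

    *depth = off;
    return type;
}
```

Starting the walk from a fixed Ethernet header length, rather than from a mac_len that may be 0 or may already include the VLAN header, is what keeps the result independent of how the packet reached the transmit path.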
Johan Hedberg	79897d2097	Bluetooth: Fix requiring SMP MITM for outgoing connections Due to recent changes to the way that the MITM requirement is set for outgoing pairing attempts we can no longer rely on the hcon->auth_type variable (which is actually good since it was formed from BR/EDR concepts that don't really exist for SMP). To match the logic that BR/EDR now uses simply rely on the local IO capability and/or needed security level to set the MITM requirement for outgoing pairing requests. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-05-31 23:51:12 -07:00
David S. Miller	6ce995c6f4	Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge Included changes: - prevent NULL dereference in multicast code Antonio Quartulli says: ==================== pull request net: batman-adv 20140527 here you have another very small fix intended for net/linux-3.15. It prevents some multicast functions from dereferencing a NULL pointer. (Actually it was nothing more than a typo) I hope it is not too late for such a small patch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-31 20:01:47 -07:00
Marek Lindner	af0a171c07	batman-adv: fix NULL pointer dereferences Was introduced with `4c8755d69c` ("batman-adv: Send multicast packets to nodes with a WANT_ALL flag") Reported-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Acked-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2014-05-31 10:07:14 +02:00
Jukka Rissanen	6a5e81650a	Bluetooth: l2cap: Set more channel defaults Default values for various channel settings were missing. This way channel users do not need to set default values themselves. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-30 21:38:37 -07:00
Jukka Rissanen	62bbd5b359	Bluetooth: 6LoWPAN: Fix MAC address universal/local bit handling The universal/local bit handling was incorrectly done in the code. So when setting EUI address from BD address we do this: - If BD address type is PUBLIC, then we clear the universal bit in EUI address. If the address type is RANDOM, then the universal bit is set (BT 6lowpan draft chapter 3.2.2) - After this we invert the universal/local bit according to RFC 2464 When figuring out BD address we do the reverse: - Take EUI address from stateless IPv6 address, invert the universal/local bit according to RFC 2464 - If universal bit is 1 in this modified EUI address, then address type is set to RANDOM, otherwise it is PUBLIC Note that 6lowpan_iphc.[ch] does the final toggling of U/L bit before sending or receiving the network packet. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-05-30 21:28:21 -07:00
Johan Hedberg	7e3691e13a	Bluetooth: Fix authentication check for FIPS security level When checking whether we need to request authentication or not we should include HCI_SECURITY_FIPS to the levels that always need authentication. This patch fixes check for it in the hci_outgoing_auth_needed() function. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-05-30 21:25:01 -07:00
Johan Hedberg	61b433579b	Bluetooth: Fix properly ignoring LTKs of unknown types In case there are new LTK types in the future we shouldn't just blindly assume that != MGMT_LTK_UNAUTHENTICATED means that the key is authenticated. This patch adds explicit checks for each allowed key type in the form of a switch statement and skips any key which has an unknown value. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org	2014-05-30 21:23:29 -07:00
David S. Miller	dbfc4b698a	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== The following patchset contains a late fix for IPVS: * Fix crash when trying to remove the transport header with non-linear skbuffs, this was introduced in 3.6-rc. Patch from Peter Christensen via the IPVS folks. I'll pass this to -stable once this hits mainstream. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-30 17:56:09 -07:00
David S. Miller	90d0e08e57	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next This small patchset contains three accumulated Netfilter/IPVS updates, they are: 1) Refactorize common NAT code by encapsulating it into a helper function, similarly to what we do in other conntrack extensions, from Florian Westphal. 2) A minor format string mismatch fix for IPVS, from Masanari Iida. 3) Add quota support to the netfilter accounting infrastructure, now you can add quotas to accounting objects via the nfnetlink interface and use them from iptables. You can also listen to quota notifications from userspace. This enhancement from Mathieu Poirier. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-30 17:54:47 -07:00
Himangi Saraogi	47162c0b7e	af_key: Replace comma with semicolon This patch replaces a comma between expression statements by a semicolon. A simplified version of the semantic patch that performs this transformation is as follows: // <smpl> @r@ expression e1,e2,e; type T; identifier i; @@ e1 -, +; e2; // </smpl> Signed-off-by: Himangi Saraogi <himangi774@gmail.com> Acked-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-30 17:48:58 -07:00
Himangi Saraogi	01728371dc	rds/tcp_listen: Replace comma with semicolon This patch replaces a comma between expression statements by a semicolon. A simplified version of the semantic patch that performs this transformation is as follows: // <smpl> @r@ expression e1,e2,e; type T; identifier i; @@ e1 -, +; e2; // </smpl> Signed-off-by: Himangi Saraogi <himangi774@gmail.com> Acked-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-30 17:48:58 -07:00
Himangi Saraogi	cc2afe9fe2	RDS/RDMA: Replace comma with semicolon This patch replaces a comma between expression statements by a semicolon. A simplified version of the semantic patch that performs this transformation is as follows: // <smpl> @r@ expression e1,e2,e; type T; identifier i; @@ e1 -, +; e2; // </smpl> Signed-off-by: Himangi Saraogi <himangi774@gmail.com> Acked-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-30 17:48:58 -07:00
Himangi Saraogi	70cb4a4526	ipmr: Replace comma with semicolon This patch replaces a comma between expression statements by a semicolon. A simplified version of the semantic patch that performs this transformation is as follows: // <smpl> @r@ expression e1,e2,e; type T; identifier i; @@ e1 -, +; e2; // </smpl> Signed-off-by: Himangi Saraogi <himangi774@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-30 17:48:57 -07:00
Ursula Braun	4d520f62e0	af_iucv: correct cleanup if listen backlog is full In case of transport HIPER a sock struct is allocated for an incoming connect request. If the backlog queue is full this socket is not needed, but is left in the list of af_iucv sockets. Final socket release posts console message "Attempt to release alive iucv socket". This patch makes sure the newly created socket is cleaned up correctly if the backlog queue is full. Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com> Reported-by: Philipp Hachtmann <phacht@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-30 17:35:23 -07:00
Philipp Hachtmann	53a4b4995e	af_iucv: Add automatic (source) iucv_name to bind If a socket is bound to an address using before calling connect it is usual to leave it to the network system to choose an appropriate outgoing application name respective port address. af_iucv on VM uses a counter and uses simple numbers as unique identifiers. This behaviour was missing when af_iucv is used with HiperSockets. This patch contains a simple approach to harmonize af_iucv's behaviour. Signed-off-by: Philipp Hachtmann <phacht@linux.vnet.ibm.com> Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-30 17:35:23 -07:00
Kinglong Mee	a48fd0f9f7	SUNRPC/NFSD: Remove using of dprintk with KERN_WARNING When debugging, rpc prints messages from dprintk(KERN_WARNING ...) with "^A4" prefixed, [ 2780.339988] ^A4nfsd: connect from unprivileged port: 127.0.0.1, port=35316 Trond tells, > dprintk != printk. We have NEVER supported dprintk(KERN_WARNING...) This patch removes using of dprintk with KERN_WARNING. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-30 20:25:28 -04:00
David S. Miller	4d1cdf1db6	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== Please pull this batch of updates intended for 3.16... For the mac80211 bits, Johannes says: "Here I just have Heikki's rfkill GPIO cleanups. The ARM/tegra patch is OK with the maintainer (Stephen). Let me know of any problems." and; "We have a whole bunch of work on CSA by Andrei, Luca and Michal, but unfortunately it doesn't seem quite complete yet so it's still disabled. There's some TDLS work from Arik, and the rest is mostly minor fixes and cleanups." For the NFC bits, Samuel says: "This is the NFC pull request for 3.16. We have: - STMicroelectronics st21nfca support. The st21nfca is an HCI chipset and thus relies on the HCI stack. This submission provides support for tag reader/writer mode (including Type 5) and device tree bindings. - PM runtime support and a bunch of bug fixes for TI's trf7970a. - Device tree support for NXP's pn544. Legacy platform data support is obviously kept intact. - NFC Tag type 4B support to the NFC Digital stack. - SOCK_RAW type support to the raw NFC socket, and allow NCI sniffing from that. This can be extended to report HCI frames and also proprietary ones like e.g. the pn533 ones." For the iwlwifi bits, Emmanuel says: "Eran continues to work on new devices, Eyal is still digging in the rate control stuff, and Johannes added new functionality to the debug system we have in place now along with a few cleanups he made on the way. That's pretty much it." and; "Avri continues to work on the power code and Eran is improving the NVM handling as preparation for new devices on which he works with Liad. Luca cleans up a bit the code while working on CSA. I have the regular BT Coex stuff and a small lockdep fix. 
Johannes has his regular amount of clean ups and improvements, the main one is the ability to leave 2 chains open to improve diversity and hence the throughput in high attenuation scenarios." and; "The regular amount of housekeeping here. I merged iwlwifi-fixes.git to be able to add the patch you didn't want in wireless.git at that stage of the -rc cycle. Luca has a few preparations for CSA implementation and also what seems to be a bugfix for P2P but hasn't caused issues we could notice." For the Atheros bits, Kalle says: "For ath10k Michal did various small fixes on how we handle hardware/firmware problems and he also fixed two memory leaks." Also included are a couple of pulls from the wireless tree to avoid/resolve merge issues... ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-30 17:18:46 -07:00
Paul Bolle	391296c90c	atm: remove commented out check This preprocessor check has been commented out ever since this file was added during the v2.3 development cycle. It is unclear what its purpose might have been. Whatever it was, it can safely be removed now. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-30 17:17:04 -07:00
Sachin Kamat	484611e530	net: tso: Export symbols for modular build Export the symbols to fix the below errors when built as modules: ERROR: "tso_build_data" [drivers/net/ethernet/marvell/mvneta.ko] undefined! ERROR: "tso_build_hdr" [drivers/net/ethernet/marvell/mvneta.ko] undefined! ERROR: "tso_start" [drivers/net/ethernet/marvell/mvneta.ko] undefined! ERROR: "tso_count_descs" [drivers/net/ethernet/marvell/mvneta.ko] undefined! ERROR: "tso_build_data" [drivers/net/ethernet/marvell/mv643xx_eth.ko] undefined! ERROR: "tso_build_hdr" [drivers/net/ethernet/marvell/mv643xx_eth.ko] undefined! ERROR: "tso_start" [drivers/net/ethernet/marvell/mv643xx_eth.ko] undefined! ERROR: "tso_count_descs" [drivers/net/ethernet/marvell/mv643xx_eth.ko] undefined! Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Acked-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-30 15:52:03 -07:00
J. Bruce Fields	a5cddc885b	nfsd4: better reservation of head space for krb5 RPC_MAX_AUTH_SIZE is scattered around several places. Better to set it once in the auth code, where this kind of estimate should be made. And while we're at it we can leave it zero when we're not using krb5i or krb5p. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-30 17:32:17 -04:00
J. Bruce Fields	db3f58a95b	rpc: define xdr_restrict_buflen With this xdr_reserve_space can help us enforce various limits. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-30 17:32:01 -04:00
J. Bruce Fields	2825a7f907	nfsd4: allow encoding across page boundaries After this we can handle for example getattr of very large ACLs. Read, readdir, readlink are still special cases with their own limits. Also we can't handle a new operation starting close to the end of a page. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-30 17:31:54 -04:00
J. Bruce Fields	3e19ce762b	rpc: xdr_truncate_encode This will be used in the server side in a few cases: - when certain operations (read, readdir, readlink) fail after encoding a partial response. - when we run out of space after encoding a partial response. - in readlink, where we initially reserve PAGE_SIZE bytes for data, then truncate to the actual size. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-30 17:31:47 -04:00
John W. Linville	57afc62e94	Merge tag 'nfc-next-3.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next Samuel Ortiz <sameo@linux.intel.com> says: "NFC: 3.16: Second pull request This is the 2nd NFC pull request for 3.16. We have: - Felica (Type3) tags support for trf7970a - Type 4b tags support for port100 - st21nfca DTS typo fix - A few sparse warning check fixes" Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-05-30 13:41:40 -04:00
John W. Linville	a5eb1aeb25	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Conflicts: drivers/bluetooth/btusb.c	2014-05-29 13:03:47 -04:00
John W. Linville	737be10d8c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next	2014-05-29 12:55:38 -04:00
David Rientjes	c6c8fe79a8	net, sunrpc: suppress allocation warning in rpc_malloc() rpc_malloc() allocates with GFP_NOWAIT without making any attempt at reclaim so it easily fails when low on memory. This ends up spamming the kernel log: SLAB: Unable to allocate memory on node 0 (gfp=0x4000) cache: kmalloc-8192, object size: 8192, order: 1 node 0: slabs: 207/207, objs: 207/207, free: 0 rekonq: page allocation failure: order:1, mode:0x204000 CPU: 2 PID: 14321 Comm: rekonq Tainted: G O 3.15.0-rc3-12.gfc9498b-desktop+ #6 Hardware name: System manufacturer System Product Name/M4A785TD-V EVO, BIOS 2105 07/23/2010 0000000000000000 ffff880010ff17d0 ffffffff815e693c 0000000000204000 ffff880010ff1858 ffffffff81137bd2 0000000000000000 0000001000000000 ffff88011ffebc38 0000000000000001 0000000000204000 ffff88011ffea000 Call Trace: [<ffffffff815e693c>] dump_stack+0x4d/0x6f [<ffffffff81137bd2>] warn_alloc_failed+0xd2/0x140 [<ffffffff8113be19>] __alloc_pages_nodemask+0x7e9/0xa30 [<ffffffff811824a8>] kmem_getpages+0x58/0x140 [<ffffffff81183de6>] fallback_alloc+0x1d6/0x210 [<ffffffff81183be3>] ____cache_alloc_node+0x123/0x150 [<ffffffff81185953>] __kmalloc+0x203/0x490 [<ffffffffa06b0ee2>] rpc_malloc+0x32/0xa0 [sunrpc] [<ffffffffa06a6999>] call_allocate+0xb9/0x170 [sunrpc] [<ffffffffa06b19d8>] __rpc_execute+0x88/0x460 [sunrpc] [<ffffffffa06b2da9>] rpc_execute+0x59/0xc0 [sunrpc] [<ffffffffa06a932b>] rpc_run_task+0x6b/0x90 [sunrpc] [<ffffffffa077b5c1>] nfs4_call_sync_sequence+0x51/0x80 [nfsv4] [<ffffffffa077d45d>] _nfs4_do_setattr+0x1ed/0x280 [nfsv4] [<ffffffffa0782a72>] nfs4_do_setattr+0x72/0x180 [nfsv4] [<ffffffffa078334c>] nfs4_proc_setattr+0xbc/0x140 [nfsv4] [<ffffffffa074a7e8>] nfs_setattr+0xd8/0x240 [nfs] [<ffffffff811baa71>] notify_change+0x231/0x380 [<ffffffff8119cf5c>] chmod_common+0xfc/0x120 [<ffffffff8119df80>] SyS_chmod+0x40/0x90 [<ffffffff815f4cfd>] system_call_fastpath+0x1a/0x1f ... If the allocation fails, simply return NULL and avoid spamming the kernel log. 
Reported-by: Marc Dietrich <marvin24@gmx.de> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-05-29 11:11:51 -04:00
Avraham Stern	d3a58df87a	mac80211: set new interfaces as idle upon init Mark new interfaces as idle to allow operations that require that interfaces are idle to take place. Interface types that are always not idle (like AP interfaces) will be set as not idle when they are assigned a channel context. Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-28 16:22:49 +02:00
Felix Fietkau	abd43a6a68	mac80211: reduce packet loss notifications under load During strong signal fluctuations under high throughput, few consecutive failed A-MPDU transmissions can easily trigger packet loss notification, and thus (in AP mode) client disconnection. Reduce the number of false positives by checking the A-MPDU status flag and treating a failed A-MPDU as a single packet. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-28 16:22:48 +02:00
Arik Nemtsov	923eaf3672	mac80211: don't check netdev state for debugfs read/write Doing so will lead to an oops for a p2p-dev interface, since it has no netdev. Cc: stable@vger.kernel.org Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-28 16:22:48 +02:00
Felix Fietkau	53d045258e	mac80211: fix a memory leak on sta rate selection table If the rate control algorithm uses a selection table, it is leaked when the station is destroyed - fix that. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Reported-by: Christophe Prévotaux <cprevotaux@nltinc.com> Fixes: `0d528d85c5` ("mac80211: improve the rate control API") Cc: stable@vger.kernel.org # v3.10+ [add commit log entry, remove pointless NULL check] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-28 16:22:41 +02:00
John W. Linville	9db7cb6901	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem	2014-05-27 13:51:31 -04:00
John W. Linville	03c4444650	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless	2014-05-27 13:47:27 -04:00
chaitanya.mgit@gmail.com	a9fb54169b	regdb: Generalize the mW to dBm power conversion Generalize the power conversion from mW to dBm using log. This should fix the below compilation error for country NO which adds a new power value 2000mW which is not handled earlier. CC [M] net/wireless/wext-sme.o CC [M] net/wireless/regdb.o net/wireless/regdb.c:1130:1: error: Unknown undeclared here (not in a function) net/wireless/regdb.c:1130:9: error: expected } before power make[2]: * [net/wireless/regdb.o] Error 1 make[1]: * [net/wireless] Error 2 make: *** [net] Error 2 Reported-By: John Walker <john@x109.net> Signed-off-by: Chaitanya T K <chaitanya.mgit@gmail.com> Acked-by: John W. Linville <linville@tuxdriver.com> [remove unneeded parentheses, fix rounding by using %.0f] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-27 17:58:58 +02:00
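The generalized conversion mentioned in the commit above rests on the standard relation dBm = 10 * log10(mW). A minimal sketch, assuming a hypothetical helper name `mw_to_dbm`; the rounding mirrors the `%.0f` note in the commit message, but the code is an illustration and not the regdb generator script itself.

```python
import math


def mw_to_dbm(power_mw: float) -> int:
    """Convert a power in milliwatts to dBm, rounded to a whole number."""
    if power_mw <= 0:
        raise ValueError("power must be positive")
    dbm = 10 * math.log10(power_mw)
    # Round through "%.0f"-style formatting, as noted in the commit message.
    return int(f"{dbm:.0f}")


if __name__ == "__main__":
    for mw in (100, 200, 500, 1000, 2000):
        print(f"{mw} mW -> {mw_to_dbm(mw)} dBm")
```

Using the logarithm directly means that a new regulatory value such as 2000 mW needs no extra table entry.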
Krzysztof Hałasa	c7d37a66e3	mac80211: fix IBSS join by initializing last_scan_completed Without this fix, freshly rebooted Linux creates a new IBSS instead of joining an existing one. Only when jiffies counter overflows after 5 minutes the IBSS can be successfully joined. Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl> [edit commit message slightly] Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-27 08:54:01 +02:00
Johannes Berg	3bb2055672	cfg80211: send events when devices are added/removed We're currently sending NEW_WIPHY events for renames (which is a bit odd, but now can't be changed), but also send them for really new devices that register. Also send DEL_WIPHY events when a device is removed, the event ID for this was already reserved. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-26 13:52:25 +02:00
Emmanuel Grumbach	34171dc0d6	mac80211: fix virtual monitor interface addition Since the commit below, cfg80211_chandef_dfs_required() will warn if it gets an NL80211_IFTYPE_UNSPECIFIED iftype as explicitly written in the commit log. When a virtual monitor interface is added, its type is set in ieee80211_sub_if_data.vif.type, but not in ieee80211_sub_if_data.wdev.iftype which is passed to cfg80211_chandef_dfs_required() hence resulting in the following warning: WARNING: CPU: 1 PID: 21265 at net/wireless/chan.c:376 cfg80211_chandef_dfs_required+0xbc/0x130 [cfg80211]() Modules linked in: [...] CPU: 1 PID: 21265 Comm: ifconfig Tainted: G W O 3.13.11+ #12 Hardware name: Dell Inc. Latitude E6410/0667CC, BIOS A01 03/05/2010 0000000000000009 ffff88008f5fdb08 ffffffff817d4219 ffff88008f5fdb50 ffff88008f5fdb40 ffffffff8106f57d 0000000000000000 0000000000000000 ffff880081062fb8 ffff8800810604e0 0000000000000001 ffff88008f5fdba0 Call Trace: [<ffffffff817d4219>] dump_stack+0x4d/0x66 [<ffffffff8106f57d>] warn_slowpath_common+0x7d/0xa0 [<ffffffff8106f5ec>] warn_slowpath_fmt+0x4c/0x50 [<ffffffffa04ea4ec>] cfg80211_chandef_dfs_required+0xbc/0x130 [cfg80211] [<ffffffffa06b1024>] ieee80211_vif_use_channel+0x94/0x500 [mac80211] [<ffffffffa0684e6b>] ieee80211_add_virtual_monitor+0x1ab/0x5c0 [mac80211] [<ffffffffa0686ae5>] ieee80211_do_open+0xe75/0x1580 [mac80211] [<ffffffffa0687259>] ieee80211_open+0x69/0x70 [mac80211] [snip] Fixes: `00ec75fc5a` ("cfg80211: pass the actual iftype when calling cfg80211_chandef_dfs_required()") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Acked-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-26 11:04:42 +02:00
Luciano Coelho	1a5f0c13d1	mac80211: add a single-transaction driver op to switch contexts In some cases, when the driver is already using all the channel contexts it can handle at once, we have to do an in-place switch (i.e. we cannot afford using an extra context temporarily for the transaction). But some drivers may not support switching the channel context assigned to a vif on the fly (i.e. without unassigning and assigning it) while others may only work if the context is changed on the fly, without unassigning it first. To allow these different scenarios, add a new driver operation that lets the driver decide how to handle an in-place switch. Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-26 11:04:41 +02:00
Pablo Neira	1708803ef2	netfilter: bridge: fix Kconfig unmet dependencies Before `f5efc69` ("netfilter: nf_tables: Add meta expression key for bridge interface name"), the entire net/bridge/netfilter/ directory depended on BRIDGE_NF_EBTABLES, ie. on ebtables. However, that directory already contained the nf_tables bridge extension that we should allow to compile separately. In `f5efc69`, we tried to generalize this by using CONFIG_BRIDGE_NETFILTER which was not a good idea since this option already existed and it is dedicated to enable the Netfilter bridge IP/ARP filtering. Let's try to fix this mess by: 1) making net/bridge/netfilter/ dependent on the toplevel CONFIG_NETFILTER option, just like we do with the net/netfilter and net/ipv{4,6}/netfilter/ directories. 2) Changing 'selects' to 'depends on' NETFILTER_XTABLES for BRIDGE_NF_EBTABLES. I believe this problem was already before `f5efc69`: warning: (BRIDGE_NF_EBTABLES) selects NETFILTER_XTABLES which has unmet direct dependencies (NET && INET && NETFILTER) 3) Fix ebtables/nf_tables bridge dependencies by making NF_TABLES_BRIDGE and BRIDGE_NF_EBTABLES dependent on BRIDGE and NETFILTER: warning: (NF_TABLES_BRIDGE && BRIDGE_NF_EBTABLES) selects BRIDGE_NETFILTER which has unmet direct dependencies (NET && BRIDGE && NETFILTER && INET && NETFILTER_ADVANCED) net/built-in.o: In function `br_parse_ip_options': br_netfilter.c:(.text+0x4a5ba): undefined reference to `ip_options_compile' br_netfilter.c:(.text+0x4a5ed): undefined reference to `ip_options_rcv_srr' net/built-in.o: In function `br_nf_pre_routing_finish': br_netfilter.c:(.text+0x4a8a4): undefined reference to `ip_route_input_noref' br_netfilter.c:(.text+0x4a987): undefined reference to `ip_route_output_flow' make: *** [vmlinux] Error 1 Reported-by: Jim Davis <jim.epost@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-26 00:42:30 -04:00
Peter Christensen	f44a5f45f5	ipvs: Fix panic due to non-linear skb Receiving an ICMP response to an IPIP packet in a non-linear skb could cause a kernel panic in __skb_pull. The problem was introduced in commit `f2edb9f770` ("ipvs: implement passive PMTUD for IPIP packets"). Signed-off-by: Peter Christensen <pch@ordbogen.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2014-05-26 10:22:46 +09:00
Fengguang Wu	db3287da34	NFC: nfc_sock_link() can be static CC: Hiren Tandel <hirent@marvell.com> CC: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-05-26 00:53:10 +02:00
Fengguang Wu	cb30caf027	NFC: digital: digital_in_send_attrib_req() can be static CC: "Mark A. Greer" <mgreer@animalcreek.com> CC: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-05-26 00:52:15 +02:00
Thierry Escande	9dc33705b2	NFC: digital: Randomize poll cycles This change adds some entropy to polling cycles, choosing the next polling rf technology randomly. This reflects the change done in the pn533 driver, avoiding possible infinite loop for devices that export 2 targets on 2 different modulations. If the first target is not readable, we will stay in an error loop for ever. Signed-off-by: Thierry Escande <thierry.escande@linux.intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-05-26 00:42:02 +02:00
Thierry Escande	00e625df3e	NFC: digital: Return proper error code when sending ATR_REQ The error code returned by digital_in_send_cmd() was not returned by digital_in_send_atr_req(). Signed-off-by: Thierry Escande <thierry.escande@linux.intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-05-26 00:42:02 +02:00
Arnaldo Carvalho de Melo	85d3fc9418	tipc: Don't reset the timeout when restarting As it may then take longer than what the user specified using setsockopt(SO_RCVTIMEO). Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-24 14:11:41 -04:00
David S. Miller	8646224cdb	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-05-23 I have two more fixes intended for the 3.15 stream... For the iwlwifi one, Emmanuel says: "A race has been discovered in the beacon filtering code. Since the fix is too big for 3.15, I disable here the feature." For the bluetooth one, Gustavo says: "This pull request contains a very important fix for 3.15. Here we fix the permissions of a debugfs file that would otherwise allow unauthorized users to write content to it." Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-24 14:06:19 -04:00
David S. Miller	54e5c4def0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/bonding/bond_alb.c drivers/net/ethernet/altera/altera_msgdma.c drivers/net/ethernet/altera/altera_sgdma.c net/ipv6/xfrm6_output.c Several cases of overlapping changes. The xfrm6_output.c has a bug fix which overlaps the renaming of skb->local_df to skb->ignore_df. In the Altera TSE driver cases, the register access cleanups in net-next overlapped with bug fixes done in net. Similarly a bug fix to send ALB packets in the bonding driver using the right source address overlaps with cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-24 00:32:30 -04:00
Linus Torvalds	5fa6a683c0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "It looks like a sizeble collection but this is nearly 3 weeks of bug fixing while you were away. 1) Fix crashes over IPSEC tunnels with NAT, the latter can reroute the packet through a non-IPSEC protected path and the code has to be able to handle SKBs attached to routes lacking an attached xfrm state. From Steffen Klassert. 2) Fix OOPSs in ipv4 and ipv6 ipsec layers for unsupported sub-protocols, also from Steffen Klassert. 3) Set local_df on fragmented netfilter skbs otherwise we won't be able to forward successfully, from Florian Westphal. 4) cdc_mbim ipv6 neighbour code does __vlan_find_dev_deep without holding RCU lock, from Bjorn Mork. 5) local_df test in ip_may_fragment is inverted, from Florian Westphal. 6) jme driver doesn't check for DMA mapping failures, from Neil Horman. 7) qlogic driver doesn't calculate number of TX queues properly, from Shahed Shaikh. 8) fib_info_cnt can drift irreversibly positive if we fail to allocate the fi->fib_metrics array, from Sergey Popovich. 9) Fix use after free in ip6_route_me_harder(), also from Sergey Popovich. 10) When SYSCTL is disabled, we don't handle local_port_range and ping_group_range defaults properly at all, from Cong Wang. 11) Unaccelerated VLAN tagged frames improperly handled by cdc_mbim driver, fix from Bjorn Mork. 12) cassini driver needs nested lock annotations for TX locking, from Emil Goode. 13) On init error ipv6 VTI driver can unregister pernet ops twice, oops. Fix from Mahtias Krause. 14) If macvlan device is down, don't propagate IFF_ALLMULTI changes, from Peter Christensen. 15) Missing NULL pointer check while parsing netlink config options in ip6_tnl_validate(). From Susant Sahani. 16) Fix handling of neighbour entries during ipv6 router reachability probing, from Duan Jiong. 
17) x86 and s390 JIT address randomization has some address calculation bugs leading to crashes, from Alexei Starovoitov and Heiko Carstens. 18) Clear up those uglies with nop patching and net_get_random_once(), from Hannes Frederic Sowa. 19) Option length miscalculated in ip6_append_data(), fix also from Hannes Frederic Sowa. 20) A while ago we fixed a race during device unregistry when a namespace went down, turns out there is a second place that needs similar protection. From Cong Wang. 21) In the new Altera TSE driver multicast filtering isn't working, disable it and just use promisc mode until the cause is found. From Vince Bridgers. 22) When we disable router enabling in ipv6 we have to flush the cached routes explicitly, from Duan Jiong. 23) NBMA tunnels should not cache routes on the tunnel object because the key is variable, from Timo Teräs. 24) With stacked devices GRO information in skb->cb[] can be not setup properly, make sure it is in all code paths. From Eric Dumazet. 25) Really fix stacked vlan locking, multiple levels of nesting with intervening non-vlan devices are possible. From Vlad Yasevich. 26) Fallback ipip tunnel device's mtu is not setup properly, from Steffen Klassert. 27) The packet scheduler's tcindex filter can crash because we structure copy objects with list_head's inside, oops. From Cong Wang. 28) Fix CHECKSUM_COMPLETE handling for ipv6 GRE tunnels, from Eric Dumazet. 29) In some configurations 'itag' in __mkroute_input() can end up being used uninitialized because of how fib_validate_source() works. Fix it by explitly initializing itag to zero like all the other fib_validate_source() callers do, from Li RongQing" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits) batman: fix a bogus warning from batadv_is_on_batman_iface() ipv4: initialise the itag variable in __mkroute_input bonding: Send ALB learning packets using the right source bonding: Don't assume 802.1Q when sending alb learning packets. 
net: doc: Update references to skb->rxhash stmmac: Remove unbalanced clk_disable call ipv6: gro: fix CHECKSUM_COMPLETE support net_sched: fix an oops in tcindex filter can: peak_pci: prevent use after free at netdev removal ip_tunnel: Initialize the fallback device properly vlan: Fix build error wth vlan_get_encap_level() can: c_can: remove obsolete STRICT_FRAME_ORDERING Kconfig option MAINTAINERS: Pravin Shelar is Open vSwitch maintainer. bnx2x: Convert return 0 to return rc bonding: Fix alb mode to only use first level vlans. bonding: Fix stacked device detection in arp monitoring macvlan: Fix lockdep warnings with stacked macvlan devices vlan: Fix lockdep warning with stacked vlan devices. net: Allow for more then a single subclass for netif_addr_lock net: Find the nesting level of a given device by type. ...	2014-05-23 15:29:43 -07:00
Daniel Borkmann	b1fcd35cf5	net: filter: let unattached filters use sock_fprog_kern The sk_unattached_filter_create() API is used by BPF filters that are not directly attached or related to sockets, and are used in team, ptp, xt_bpf, cls_bpf, etc. As such all users do their own internal management of obtaining filter blocks and thus already have them in kernel memory and set up before calling into sk_unattached_filter_create(). As a result, due to __user annotation in sock_fprog, sparse triggers false positives (incorrect type in assignment [different address space]) when filters are set up before passing them to sk_unattached_filter_create(). Therefore, let sk_unattached_filter_create() API use sock_fprog_kern to overcome this issue. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-23 16:48:05 -04:00
Daniel Borkmann	8556ce79d5	net: filter: remove DL macro Lets get rid of this macro. After commit `5bcfedf06f` ("net: filter: simplify label names from jump-table"), labels have become more readable due to omission of BPF_ prefix but at the same time more generic, so that things like `git grep -n` would not find them. As a middle path, lets get rid of the DL macro as it's not strictly needed and would otherwise just hide the full name. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-23 16:48:05 -04:00
Tom Herbert	6b649feafe	l2tp: Add support for zero IPv6 checksums Added new L2TP configuration options to allow TX and RX of zero checksums in IPv6. Default is not to use them. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-23 16:28:53 -04:00
Tom Herbert	1c19448c9b	net: Make enabling of zero UDP6 csums more restrictive RFC 6935 permits zero checksums to be used in IPv6; however, this is recommended only for certain tunnel protocols and does not make checksums completely optional like they are in IPv4. This patch restricts the use of IPv6 zero checksums that was previously introduced. no_check6_tx and no_check6_rx have been added to control the use of checksums in UDP6 RX and TX path. The normal sk_no_check_{rx,tx} settings are not used (this avoids ambiguity when dealing with a dual stack socket). A helper function has been added (udp_set_no_check6) which can be called by tunnel implementations to allow zero checksums (send on the socket, and accept them as valid). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-23 16:28:53 -04:00
Tom Herbert	28448b8045	net: Split sk_no_check into sk_no_check_{rx,tx} Define separate fields in the sock structure for configuring disabling checksums in both TX and RX-- sk_no_check_tx and sk_no_check_rx. The SO_NO_CHECK socket option only affects sk_no_check_tx. Also, removed UDP_CSUM_* defines since they are no longer necessary. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-23 16:28:53 -04:00
Tom Herbert	b26ba202e0	net: Eliminate no_check from protosw It doesn't seem like any protocols are setting anything other than the default, and allowing checksums to be arbitrarily disabled for a whole protocol seems dangerous. This can be done on a per socket basis. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-23 16:28:53 -04:00
Tom Herbert	0f8066bd48	sunrpc: Remove sk_no_check setting Setting sk_no_check to UDP_CSUM_NORCV seems to have no effect. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-23 16:28:53 -04:00
Sucheta Chakraborty	ed616689a3	net-next:v4: Add support to configure SR-IOV VF minimum and maximum Tx rate through ip tool. o min_tx_rate puts lower limit on the VF bandwidth. VF is guaranteed to have a bandwidth of at least this value. max_tx_rate puts cap on the VF bandwidth. VF can have a bandwidth of up to this value. o A new handler set_vf_rate for attr IFLA_VF_RATE has been introduced which takes 4 arguments: netdev, VF number, min_tx_rate, max_tx_rate o ndo_set_vf_rate replaces ndo_set_vf_tx_rate handler. o Drivers that currently implement ndo_set_vf_tx_rate should now call ndo_set_vf_rate instead and reject attempt to set a minimum bandwidth greater than 0 for IFLA_VF_TX_RATE when IFLA_VF_RATE is not yet implemented by driver. o If user enters only one of either min_tx_rate or max_tx_rate, then, userland should read back the other value from driver and set both for IFLA_VF_RATE. Drivers that have not yet implemented IFLA_VF_RATE should always return min_tx_rate as 0 when read from ip tool. o If both IFLA_VF_TX_RATE and IFLA_VF_RATE options are specified, then IFLA_VF_RATE should override. o Idea is to have consistent display of rate values to user. 
o Usage example: - ./ip link set p4p1 vf 0 rate 900 ./ip link show p4p1 32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000 link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 900 (Mbps), max_tx_rate 900Mbps vf 1 MAC f6:c6:7c:3f:3d:6c vf 2 MAC 56:32:43:98:d7:71 vf 3 MAC d6:be:c3:b5:85:ff vf 4 MAC ee:a9:9a:1e:19:14 vf 5 MAC 4a:d0:4c:07:52:18 vf 6 MAC 3a:76:44:93:62:f9 vf 7 MAC 82:e9:e7:e3:15:1a ./ip link set p4p1 vf 0 max_tx_rate 300 min_tx_rate 200 ./ip link show p4p1 32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000 link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 300 (Mbps), max_tx_rate 300Mbps, min_tx_rate 200Mbps vf 1 MAC f6:c6:7c:3f:3d:6c vf 2 MAC 56:32:43:98:d7:71 vf 3 MAC d6:be:c3:b5:85:ff vf 4 MAC ee:a9:9a:1e:19:14 vf 5 MAC 4a:d0:4c:07:52:18 vf 6 MAC 3a:76:44:93:62:f9 vf 7 MAC 82:e9:e7:e3:15:1a ./ip link set p4p1 vf 0 max_tx_rate 600 rate 300 ./ip link show p4p1 32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000 link/ether 00:0e:1e:08:b0:f brd ff:ff:ff:ff:ff:ff vf 0 MAC 3e:a0:ca:bd:ae:5, tx rate 600 (Mbps), max_tx_rate 600Mbps, min_tx_rate 200Mbps vf 1 MAC f6:c6:7c:3f:3d:6c vf 2 MAC 56:32:43:98:d7:71 vf 3 MAC d6:be:c3:b5:85:ff vf 4 MAC ee:a9:9a:1e:19:14 vf 5 MAC 4a:d0:4c:07:52:18 vf 6 MAC 3a:76:44:93:62:f9 vf 7 MAC 82:e9:e7:e3:15:1a Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-23 15:04:02 -04:00
Johan Hedberg	d7b2545023	Bluetooth: Clearly distinguish mgmt LTK type from authenticated property On the mgmt level we have a key type parameter which currently accepts two possible values: 0x00 for unauthenticated and 0x01 for authenticated. However, in the internal struct smp_ltk representation we have an explicit "authenticated" boolean value. To make this distinction clear, add defines for the possible mgmt values and do conversion to and from the internal authenticated value. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-23 11:24:04 -07:00
John W. Linville	5ca2504ea3	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2014-05-23 10:55:58 -04:00
Pravin B Shelar	0c200ef94c	openvswitch: Simplify genetlink code. The following patch gets rid of struct genl_family_and_ops, which is redundant due to changes to struct genl_family. Signed-off-by: Kyle Mestery <mestery@noironetworks.com> Acked-by: Kyle Mestery <mestery@noironetworks.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-05-22 16:27:37 -07:00
Jarno Rajahalme	893f139b9a	openvswitch: Minimize ovs_flow_cmd_new\|set critical sections. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-05-22 16:27:36 -07:00
Jarno Rajahalme	37bdc87ba0	openvswitch: Split ovs_flow_cmd_new_or_set(). Following patch will be easier to reason about with separate ovs_flow_cmd_new() and ovs_flow_cmd_set() functions. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-05-22 16:27:36 -07:00
Jarno Rajahalme	aed067783e	openvswitch: Minimize ovs_flow_cmd_del critical section. ovs_flow_cmd_del() now allocates reply (if needed) after the flow has already been removed from the flow table. If the reply allocation fails, a netlink error is signaled with netlink_set_err(), as is already done in ovs_flow_cmd_new_or_set() in the similar situation. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-05-22 16:27:36 -07:00
Jarno Rajahalme	0e9796b4af	openvswitch: Reduce locking requirements. Reduce and clarify locking requirements for ovs_flow_cmd_alloc_info(), ovs_flow_cmd_fill_info() and ovs_flow_cmd_build_info(). A datapath pointer is available only when holding a lock. Change ovs_flow_cmd_fill_info() and ovs_flow_cmd_build_info() to take a dp_ifindex directly, rather than a datapath pointer that is then (only) used to get the dp_ifindex. This is useful, since the dp_ifindex is available even when the datapath pointer is not, both before and after taking a lock, which makes further critical section reduction possible. Make ovs_flow_cmd_alloc_info() take an 'acts' argument instead of a 'flow' pointer. This allows some future patches to do the allocation before acquiring the flow pointer. The locking requirements after this patch are: ovs_flow_cmd_alloc_info(): May be called without locking, must not be called while holding the RCU read lock (due to memory allocation). If 'acts' belong to a flow in the flow table, however, then the caller must hold ovs_mutex. ovs_flow_cmd_fill_info(): Either ovs_mutex or RCU read lock must be held. ovs_flow_cmd_build_info(): This calls both of the above, so the caller must hold ovs_mutex. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-05-22 16:27:36 -07:00
Jarno Rajahalme	86ec8dbae2	openvswitch: Fix ovs_flow_stats_get/clear RCU dereference. For ovs_flow_stats_get() using ovsl_dereference() was wrong, since flow dumps call this with RCU read lock. ovs_flow_stats_clear() is always called with ovs_mutex, so can use ovsl_dereference(). Also, make the ovs_flow_stats_get() 'flow' argument const to make later patches cleaner. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-05-22 16:27:35 -07:00
Jarno Rajahalme	eb07265904	openvswitch: Fix typo. Incorrect struct name was confusing, even though otherwise inconsequential. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-05-22 16:27:35 -07:00
Jarno Rajahalme	6093ae9aba	openvswitch: Minimize dp and vport critical sections. Move most memory allocations away from the ovs_mutex critical sections. vport allocations still happen while the lock is taken, as changing that would require major refactoring. Also, vports are created very rarely so it should not matter. ovs_dp_cmd_get() now only takes the rcu_read_lock(), rather than ovs_lock(), as nothing needs to be changed. This was done by ovs_vport_cmd_get() already. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-05-22 16:27:35 -07:00
Jarno Rajahalme	56c19868e1	openvswitch: Make flow mask removal symmetric. Masks are inserted when flows are inserted to the table, so it is logical to correspondingly remove masks when flows are removed from the table, in ovs_flow_table_remove(). This allows ovs_flow_free() to be called without locking, which will be used by later patches. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-05-22 16:27:35 -07:00
Jarno Rajahalme	fb5d1e9e12	openvswitch: Build flow cmd netlink reply only if needed. Use netlink_has_listeners() and NLM_F_ECHO flag to determine if a reply is needed or not for OVS_FLOW_CMD_NEW, OVS_FLOW_CMD_SET, or OVS_FLOW_CMD_DEL. Currently, OVS userspace does not request a reply for OVS_FLOW_CMD_NEW, but usually does for OVS_FLOW_CMD_DEL, as stats may have changed. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-05-22 16:27:34 -07:00
Jarno Rajahalme	bb6f9a708d	openvswitch: Clarify locking. Remove unnecessary locking from functions that are always called with appropriate locking. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Thomas Graf <tgraf@redhat.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-05-22 16:27:34 -07:00
Jarno Rajahalme	be52c9e96a	openvswitch: Avoid assigning a NULL pointer to flow actions. Flow SET can accept an empty set of actions, with the intended semantics of leaving existing actions unmodified. This seems to have been broken after OVS 1.7, as we have assigned the flow's actions pointer to NULL in this case, but we never check for the NULL pointer later on. This patch restores the intended behavior and documents it in the include/linux/openvswitch.h. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-05-22 16:27:34 -07:00
Jarno Rajahalme	1139e241ec	openvswitch: Compact sw_flow_key. Minimize padding in sw_flow_key and move 'tp' to the main struct. These changes simplify code when accessing the transport port numbers and the tcp flags, and make the sw_flow_key 8 bytes smaller on 64-bit systems (128->120 bytes). These changes also make the keys for IPv4 packets fit in one cache line. There is a valid concern for safety of packing the struct ovs_key_ipv4_tunnel, as it would be possible to take the address of the tun_id member as a __be64 * which could result in unaligned access in some systems. However: - sw_flow_key itself is 64-bit aligned, so the tun_id within is always 64-bit aligned. - We never make arrays of ovs_key_ipv4_tunnel (which would force every second tun_key to be misaligned). - We never take the address of the tun_id in to a __be64 *. - Wherever we use struct ovs_key_ipv4_tunnel outside the sw_flow_key, it is in stack (on tunnel input functions), where compiler has full control of the alignment. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>	2014-05-22 16:27:34 -07:00
Cong Wang	b6ed549860	batman: fix a bogus warning from batadv_is_on_batman_iface() batman tries to search dev->iflink to check if it's a batman interface, but ->iflink could be 0, which is not a valid ifindex. It should just avoid iflink == 0 case. Reported-by: Jet Chen <jet.chen@intel.com> Tested-by: Jet Chen <jet.chen@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Antonio Quartulli <antonio@open-mesh.com> Cc: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 17:23:00 -04:00
David S. Miller	65db611a5c	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2014-05-22 This is the last ipsec pull request before I leave for a three weeks vacation tomorrow. David, can you please take urgent ipsec patches directly into net/net-next during this time? I'll continue to run the ipsec/ipsec-next trees as soon as I'm back. 1) Simplify the xfrm audit handling, from Tetsuo Handa. 2) Codingstyle cleanup for xfrm_output, from Fabian Frederick. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 16:00:00 -04:00
NeilBrown	ef11ce2487	SUNRPC: track whether a request is coming from a loop-back interface. If an incoming NFS request is coming from the local host, then nfsd will need to perform some special handling. So detect that possibility and make the source visible in rq_local. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-22 15:59:18 -04:00
Li RongQing	fbdc0ad095	ipv4: initialise the itag variable in __mkroute_input The value of itag is a random value from the stack, and may not be initialised by fib_validate_source(), which calls fib_combine_itag(), if CONFIG_IP_ROUTE_CLASSID is not set. This makes the cached dst uncertain. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 15:57:36 -04:00
Trond Myklebust	c789102c20	SUNRPC: Fix a module reference leak in svc_handle_xprt If the accept() call fails, we need to put the module reference. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-22 15:57:22 -04:00
Chuck Lever	16e4d93f6d	NFSD: Ignore client's source port on RDMA transports An NFS/RDMA client's source port is meaningless for RDMA transports. The transport layer typically sets the source port value on the connection to a random ephemeral port. Currently, NFS server administrators must specify the "insecure" export option to enable clients to access exports via RDMA. But this means NFS clients can access such an export via IP using an ephemeral port, which may not be desirable. This patch eliminates the need to specify the "insecure" export option to allow NFS/RDMA clients access to an export. BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=250 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-22 15:55:48 -04:00
Dan Carpenter	b3f7a7b48f	ieee802154: missing put_dev() on error We should call put_dev() on the error path here. Fixes: `3e9c156e2c` ('ieee802154: add netlink interfaces for llsec') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 15:54:45 -04:00
Cong Wang	b1282726d5	bridge: make br_device_notifier static Merge net/bridge/br_notify.c into net/bridge/br.c, since it has only br_device_event() and br.c is small. Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 15:33:47 -04:00
Chen Gang	5c4a43b024	net/dccp/timer.c: use 'u64' instead of 's64' to avoid compiler's warning 'dccp_timestamp_seed' is initialized once by ktime_get_real() in dccp_timestamping_init(). It is always less than ktime_get_real() in dccp_timestamp(). Then, ktime_us_delta() in dccp_timestamp() will always return positive number. So can use manual type cast to let compiler and do_div() know about it to avoid warning. The related warning (with allmodconfig under unicore32): CC [M] net/dccp/timer.o net/dccp/timer.c: In function ‘dccp_timestamp’: net/dccp/timer.c:285: warning: comparison of distinct pointer types lacks a cast Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 15:31:45 -04:00
Phoebe Buckheister	53819a6ced	mac802154: llsec: correctly lookup implicit-indexed keys Key id comparison for type 1 keys (implicit source, with index) should return true if mode and id are equal, not false. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 15:27:32 -04:00
Phoebe Buckheister	62e9c117ee	mac802154: llsec: fold useless return value check llsec_do_encrypt will never return a positive value, so the restriction to 0-or-negative on return is useless. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 15:24:13 -04:00
Phoebe Buckheister	6f3eabcd04	mac802154: llsec: fix incorrect lock pairing In encrypt, sec->lock is taken with read_lock_bh, so in the error path, we must read_unlock_bh. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 15:24:13 -04:00
Michal Kubeček	da08143b85	vlan: more careful checksum features handling When combining real_dev's features and vlan_features, simple bitwise AND is used. This doesn't work well for checksum offloading features as if one set has NETIF_F_HW_CSUM and the other NETIF_F_IP_CSUM and/or NETIF_F_IPV6_CSUM, we end up with no checksum offloading. However, from the logical point of view (how can_checksum_protocol() works), NETIF_F_HW_CSUM contains the functionality of NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM so that the result should be IP/IPV6. Add helper function netdev_intersect_features() implementing this logic and use it in vlan_dev_fix_features(). Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 15:07:23 -04:00
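The feature-intersection rule described in the vlan checksum commit above can be sketched in userspace C as follows. This is only an illustration of the idea, not the kernel's netdev_intersect_features(): the flag names, their bit values and the function name intersect_features() are hypothetical stand-ins chosen for this example.

```c
#include <stdint.h>

/* Hypothetical bit values for illustration; the kernel's real flags differ. */
#define F_IP_CSUM   (1u << 0)  /* can checksum TCP/UDP over IPv4 */
#define F_HW_CSUM   (1u << 1)  /* can checksum any protocol */
#define F_IPV6_CSUM (1u << 2)  /* can checksum TCP/UDP over IPv6 */
#define F_SG        (1u << 3)  /* unrelated feature, intersected with plain AND */

#define F_PROTO_CSUM (F_IP_CSUM | F_IPV6_CSUM)

/*
 * Intersect two feature sets, treating the generic checksum flag as
 * implying both protocol-specific checksum flags.
 */
static uint32_t intersect_features(uint32_t a, uint32_t b)
{
    /* Expand the generic flag into the capabilities it implies. */
    if (a & F_HW_CSUM)
        a |= F_PROTO_CSUM;
    if (b & F_HW_CSUM)
        b |= F_PROTO_CSUM;

    a &= b;

    /* If both sides had the generic flag, drop the redundant specific ones. */
    if (a & F_HW_CSUM)
        a &= ~F_PROTO_CSUM;

    return a;
}
```

With these assumed flags, a plain `F_HW_CSUM & F_IP_CSUM` is zero, whereas intersect_features(F_HW_CSUM, F_IP_CSUM) keeps F_IP_CSUM, which is the behaviour the commit message argues for.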
Ezequiel Garcia	e876f208af	net: Add a software TSO helper API Although the implementation probably needs a lot of work, this initial API allows to implement software TSO in mvneta and mv643xx_eth drivers in a not so intrusive way. Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 14:57:15 -04:00
John W. Linville	40a10fd740	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next	2014-05-22 13:58:36 -04:00
John W. Linville	99abe65ff1	Merge tag 'nfc-next-3.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next Samuel Ortiz <sameo@linux.intel.com> says: "NFC: 3.16: First pull request This is the NFC pull request for 3.16. We have: - STMicroelectronics st21nfca support. The st21nfca is an HCI chipset and thus relies on the HCI stack. This submission provides support for tag reader/writer mode (including Type 5) and device tree bindings. - PM runtime support and a bunch of bug fixes for TI's trf7970a. - Device tree support for NXP's pn544. Legacy platform data support is obviously kept intact. - NFC Tag type 4B support to the NFC Digital stack. - SOCK_RAW type support to the raw NFC socket, and allow NCI sniffing from that. This can be extended to report HCI frames and also proprietary ones like e.g. the pn533 ones." Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-05-22 13:56:46 -04:00
David S. Miller	8af750d739	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nftables Pablo Neira Ayuso says: ==================== Netfilter/nftables updates for net-next The following patchset contains Netfilter/nftables updates for net-next, most relevantly they are: 1) Add set element update notification via netlink, from Arturo Borrero. 2) Put all object updates in one single message batch that is sent to kernel-space. Before this patch only rules were included in the batch. This series also introduces the generic transaction infrastructure so updates to all objects (tables, chains, rules and sets) are applied in an all-or-nothing fashion, this series from me. 3) Defer release of objects via call_rcu to reduce the time required to commit changes. The assumption is that all objects are destroyed in reverse order to ensure that dependencies between them are fulfilled (ie. rules and sets are destroyed first, then chains, and finally tables). 4) Allow to match by bridge port name, from Tomasz Bursztyka. This series includes two patches to prepare this new feature. 5) Implement the proper set selection based on the characteristics of the data. The new infrastructure also allows you to specify your preferences in terms of memory and computational complexity so the underlying set type is also selected according to your needs, from Patrick McHardy. 6) Several cleanup patches for nft expressions, including one minor possible compilation breakage due to missing mark support, also from Patrick. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 12:06:23 -04:00
Neal Cardwell	ca8a226343	tcp: make cwnd-limited checks measurement-based, and gentler Experience with the recent `e114a710aa` ("tcp: fix cwnd limited checking to improve congestion control") has shown that there are common cases where that commit can cause cwnd to be much larger than necessary. This leads to TSO autosizing cooking skbs that are too large, among other things. The main problems seemed to be: (1) That commit attempted to predict the future behavior of the connection by looking at the write queue (if TSO or TSQ limit sending). That prediction sometimes overestimated future outstanding packets. (2) That commit always allowed cwnd to grow to twice the number of outstanding packets (even in congestion avoidance, where this is not needed). This commit improves both of these, by: (1) Switching to a measurement-based approach where we explicitly track the largest number of packets in flight during the past window ("max_packets_out"), and remember whether we were cwnd-limited at the moment we finished sending that flight. (2) Only allowing cwnd to grow to twice the number of outstanding packets ("max_packets_out") in slow start. In congestion avoidance mode we now only allow cwnd to grow if it was fully utilized. Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-22 12:04:49 -04:00
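The measurement-based check described in the tcp cwnd commit above can be sketched in userspace C as follows. This is a simplified model, not the kernel code: struct cwnd_state and both function names are hypothetical, the test "snd_cwnd <= snd_ssthresh" is assumed here as the definition of slow start, and resetting max_packets_out at the start of each window is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down connection state; names follow the commit text. */
struct cwnd_state {
    uint32_t snd_cwnd;        /* congestion window, in packets */
    uint32_t snd_ssthresh;    /* slow-start threshold, in packets */
    uint32_t max_packets_out; /* largest flight seen during the past window */
    bool     is_cwnd_limited; /* were we cwnd-limited when that flight was sent? */
};

/* After sending a flight, remember it if it is the largest one so far. */
static void cwnd_record_flight(struct cwnd_state *s, uint32_t packets_out,
                               bool cwnd_limited)
{
    if (packets_out > s->max_packets_out) {
        s->max_packets_out = packets_out;
        s->is_cwnd_limited = cwnd_limited;
    }
}

/* Decide whether the congestion window is allowed to grow. */
static bool cwnd_may_grow(const struct cwnd_state *s)
{
    /* Slow start (assumed test): allow up to twice the measured flight. */
    if (s->snd_cwnd <= s->snd_ssthresh)
        return s->snd_cwnd < 2 * s->max_packets_out;

    /* Congestion avoidance: grow only if the window was fully used. */
    return s->is_cwnd_limited;
}
```

The point of the sketch is that both decisions rest on what was actually measured in flight, rather than on a prediction derived from the write queue.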
Emmanuel Grumbach	67af981153	cfg80211: allow RSSI compensation Channels in 2.4GHz band overlap, this means that if we send a probe request on channel 1 and then move to channel 2, we will hear the probe response on channel 2. In this case, the RSSI will be lower than if we had heard it on the channel on which it was sent (1 in this case). The firmware / low level driver can parse the channel in the DS IE or HT IE and compensate the RSSI so that it will still have a valid value even if we heard the frame on an adjacent channel. This can be done up to a certain offset. Add this offset as a configuration for the low level driver. A low level driver that can compensate the low RSSI in this case should assign the maximal offset for which the RSSI value is still valid. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-22 09:58:49 +02:00
Eric Dumazet	4de462ab63	ipv6: gro: fix CHECKSUM_COMPLETE support When GRE support was added in linux-3.14, CHECKSUM_COMPLETE handling broke on GRE+IPv6 because we did not update/use the appropriate csum : GRO layer is supposed to use/update NAPI_GRO_CB(skb)->csum instead of skb->csum Tested using a GRE tunnel and IPv6 traffic. GRO aggregation now happens at the first level (ethernet device) instead of being done in gre tunnel. Native IPv6+TCP is still properly aggregated. Fixes: `bf5a755f5e` ("net-gre-gro: Add GRE support to the GRO stack") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jerry Chu <hkchu@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-21 17:18:47 -04:00
Alexei Starovoitov	5fe821a9de	net: filter: cleanup invocation of internal BPF Kernel API for classic BPF socket filters is: sk_unattached_filter_create() - validate classic BPF, convert, JIT SK_RUN_FILTER() - run it sk_unattached_filter_destroy() - destroy socket filter Cleanup internal BPF kernel API as follows: sk_filter_select_runtime() - final step of internal BPF creation. Try to JIT internal BPF program, if JIT is not available select interpreter SK_RUN_FILTER() - run it sk_filter_free() - free internal BPF program Disallow direct calls to BPF interpreter. Execution of the BPF program should be done with SK_RUN_FILTER() macro. Example of internal BPF create, run, destroy: struct sk_filter *fp; fp = kzalloc(sk_filter_size(prog_len), GFP_KERNEL); memcpy(fp->insni, prog, prog_len * sizeof(fp->insni[0])); fp->len = prog_len; sk_filter_select_runtime(fp); SK_RUN_FILTER(fp, ctx); sk_filter_free(fp); Sockets, seccomp, testsuite, tracing are using different ways to populate sk_filter, so first steps of program creation are not common. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-21 17:07:17 -04:00
Cong Wang	bf63ac73b3	net_sched: fix an oops in tcindex filter Kelly reported the following crash: IP: [<ffffffff817a993d>] tcf_action_exec+0x46/0x90 PGD 3009067 PUD `300c067` PMD 11ff30067 PTE 800000011634b060 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC CPU: 1 PID: 639 Comm: dhclient Not tainted 3.15.0-rc4+ #342 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff8801169ecd00 ti: ffff8800d21b8000 task.ti: ffff8800d21b8000 RIP: 0010:[<ffffffff817a993d>] [<ffffffff817a993d>] tcf_action_exec+0x46/0x90 RSP: 0018:ffff8800d21b9b90 EFLAGS: 00010283 RAX: 00000000ffffffff RBX: ffff88011634b8e8 RCX: ffff8800cf7133d8 RDX: ffff88011634b900 RSI: ffff8800cf7133e0 RDI: ffff8800d210f840 RBP: ffff8800d21b9bb0 R08: ffffffff8287bf60 R09: 0000000000000001 R10: ffff8800d2b22b24 R11: 0000000000000001 R12: ffff8800d210f840 R13: ffff8800d21b9c50 R14: ffff8800cf7133e0 R15: ffff8800cad433d8 FS: 00007f49723e1840(0000) GS:ffff88011a800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff88011634b8f0 CR3: 00000000ce469000 CR4: 00000000000006e0 Stack: ffff8800d2170188 ffff8800d210f840 ffff8800d2171b90 0000000000000000 ffff8800d21b9be8 ffffffff817c55bb ffff8800d21b9c50 ffff8800d2171b90 ffff8800d210f840 ffff8800d21b0300 ffff8800d21b9c50 ffff8800d21b9c18 Call Trace: [<ffffffff817c55bb>] tcindex_classify+0x88/0x9b [<ffffffff817a7f7d>] tc_classify_compat+0x3e/0x7b [<ffffffff817a7fdf>] tc_classify+0x25/0x9f [<ffffffff817b0e68>] htb_enqueue+0x55/0x27a [<ffffffff817b6c2e>] dsmark_enqueue+0x165/0x1a4 [<ffffffff81775642>] __dev_queue_xmit+0x35e/0x536 [<ffffffff8177582a>] dev_queue_xmit+0x10/0x12 [<ffffffff818f8ecd>] packet_sendmsg+0xb26/0xb9a [<ffffffff810b1507>] ? 
__lock_acquire+0x3ae/0xdf3 [<ffffffff8175cf08>] __sock_sendmsg_nosec+0x25/0x27 [<ffffffff8175d916>] sock_aio_write+0xd0/0xe7 [<ffffffff8117d6b8>] do_sync_write+0x59/0x78 [<ffffffff8117d84d>] vfs_write+0xb5/0x10a [<ffffffff8117d96a>] SyS_write+0x49/0x7f [<ffffffff8198e212>] system_call_fastpath+0x16/0x1b This is because we memcpy struct tcindex_filter_result, which contains struct tcf_exts; obviously struct list_head cannot be simply copied. This is a regression introduced by commit `33be627159` (net_sched: act: use standard struct list_head). It's not very easy to fix it as the code is a mess: if (old_r) memcpy(&cr, r, sizeof(cr)); else { memset(&cr, 0, sizeof(cr)); tcf_exts_init(&cr.exts, TCA_TCINDEX_ACT, TCA_TCINDEX_POLICE); } ... tcf_exts_change(tp, &cr.exts, &e); ... memcpy(r, &cr, sizeof(cr)); After this change, the above code should be equivalent to: tcindex_filter_result_init(&cr); if (old_r) cr.res = r->res; ... if (old_r) tcf_exts_change(tp, &r->exts, &e); else tcf_exts_change(tp, &cr.exts, &e); ... r->res = cr.res; since there is no need to copy struct tcf_exts. It also fixes other places that zero structs containing struct tcf_exts. Fixes: commit `33be627159` (net_sched: act: use standard struct list_head) Reported-by: Kelly Anderson <kelly@xilka.com> Tested-by: Kelly Anderson <kelly@xilka.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-21 16:47:13 -04:00
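Why a structure that embeds a list head cannot simply be copied with memcpy(), as the tcindex commit above explains, can be shown with a small userspace C sketch. The list type is modeled on the kernel's struct list_head, but struct result and the helper names are hypothetical and exist only for this example.

```c
#include <stdbool.h>
#include <string.h>

/* Minimal circular doubly linked list head, modeled on struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* An empty list is one whose head points back at itself. */
static bool list_is_empty(const struct list_head *h)
{
    return h->next == h;
}

/* Hypothetical stand-in for a filter result that embeds a list head. */
struct result {
    int classid;
    struct list_head actions;
};

/* Buggy: the byte-wise copy leaves dst->actions pointing into src. */
static void result_copy_bytes(struct result *dst, const struct result *src)
{
    memcpy(dst, src, sizeof(*dst));
}

/* Safer: copy the plain fields and give dst its own, valid list head. */
static void result_copy_fields(struct result *dst, const struct result *src)
{
    dst->classid = src->classid;
    list_init(&dst->actions);
}
```

After result_copy_bytes() the copy's list head still refers to the source object, so walking or modifying it touches memory that may since have been changed or freed. The field-wise variant here simply starts with an empty list; the actual fix in the commit goes further and operates on the existing r->exts instead of copying it at all.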
Li RongQing	1495664355	ipv6: slight optimization in ip6_dst_gc entries is always greater than rt_max_size here, since if entries is less than rt_max_size, the fib6_run_gc function will be skipped Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-21 15:52:23 -04:00
Tom Gundersen	f98f89a010	net: tunnels - enable module autoloading Enable the module alias hookup to allow tunnel modules to be autoloaded on demand. This is in line with how most other netdev kinds work, and will allow userspace to create tunnels without having CAP_SYS_MODULE. Signed-off-by: Tom Gundersen <teg@jklm.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-21 15:46:52 -04:00
Arik Nemtsov	4d3df547e8	cfg80211: don't set reg timeout for user-handled hint Otherwise every "indoor" setting by usermode will cause a regdomain reset. Acked-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-21 09:15:18 +02:00
Antonio Quartulli	7406353d43	cfg80211: implement cfg80211_get_station cfg80211 API Implement and export the new cfg80211_get_station() API. This utility can be used by other kernel modules to obtain detailed information about a given wireless station. It will be in particular useful to batman-adv which will implement a wireless rate based metric. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-21 09:15:17 +02:00
Antonio Quartulli	cca674d47e	mac80211: export the expected throughput Add get_expected_throughput() API to mac80211 so that each driver can implement its own version based on the RC algorithm they are using (might be using an HW RC algo). The API returns a value expressed in Kbps. Also, add the new get_expected_throughput() member to the rate_control_ops structure in order to be able to query the RC algorithm (this patch provides an implementation of this API for both minstrel and minstrel_ht). The related member in the station_info object is now filled accordingly when dumping a station. Cc: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-21 09:15:16 +02:00
Steffen Klassert	78ff4be45a	ip_tunnel: Initialize the fallback device properly We need to initialize the fallback device to have a correct mtu set on this device. Otherwise the mtu is set to null and the device is unusable. Fixes: `fd58156e45` ("IPIP: Use ip-tunneling code.") Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-21 02:08:32 -04:00
David S. Miller	d050de607f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter/nftables fixes for net The following patchset contains nftables fixes for your net tree, they are: 1) Fix crash when using the goto action in a rule by making sure that we always fall back on the base chain. Otherwise, this may try to access the counter memory area of non-base chains, which does not exist. 2) Fix several aspects of the rule tracing that are currently broken: * Reset rule number counter after goto/jump action, otherwise the tracing reports a bogus rule number. * Fix tracing of the goto action. * Fix bogus rule number counter after goto. * Fix missing return trace after finishing the walk through the non-base chain. * Fix missing trace when matching non-terminal rule. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-21 01:24:19 -04:00
Johan Hedberg	1cc6114402	Bluetooth: Update smp_confirm to return a response code Now that smp_confirm() is called "inline" we can have it return a response code and have the sending of it be done in the shared place for command handlers. One exception is when we're entering smp.c from mgmt.c when user space responds to authentication, in which case we still need our own code to call smp_failure(). Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-20 08:44:14 -07:00
Johan Hedberg	861580a970	Bluetooth: Update smp_random to return a response code Since we're now calling smp_random() "inline" we can have it directly return a response code and have the shared command handler send the response. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-20 08:44:14 -07:00
Johan Hedberg	4a74d65868	Bluetooth: Rename smp->smp_flags to smp->flags There's no reason to have "smp" in this variable name since it is already part of the SMP struct which provides sufficient context. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-20 08:44:14 -07:00
Johan Hedberg	9dd4dd275f	Bluetooth: Remove unnecessary work structs from SMP code When the SMP code was initially created (mid-2011) parts of the Bluetooth subsystem were still not converted to use workqueues. This meant that the crypto calls, which could sleep, couldn't be called directly. Because of this the "confirm" and "random" work structs were introduced. These days the entire Bluetooth subsystem runs through workqueues which makes these structs unnecessary. This patch removes them and converts the calls to queue them to use direct function calls instead. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-20 08:44:13 -07:00
Johan Hedberg	1ef35827a9	Bluetooth: Fix setting initial local auth_req value There is no reason to have the initial local value conditional to whether the remote value has bonding set or not. We can either way start off with the value we received. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-20 08:44:12 -07:00
Johan Hedberg	4bc58f51e1	Bluetooth: Make SMP context private to smp.c There are no users of the smp_chan struct outside of smp.c so move it away from smp.h. The addition of the l2cap.h include to hci_core.c, hci_conn.c and mgmt.c is something that should have been there already previously to avoid warnings of undeclared struct l2cap_conn, but the compiler warning was apparently shadowed away by the mention of l2cap_conn in the struct smp_chan definition. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-05-20 08:44:11 -07:00
Antonio Quartulli	867d849fc8	cfg80211: export expected throughput through get_station() Users may need information about the expected throughput towards a given peer. This value is supposed to consider the size overhead generated by the 802.11 header. This value is exported in kbps through the get_station() API by including it into the station_info object. Moreover, it is sent to user space when replying to the nl80211 GET_STATION command. This information will be useful to the batman-adv module which will use it for its new metric computation. Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-20 15:13:32 +02:00
Hiren Tandel	0515829642	NFC: NCI: Send all NCI frames to raw sockets So that anyone listening on SOCKPROTO_RAW for raw frames will get all NCI frames, in both directions. This actually implements userspace NFC NCI sniffing. It's now up to userspace to decode those frames. Signed-off-by: Hiren Tandel <hirent@marvell.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-05-20 00:23:59 +02:00
Hiren Tandel	57be1f3f3e	NFC: Add RAW socket type support for SOCKPROTO_RAW This allows for a more generic NFC sniffing by using SOCKPROTO_RAW SOCK_RAW to read RAW NFC frames. This is for sniffing anything but LLCP (HCI, NCI, etc...). Signed-off-by: Hiren Tandel <hirent@marvell.com> Signed-off-by: Rahul Tank <rahult@marvell.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-05-20 00:06:04 +02:00
Hiren Tandel	c79d9f9ef8	NFC: NCI: No need to reverse ATR_RES Response ATR_RES response received within Activation Parameters is already in correct order. Reversing it fails LLCP magic number check and so P2P functionality fails. Signed-off-by: Hiren Tandel <hirent@marvell.com> Signed-off-by: Rahul Tank <rahult@marvell.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-05-19 23:58:08 +02:00
Mark A. Greer	4b8b6267be	NFC: digital: Handle multiple SENSF_REQ frames According to section 5.15.1.3 of the NFC Activity Specification, multiple SENSF_REQ commands can be received by a target before it receives an ATR_REQ command. To handle this, add a routine that checks whether a SENSF_REQ or ATR_REQ has been received. If it's a SENSF_REQ, respond appropriately and continue waiting for an ATR_REQ. If it's an ATR_REQ, handle it as before. CC: Thierry Escande <thierry.escande@linux.intel.com> Signed-off-by: Mark A. Greer <mgreer@animalcreek.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-05-19 23:52:40 +02:00
Mark A. Greer	96e829b433	NFC: digital: SENSF_RES excludes RD when SENSF_REQ RC is zero The check in digital_tg_send_sensf_res() that excludes the 'RD' field from the SENSF_RES is inverted. The 'RD' field should be excluded when the SENSF_REQ 'RC' field is equal to DIGITAL_SENSF_REQ_RC_NONE instead of when it's not equal. This is described in section 6.6.2.11 of the NFC Digital Specification. CC: Thierry Escande <thierry.escande@linux.intel.com> Signed-off-by: Mark A. Greer <mgreer@animalcreek.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2014-05-19 23:52:37 +02:00
John W. Linville	20b4f9c73f	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth	2014-05-19 16:34:27 -04:00
Johannes Berg	922bd80fc3	cfg80211: constify wowlan/coalesce mask/pattern pointers This requires changing the nl80211 parsing code a bit to use intermediate pointers for the allocation, but clarifies the API towards the drivers. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-19 18:06:50 +02:00
Johannes Berg	c1e5f4714d	cfg80211: constify more pointers in the cfg80211 API This also propagates through the drivers. The orinoco driver uses the cfg80211 API structs for internal bookkeeping, and so needs a (void *) cast that removes the const - but that's OK because it allocates those pointers. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-19 17:53:16 +02:00
Johannes Berg	3b3a0162fa	cfg80211: constify MAC addresses in cfg80211 ops This propagates through all the drivers and mac80211. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-19 17:34:42 +02:00
Johannes Berg	00591cea31	mac80211: minstrel-ht: small clarifications Antonio and I were looking over this code and some things didn't immediately make sense, so we came up with two small clarifications. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-05-19 14:30:37 +02:00
Pablo Neira Ayuso	c7c32e72cb	netfilter: nf_tables: defer all object release via rcu Now that all objects are released in the reverse order via the transaction infrastructure, we can enqueue the release via call_rcu to save one synchronize_rcu. For small rule-sets loaded via nft -f, it now takes around 50ms less here. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:13 +02:00
Pablo Neira Ayuso	128ad3322b	netfilter: nf_tables: remove skb and nlh from context structure Instead of caching the original skbuff that contains the netlink messages, this stores the netlink message sequence number, the netlink portID and the report flag. This helps to prepare the introduction of the object release via call_rcu. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:13 +02:00
Pablo Neira Ayuso	35151d840c	netfilter: nf_tables: simplify nf_tables_*_notify Now that all these functions are called from the commit path, we can pass the context structure to reduce the amount of parameters in all of the nf_tables_*_notify functions. This patch also removes unneeded branches to check for skb, nlh and net that should be always set in the context structure. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:12 +02:00
Pablo Neira Ayuso	60319eb1ca	netfilter: nf_tables: use new transaction infrastructure to handle elements Leave the set content in consistent state if we fail to load the batch. Use the new generic transaction infrastructure to achieve this. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:12 +02:00
Pablo Neira Ayuso	55dd6f9307	netfilter: nf_tables: use new transaction infrastructure to handle table This patch speeds up rule-set updates and it also provides a way to revert updates and leave things in consistent state in case that the batch needs to be aborted. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:12 +02:00
Pablo Neira Ayuso	e1aaca93ee	netfilter: nf_tables: pass context to nf_tables_updtable() So nf_tables_updtable() only takes one single parameter. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:11 +02:00
Pablo Neira Ayuso	f75edf5e9c	netfilter: nf_tables: disabling table hooks always succeeds nf_tables_table_disable() always succeeds, make this function void. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:11 +02:00
Pablo Neira Ayuso	91c7b38dc9	netfilter: nf_tables: use new transaction infrastructure to handle chain This patch speeds up rule-set updates and it also introduces a way to revert chain updates if the batch is aborted. The idea is to store the changes in the transaction to apply that in the commit step. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:11 +02:00
Pablo Neira Ayuso	ff3cd7b3c9	netfilter: nf_tables: refactor chain statistic routines Add new routines to encapsulate chain statistics allocation and replacement. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:11 +02:00
Pablo Neira Ayuso	958bee14d0	netfilter: nf_tables: use new transaction infrastructure to handle sets This patch reworks the nf_tables API so set updates are included in the same batch that contains rule updates. This speeds up rule-set updates since we skip a dialog of four messages between kernel and user-space (two on each direction), from: 1) create the set and send netlink message to the kernel 2) process the response from the kernel that contains the allocated name. 3) add the set elements and send netlink message to the kernel. 4) process the response from the kernel (to check for errors). To: 1) add the set to the batch. 2) add the set elements to the batch. 3) add the rule that points to the set. 4) send batch to the kernel. This also introduces an internal set ID (NFTA_SET_ID) that is unique in the batch so set elements and rules can refer to new sets. Backward compatibility has been only retained in userspace, this means that new nft versions can talk to the kernel both in the new and the old fashion. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso	b380e5c733	netfilter: nf_tables: add message type to transactions The patch adds message type to the transaction to simplify the commit and the abort routines. Yet another step forward in the generalisation of the transaction infrastructure. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso	37082f930b	netfilter: nf_tables: relocate commit and abort routines in the source file Move the commit and abort routines to the bottom of the source code file. This change is required by the follow up patches that add the set, chain and table transaction support. This patch is just a cleanup to access several functions without having to declare their prototypes. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso	1081d11b08	netfilter: nf_tables: generalise transaction infrastructure This patch generalises the existing rule transaction infrastructure so it can be used to handle set, table and chain object transactions as well. The transaction provides a data area that stores private information depending on the transaction type. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso	7c95f6d866	netfilter: nf_tables: deconstify table and chain in context structure The new transaction infrastructure updates the family, table and chain objects in the context structure, so let's deconstify them. While at it, move the context structure initialization routine to the top of the source file as it will be also used from the table and chain routines. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-05-19 12:06:09 +02:00
Oliver Hartkopp	45c700291a	can: add hash based access to single EFF frame filters In contrast to the direct access to the single SFF frame filters (which are indexed by the SFF CAN ID itself) the single EFF frame filters are arranged in a single linked hlist. To reduce the hlist traversal in the case of many filter subscriptions a hash based access is introduced for single EFF filters. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2014-05-19 09:38:24 +02:00
Oliver Hartkopp	e3d3917f3d	can: proc: make array printing function independent from sff frames The can_rcvlist_sff_proc_show_one() function which prints the array of filters for the single SFF CAN identifiers is prepared to be used by a second caller. Therefore it is also renamed to properly describe its future functionality. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2014-05-19 09:38:24 +02:00
David S. Miller	b6052af61a	Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge Included changes: - fix codestyle to respect new checkpatch warnings - increase internal version number	2014-05-18 21:27:09 -04:00
Manuel Schölling	71fd762f2e	net: rds: Use time_after() for time comparison To be future-proof and for better readability the time comparisons are modified to use time_after() instead of raw math. Signed-off-by: Manuel Schölling <manuel.schoelling@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-18 21:24:52 -04:00
stephen hemminger	614d056c8e	ipv4: minor spelling fix Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-18 21:10:29 -04:00
stephen hemminger	025559eec8	bridge: fix spelling of promiscuous Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-18 21:10:08 -04:00
Ben Hutchings	61d88c6811	ethtool: Disallow ETHTOOL_SRSSH with both indir table and hash key unchanged This would be a no-op, so there is no reason to request it. This also allows conversion of the current implementations of ethtool_ops::{get,set}_rxfh_indir to ethtool_ops::{get,set}_rxfh with no change other than their parameters. Signed-off-by: Ben Hutchings <ben@decadent.org.uk>	2014-05-19 01:29:42 +01:00
Ben Hutchings	7455fa2422	ethtool: Name the 'no change' value for setting RSS hash key but not indir table We usually allocate special values of u32 fields starting from the top down, so also change the value to 0xffffffff. As these operations haven't been included in a stable release yet, it's not too late to change. Signed-off-by: Ben Hutchings <ben@decadent.org.uk>	2014-05-19 01:18:19 +01:00
Ben Hutchings	fb95cd8d14	ethtool: Return immediately on error in ethtool_copy_validate_indir() We must return -EFAULT immediately rather than continuing into the loop. Similarly, we may as well return -EINVAL directly. Signed-off-by: Ben Hutchings <ben@decadent.org.uk>	2014-05-19 01:17:32 +01:00
Alexei Starovoitov	d4f0e0958d	net: bridge: fix build fix build when BRIDGE_VLAN_FILTERING is not set Fixes: `2796d0c648` ("bridge: Automatically manage port promiscuous mode") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-18 20:09:50 -04:00
Trond Myklebust	7a9a7b774f	SUNRPC: Fix a module reference issue in rpcsec_gss We're not taking a reference in the case where _gss_mech_get_by_pseudoflavor loops without finding the correct rpcsec_gss flavour, so why are we releasing it? Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-05-18 13:47:14 -04:00
Simon Wunderlich	871d3d9fdf	batman-adv: Start new development cycle Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2014-05-18 15:04:00 +02:00
Antonio Quartulli	2b64df2058	batman-adv: remove semi-colon after macro definition Reported by checkpatch with the following warning: "WARNING: macros should not use a trailing semicolon" Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2014-05-18 15:04:00 +02:00
Antonio Quartulli	f138694b15	batman-adv: add blank line between declarations and the rest of the code Reported by checkpatch with the following message: "WARNING: Missing a blank line after declarations" Signed-off-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>	2014-05-18 15:03:52 +02:00
Vlad Yasevich	44a4085538	bonding: Fix stacked device detection in arp monitoring Prior to commit `fbd929f2dc` bonding: support QinQ for bond arp interval the arp monitoring code allowed for proper detection of devices stacked on top of vlans. Since the above commit, the code can still detect a device stacked on top of single vlan, but not a device stacked on top of Q-in-Q configuration. The search will only set the inner vlan tag if the route device is the vlan device. However, this is not always the case, as it is possible to extend the stacked configuration. With this patch it is possible to provision devices on top Q-in-Q vlan configuration that should be used as a source of ARP monitoring information. For example: ip link add link bond0 vlan10 type vlan proto 802.1q id 10 ip link add link vlan10 vlan100 type vlan proto 802.1q id 100 ip link add link vlan100 type macvlan Note: This patch limits the number of stacked VLANs to 2, just like before. The original, however had another issue in that if we had more than 2 levels of VLANs, we would end up generating incorrectly tagged traffic. This is no longer possible. Fixes: `fbd929f2dc` (bonding: support QinQ for bond arp interval) CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@redhat.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Ding Tianhong <dingtianhong@huawei.com> CC: Patric McHardy <kaber@trash.net> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 22:29:05 -04:00
Vlad Yasevich	d38569ab2b	vlan: Fix lockdep warning with stacked vlan devices. This reverts commit `dc8eaaa006`. vlan: Fix lockdep warning when vlan dev handle notification Instead we use the new API to find the lock subclass of our vlan device. This way we can support configurations where vlans are interspersed with other devices: bond -> vlan -> macvlan -> vlan Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 22:14:49 -04:00
Vlad Yasevich	4085ebe8c3	net: Find the nesting level of a given device by type. Multiple devices in the kernel can be stacked/nested and they need to know their nesting level for the purposes of lockdep. This patch provides a generic function that determines a nesting level of a particular device by its type (ex: vlan, macvlan, etc). We only care about nesting of the same type of devices. For example: eth0 <- vlan0.10 <- macvlan0 <- vlan1.20 The nesting level of vlan1.20 would be 1, since there is another vlan in the stack under it. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 22:14:49 -04:00
Thomas Graf	97dc48e220	pktgen: Use seq_puts() where seq_printf() is not needed Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:30:30 -04:00
Eric Dumazet	29e9824278	net: gro: make sure skb->cb[] initial content has not to be zero Starting from linux-3.13, GRO attempts to build full size skbs. Problem is the commit assumed one particular field in skb->cb[] was clean, but it is not the case on some stacked devices. Timo reported a crash in case traffic is decrypted before reaching a GRE device. Fix this by initializing NAPI_GRO_CB(skb)->last at the right place, this also removes one conditional. Thanks a lot to Timo for providing full reports and bisecting this. Fixes: `8a29111c7c` ("net: gro: allow to build full sized skb") Bisected-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:24:54 -04:00
Phoebe Buckheister	f0f77dc6be	ieee802154, mac802154: implement devkey record option The 802.15.4-2011 standard states that for each key, a list of devices that use this key shall be kept. Previous patches have only considered two options: * a device "uses" (or may use) all keys, rendering the list useless * a device is restricted to a certain set of keys Another option would be that a device may use all keys, but need not do so, and we are interested in the actual set of keys the device uses. Recording keys used by any given device may have a noticeable performance impact and might not be needed as often. The common case, in which a device will not switch keys too often, should still perform well. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-16 17:23:42 -04:00

33558 Commits