Add a flag attribute to use in associations, for tagging the target
connection as supporting RRM. It is the responsibility of upper
layers to set this flag only if both the underlying device, and the
target network indeed support RRM.
To be used in ASSOCIATE and CONNECT commands.
Signed-off-by: Assaf Krauss <assaf.krauss@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Our legal structure changed at some point (see wikipedia), but
we forgot to immediately switch over to the new copyright
notice.
For files that we have modified in the time since the change,
add the proper copyright notice now.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Our legal structure changed at some point (see wikipedia), but
we forgot to immediately switch over to the new copyright
notice.
For files that we have modified in the time since the change,
add the proper copyright notice now.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This patch adds a new sysctl_mld_qrv knob to configure the mldv1/v2 query
robustness variable. It specifies how many retransmit of unsolicited mld
retransmit should happen. Admins might want to tune this on lossy links.
Also reset mld state on interface down/up, so we pick up new sysctl
settings during interface up event.
IPv6 certification requests this knob to be available.
I didn't make this knob netns specific, as it is mostly a setting in a
physical environment and should be per host.
Cc: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
performance improvements, and various patches all over,
rather than listing them one might as well look into the
git log instead.
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJUAIx4AAoJEDBSmw7B7bqrUYcP/3t4qdFxm0bd4j2AEkl3mPwB
Qu7obTicOTfBRoJNEgS+8AU2u3PfztU6+ErZs4ETLUuqaZwXisqmwBiMo86+Wtdf
gx9KonwEW051g7YmB0+6EMwuy04MGzTEk8VavQwqM4g9LIPJ4Buo/kj7MNJ51m11
XyRmJqZJnKKeiiQ4eC0gPf8e44qiQqaDuYZ0r1UDnNRg2KrbAHlGTBKYI3VRl2u4
xRpPGVnHwT0qkWb1Zw9fk0VfPr9m1ETthzcZvnhk6uMnJ28D+1B1FjZR1GJU6BW7
Zx2FbevbZTjDoNT1GQpLGMXBuW0lsZFetXVFiJCr/StaPBtHmtdu28fuNVm8yJYz
euDlEgrE8F4npdec2F5R2zh7Ue2U7eMEL2uxxjciNSJOipHgx5EXH12Y/5QtrChy
4OHPbNHgpmqFB7TmkvHDgP/0A7XdyqKVc+NtIV+eECIwE4tHcJ6A+bQ+ZCoRV2Vw
zmsNuNeNeDW7NEAw9veRXissLZMy/EjUnsOrnW29BpO/yG+2YjqpyQ6JQpcXeCPD
WQgl2FHpk6ap3jpVjxminxw2HkDnQ0oTKusGLcezalhUlWMo7VYNN59aLzcphxX5
Fotp/8v1sbDTF46uc/QJ38N5TqflwWeFpxvGkdNGuAT4llP03NaXV0ORBecFmMW2
esb+PLwlByCDeVFu53q+
=Qth6
-----END PGP SIGNATURE-----
Merge tag 'mac80211-next-for-john-2014-08-29' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg <johannes@sipsolutions.net> says:
"Not that much content this time. Some RCU cleanups, crypto
performance improvements, and various patches all over,
rather than listing them one might as well look into the
git log instead."
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Conflicts:
drivers/net/wireless/ath/wil6210/wmi.c
and adds a terminating NUL-byte to the alpha2 sent to userspace, which
shouldn't be necessary but since many places treat it as a string we
couldn't move to just sending two bytes.
In addition to that, we have two VLAN fixes from Felix, a mesh fix, a
fix for the recently introduced RX aggregation offload, a revert for
a broken patch (that luckily didn't really cause any harm) and a small
fix for alignment in debugfs.
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJUAFrmAAoJEDBSmw7B7bqr79AP/1yGi9lkv/wWUs5y0AhUSen9
850MU26BBlyAAFSz11xqgaEeRmeBeqhR3K7w/M02TX0CHxBzMqMZfyE//tq0UJaI
ZwZmtyQmdMiOSNKignTIIx7OHTioq0wrGKb6O2UvKoJfTlB9t01jCC4jmCTF5Vos
6ReF7NaZEbxW6XDOsClNTAtIa1c6n1RQ5VbDIEL5Vfvqv8LbcobduF8WcYl80eIQ
+EvIHtUm/Luxg6DblibgEVtwYOtNpvRz4pofdw3xoSHAnF+zhXbUr0dUjpkBNA7o
vWboCBl14Qn1M7pOJZ0+TBzFmquAr6CDbDvArVCH01Swh27EUDQUcHQAggGpT71w
DFgWHOYP0UCB6Y4U0GjBehy8PeuytqJLBSceKVud7DDqd8fY+Lq3MMyicIk0aw3o
IIDLWrujkCBXsdfuxQETmYxHU05WHSuYOCTgGSqbq3QPTWm8pBGWTdbk+1t/0FyH
cGLJOWs/jCrtHdzDj6TH+kL8NmvwB7sC9MT45qG0ilevmPW25yrnTJPEMEvFBqvZ
lnaqiX6D1kGNZd09CxgSIhxrQi+N0Yg+UlLa4IUtOIqnQussOC3xH2U5qTufdpa1
Gi9aCkBGVKQiObPWucf2QB4t1sZ18rxBhrAelZhQPLTKrnsuLhpcVBlU+L6ScCAk
FVni4HZH2IGtDQ577k10
=G/pB
-----END PGP SIGNATURE-----
Merge tag 'mac80211-for-john-2014-08-29' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg <johannes@sipsolutions.net> says:
"Here are a few fixes for mac80211. One has been discussed for a while
and adds a terminating NUL-byte to the alpha2 sent to userspace, which
shouldn't be necessary but since many places treat it as a string we
couldn't move to just sending two bytes.
In addition to that, we have two VLAN fixes from Felix, a mesh fix, a
fix for the recently introduced RX aggregation offload, a revert for
a broken patch (that luckily didn't really cause any harm) and a small
fix for alignment in debugfs."
Signed-off-by: John W. Linville <linville@redhat.com>
Move the specific NAT IPv4 core functions that are called from the
hooks from iptable_nat.c to nf_nat_l3proto_ipv4.c. This prepares the
ground to allow iptables and nft to use the same NAT engine code that
comes in a follow up patch.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Enable to specify local and remote prefix length thresholds for the
policy hash table via a netlink XFRM_MSG_NEWSPDINFO message.
prefix length thresholds are specified by XFRMA_SPD_IPV4_HTHRESH and
XFRMA_SPD_IPV6_HTHRESH optional attributes (struct xfrmu_spdhthresh).
example:
struct xfrmu_spdhthresh thresh4 = {
.lbits = 0;
.rbits = 24;
};
struct xfrmu_spdhthresh thresh6 = {
.lbits = 0;
.rbits = 56;
};
struct nlmsghdr *hdr;
struct nl_msg *msg;
msg = nlmsg_alloc();
hdr = nlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, XFRMA_SPD_IPV4_HTHRESH, sizeof(__u32), NLM_F_REQUEST);
nla_put(msg, XFRMA_SPD_IPV4_HTHRESH, sizeof(thresh4), &thresh4);
nla_put(msg, XFRMA_SPD_IPV6_HTHRESH, sizeof(thresh6), &thresh6);
nla_send_auto(sk, msg);
The numbers are the policy selector minimum prefix lengths to put a
policy in the hash table.
- lbits is the local threshold (source address for out policies,
destination address for in and fwd policies).
- rbits is the remote threshold (destination address for out
policies, source address for in and fwd policies).
The default values are:
XFRMA_SPD_IPV4_HTHRESH: 32 32
XFRMA_SPD_IPV6_HTHRESH: 128 128
Dynamic re-building of the SPD is performed when the thresholds values
are changed.
The current thresholds can be read via a XFRM_MSG_GETSPDINFO request:
the kernel replies to XFRM_MSG_GETSPDINFO requests by an
XFRM_MSG_NEWSPDINFO message, with both attributes
XFRMA_SPD_IPV4_HTHRESH and XFRMA_SPD_IPV6_HTHRESH.
Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
The idea is an extension of the current policy hashing.
Today only non-prefixed policies are stored in a hash table. This
patch relaxes the constraints, and hashes policies whose prefix
lengths are greater or equal to a configurable threshold.
Each hash table (one per direction) maintains its own set of IPv4 and
IPv6 thresholds (dbits4, sbits4, dbits6, sbits6), by default (32, 32,
128, 128).
Example, if the output hash table is configured with values (16, 24,
56, 64):
ip xfrm policy add dir out src 10.22.0.0/20 dst 10.24.1.0/24 ... => hashed
ip xfrm policy add dir out src 10.22.0.0/16 dst 10.24.1.1/32 ... => hashed
ip xfrm policy add dir out src 10.22.0.0/16 dst 10.24.0.0/16 ... => unhashed
ip xfrm policy add dir out \
src 3ffe:304:124:2200::/60 dst 3ffe:304:124:2401::/64 ... => hashed
ip xfrm policy add dir out \
src 3ffe:304:124:2200::/56 dst 3ffe:304:124:2401::2/128 ... => hashed
ip xfrm policy add dir out \
src 3ffe:304:124:2200::/56 dst 3ffe:304:124:2400::/56 ... => unhashed
The high order bits of the addresses (up to the threshold) are used to
compute the hash key.
Signed-off-by: Christophe Gouault <christophe.gouault@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
sk->sk_error_queue is dequeued in four locations. All share the
exact same logic. Deduplicate.
Also collapse the two critical sections for dequeue (at the top of
the recv handler) and signal (at the bottom).
This moves signal generation for the next packet forward, which should
be harmless.
It also changes the behavior if the recv handler exits early with an
error. Previously, a signal for follow-up packets on the errqueue
would then not be scheduled. The new behavior, to always signal, is
arguably a bug fix.
For rxrpc, the change causes the same function to be called repeatedly
for each queued packet (because the recv handler == sk_error_report).
It is likely that all packets will fail for the same reason (e.g.,
memory exhaustion).
This code runs without sk_lock held, so it is not safe to trust that
sk->sk_err is immutable inbetween releasing q->lock and the subsequent
test. Introduce int err just to avoid this potential race.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville says:
====================
pull request: wireless 2014-08-28
Please pull this batch of fixes intended for the 3.17 stream.
For the Bluetooth/6LowPAN/802.15.4 bits, Johan says:
'It contains a connection reference counting fix for LE where a
connection might stay up even though it should get disconnected.
The other 802.15.4 6LoWPAN related patches were sent to the bluetooth
tree by Alexander Aring and described as follows by him:
"
these patches contains patches for the bluetooth branch.
This series includes memory leak fixes and an errno value fix.
Also there are two patches for sending and receiving 1280 6LoWPAN
packets, which makes the IEEE 802.15.4 6LoWPAN stack more RFC
compliant.
"'
Along with that...
Alexey Khoroshilov fixes a use-after-free bug on at76c50x-usb.
Hauke Mehrtens adds a PCI ID to bcma.
Himangi Saraogi fixes a silly "A || A" test in rtlwifi.
Larry Finger adds a device ID to rtl8192cu.
Maks Naumov fixes a strncmp argument in ath9k.
Álvaro Fernández Rojas adds a PCI ID to ssb.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Since SCTP day 1, that is, 19b55a2af145 ("Initial commit") from lksctp
tree, the official <netinet/sctp.h> header carries a copy of enum
sctp_sstat_state that looks like (compared to the current in-kernel
enumeration):
User definition: Kernel definition:
enum sctp_sstat_state { typedef enum {
SCTP_EMPTY = 0, <removed>
SCTP_CLOSED = 1, SCTP_STATE_CLOSED = 0,
SCTP_COOKIE_WAIT = 2, SCTP_STATE_COOKIE_WAIT = 1,
SCTP_COOKIE_ECHOED = 3, SCTP_STATE_COOKIE_ECHOED = 2,
SCTP_ESTABLISHED = 4, SCTP_STATE_ESTABLISHED = 3,
SCTP_SHUTDOWN_PENDING = 5, SCTP_STATE_SHUTDOWN_PENDING = 4,
SCTP_SHUTDOWN_SENT = 6, SCTP_STATE_SHUTDOWN_SENT = 5,
SCTP_SHUTDOWN_RECEIVED = 7, SCTP_STATE_SHUTDOWN_RECEIVED = 6,
SCTP_SHUTDOWN_ACK_SENT = 8, SCTP_STATE_SHUTDOWN_ACK_SENT = 7,
}; } sctp_state_t;
This header was later on also placed into the uapi, so that user space
programs can compile without having <netinet/sctp.h>, but the shipped
with <linux/sctp.h> instead.
While RFC6458 under 8.2.1.Association Status (SCTP_STATUS) says that
sstat_state can range from SCTP_CLOSED to SCTP_SHUTDOWN_ACK_SENT, we
nevertheless have a what it appears to be dummy SCTP_EMPTY state from
the very early days.
While it seems to do just nothing, commit 0b8f9e25b0 ("sctp: remove
completely unsed EMPTY state") did the right thing and removed this dead
code. That however, causes an off-by-one when the user asks the SCTP
stack via SCTP_STATUS API and checks for the current socket state thus
yielding possibly undefined behaviour in applications as they expect
the kernel to tell the right thing.
The enumeration had to be changed however as based on the current socket
state, we access a function pointer lookup-table through this. Therefore,
I think the best way to deal with this is just to add a helper function
sctp_assoc_to_state() to encapsulate the off-by-one quirk.
Reported-by: Tristan Su <sooqing@gmail.com>
Fixes: 0b8f9e25b0 ("sctp: remove completely unsed EMPTY state")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for the 4-bytes Broadcom tag that built-in switches such as
the Starfighter 2 might insert when receiving packets, or that we need
to insert while targetting specific switch ports. We use a fake local
EtherType value for this 4-bytes switch tag: ETH_P_BRCMTAG to make sure
we can assign DSA-specific network operations within the DSA drivers.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow switch drivers to hook a PHY link update callback to perform
port-specific link work.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Whenever libphy determines that the link status of a given PHY/port has
changed, allow to call into the switch driver link adjustment callback
so proper actions can be taken care of by the switch driver upon link
notification.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case switch port tagging is disabled (voluntarily, or the switch just
does not support it), allow us to continue using the defined set of
dsa_device_ops in net/dsa/slave.c.
We introduce dsa_protocol_is_tagged() to check whether we need to
override skb->protocol and go through the DSA-specifif packet_type
function, or if we just go on and receive the SKB through the normal
path.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modify the DSA slave interface to be bound to an arbitray PHY, not just
the ones that are available as child PHY devices of the switch MDIO bus.
This allows us for instance to have external PHYs connected to a
separate MDIO bus, but yet also connected to a given switch port.
Under certain configurations, the physical port mask might not be a 1:1
mapping to the MII PHYs mask. This is the case, if e.g: Port 1 of the
switch is used and connects to a PHY at a MDIO address different than 1.
Introduce a phys_mii_mask variable which allows driver to implement and
divert their own MDIO read/writes operations for a subset of the MDIO
PHY addresses.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We will later use the per-port device_node pointer to fetch a bunch of
port-specific properties.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We might need to fetch additional resources from the device tree node
pointer, such as register ranges or other properties. Keep a device_node
pointer around for this.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
DSA is currently registering one packet_type function per EtherType it
needs to intercept in the receive path of a DSA-enabled Ethernet device.
Right now we have three of them: trailer, DSA and eDSA, and there might
be more in the future, this will not scale to the addition of new
protocols.
This patch proceeds with adding a new layer of abstraction and two new
functions:
dsa_switch_rcv() which will dispatch into the tag-protocol specific
receive function implemented by net/dsa/tag_*.c
dsa_slave_xmit() which will dispatch into the tag-protocol specific
transmit function implemented by net/dsa/tag_*.c
When we do create the per-port slave network devices, we iterate over
the switch protocol to assign the DSA-specific receive and transmit
operations.
A new fake ethertype value is used: ETH_P_XDSA to illustrate the fact
that this is no longer going to look like ETH_P_DSA or ETH_P_TRAILER
like it used to be.
This allows us to greatly simplify the check in eth_type_trans() and
always override the skb->protocol with ETH_P_XDSA for Ethernet switches
tagged protocol, while also reducing the number repetitive slave
netdevice_ops assignments.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace uses of get_cpu_var for address calculation through this_cpu_ptr.
Cc: netdev@vger.kernel.org
Cc: Eric Dumazet <edumazet@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
When using the cfg80211_inform_bss[_width]() functions drivers
cannot currently indicate whether the data was received in a
beacon or probe response. Fix that by passing a new enum that
indicates such (or unknown).
For good measure, use it in ath6kl.
Acked-by: Kalle Valo <kvalo@qca.qualcomm.com> [ath6kl]
Acked-by: Arend van Spriel <arend@broadcom.com> [brcmfmac]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
There are a few possible cases of where BSS data came from:
1) only a beacon has been received
2) only a probe response has been received
3) the driver didn't report what it received (this happens when
using cfg80211_inform_bss[_width]())
4) both probe response and beacon data has been received
Unfortunately, in the userspace API, a few things weren't there:
a) there was no way to differentiate cases 1) and 4) above
without comparing the data of the IEs
b) the TSF was always from the last frame, instead of being
exposed for beacon/probe response separately like IEs
Fix this by
i) exporting a new flag attribute that indicates whether or
not probe response data has been received - this addresses (a)
ii) exporting a BEACON_TSF attribute that holds the beacon's TSF
if a beacon has been received
iii) not exporting the beacon attributes in case (3) above as that
would just lead userspace into thinking the data actually came
from a beacon when that isn't clear
To implement this, track inside the IEs struct whether or not it
(definitely) came from a beacon.
Reported-by: William Seto
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Header-less cloned skbs with sufficient headroom need not be cloned
unless the tailroom is going to be modified.
Fix ieee80211_skb_resize so it would only resize cloned skbs if either
the header isn't released or the tailroom is going to be modified.
Some drivers might have assumed that skbs are never cloned, so add a HW
flag that explicitly permits cloned TX skbs. Drivers which do not modify
TX skbs should set this flag to avoid copying skbs.
Signed-off-by: Ido Yariv <idox.yariv@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When hw acceleration is enabled, the GENERATE_IV or PUT_IV_SPACE flags
will only require headroom space. Consequently, the tailroom-needed
counter can safely be decremented.
Signed-off-by: Ido Yariv <idox.yariv@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In the cfg80211_rx_mgmt(), parameter @gfp was used for the memory allocation.
But, memory get allocated under spin_lock_bh(), this implies atomic context.
So, one can't use GFP_KERNEL, only variants with no __GFP_WAIT. Actually, in all
occurrences GFP_ATOMIC is used (wil6210 use GFP_KERNEL by mistake),
and it should be this way or warning triggered in the memory allocation code.
Remove @gfp parameter as no actual choice exist, and use hard coded
GFP_ATOMIC for memory allocation.
Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Fixes: commit 690e36e726 (net: Allow raw buffers to be passed into the flow dissector)
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement GRO for UDPv6. Add UDP checksum verification in gro_receive
for both UDP4 and UDP6 calling skb_gro_checksum_validate_zero_check.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add inet_gro_compute_pseudo and ip6_gro_compute_pseudo. These are
the logical equivalents of inet_compute_pseudo and ip6_compute_pseudo
for GRO path. The IP header is taken from skb_gro_network_header.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drivers, and perhaps other entities we have not yet considered,
sometimes want to know how deep the protocol headers go before
deciding how large of an SKB to allocate and how much of the packet to
place into the linear SKB area.
For example, consider a driver which has a device which DMAs into
pools of pages and then tells the driver where the data went in the
DMA descriptor(s). The driver can then build an SKB and reference
most of the data via SKB fragments (which are page/offset/length
triplets).
However at least some of the front of the packet should be placed into
the linear SKB area, which comes before the fragments, so that packet
processing can get at the headers efficiently. The first thing each
protocol layer is going to do is a "pskb_may_pull()" so we might as
well aggregate as much of this as possible while we're building the
SKB in the driver.
Part of supporting this is that we don't have an SKB yet, so we want
to be able to let the flow dissector operate on a raw buffer in order
to compute the offset of the end of the headers.
So now we have a __skb_flow_dissect() which takes an explicit data
pointer and length.
Signed-off-by: David S. Miller <davem@davemloft.net>
ktime_get_ns() replaces ktime_to_ns(ktime_get())
ktime_get_real_ns() replaces ktime_to_ns(ktime_get_real())
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recently the LE passive scanning and auto-connections feature was
introduced. It uses the hci_connect_le() API which returns a hci_conn
along with a reference count to that object. All previous users would
tie this returned reference to some existing object, such as an L2CAP
channel, and there'd be no leaked references this way. For
auto-connections however the reference was returned but not stored
anywhere, leaving established connections with one higher reference
count than they should have.
Instead of playing special tricks with hci_conn_hold/drop this patch
associates the returned reference from hci_connect_le() with the object
that in practice does own this reference, i.e. the hci_conn_params
struct that caused us to initiate a connection in the first place. Once
the connection is established or fails to establish this reference is
removed appropriately.
One extra thing needed is to call hci_pend_le_actions_clear() before
calling hci_conn_hash_flush() so that the reference is cleared before
the hci_conn objects are fully removed.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch drops the userspace accessable sysfs entry for the maximum
datagram size of a 6LoWPAN fragment packet.
A fragment should not have a datagram size value greater than 1280 byte.
Instead of make this value configurable, we accept 1280 datagram size
fragment packets only.
Signed-off-by: Martin Townsend <martin.townsend@xsilon.com>
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
We currently track the QoS capability twice: for all peer stations
in the WLAN_STA_WME flag, and for any clients associated to an AP
interface separately for drivers in the sta->sta.wme field.
Remove the WLAN_STA_WME flag and track the capability only in the
driver-visible field, getting rid of the limitation that the field
is only valid in AP mode.
Reviewed-by: Arik Nemtsov <arik@wizery.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
alpha2 is defined as 2-chars array, but is used in multiple
places as string (e.g. with nla_put_string calls), which
might leak kernel data.
Solve it by simply adding an extra char for the NULL
terminator, making such operations safe.
Cc: stable@vger.kernel.org
Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
tcp_tw_recycle heavily relies on tcp timestamps to build a per-host
ordering of incoming connections and teardowns without the need to
hold state on a specific quadruple for TCP_TIMEWAIT_LEN, but only for
the last measured RTO. To do so, we keep the last seen timestamp in a
per-host indexed data structure and verify if the incoming timestamp
in a connection request is strictly greater than the saved one during
last connection teardown. Thus we can verify later on that no old data
packets will be accepted by the new connection.
During moving a socket to time-wait state we already verify if timestamps
where seen on a connection. Only if that was the case we let the
time-wait socket expire after the RTO, otherwise normal TCP_TIMEWAIT_LEN
will be used. But we don't verify this on incoming SYN packets. If a
connection teardown was less than TCP_PAWS_MSL seconds in the past we
cannot guarantee to not accept data packets from an old connection if
no timestamps are present. We should drop this SYN packet. This patch
closes this loophole.
Please note, this patch does not make tcp_tw_recycle in any way more
usable but only adds another safety check:
Sporadic drops of SYN packets because of reordering in the network or
in the socket backlog queues can happen. Users behing NAT trying to
connect to a tcp_tw_recycle enabled server can get caught in blackholes
and their connection requests may regullary get dropped because hosts
behind an address translator don't have synchronized tcp timestamp clocks.
tcp_tw_recycle cannot work if peers don't have tcp timestamps enabled.
In general, use of tcp_tw_recycle is disadvised.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure we use the correct address-family-specific function for
handling MTU reductions from within tcp_release_cb().
Previously AF_INET6 sockets were incorrectly always using the IPv6
code path when sometimes they were handling IPv4 traffic and thus had
an IPv4 dst.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Diagnosed-by: Willem de Bruijn <willemb@google.com>
Fixes: 563d34d057 ("tcp: dont drop MTU reduction indications")
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that there are no-longer any users for l2cap_conn->security_timer we
can go ahead and simply remove it. The patch makes initialization of the
conn->info_timer unconditional since it's better not to leave any
l2cap_conn data structures uninitialized no matter what the underlying
transport.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Since we no-longer do special handling of SMP within l2cap_core.c we
don't have any code for calling l2cap_conn_del() when smp.c doesn't like
the data it gets. At the same time we cannot simply export
l2cap_conn_del() since it will try to lock the channels it calls into
whereas we already hold the lock in the smp.c l2cap_chan callbacks (i.e.
it'd lead to a deadlock).
This patch adds a new l2cap_conn_shutdown() API which is very similar to
l2cap_conn_del() except that it defers the call to l2cap_conn_del()
through a workqueue, thereby making it safe to use it from an L2CAP
channel callback.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Now that we have all the necessary pieces in place we can fully convert
SMP to use the L2CAP channel infrastructure. This patch adds the
necessary callbacks and removes the now unneeded conn->smp_chan pointer.
One notable behavioral change in this patch comes from the following
code snippet:
- case L2CAP_CID_SMP:
- if (smp_sig_channel(conn, skb))
- l2cap_conn_del(conn->hcon, EACCES);
This piece of code was essentially forcing a disconnection if garbage
SMP data was received. The l2cap_conn_del() function is private to
l2cap_conn.c so we don't have access to it anymore when using the L2CAP
channel callbacks. Therefore, the behavior of the new code is simply to
return errors in the recv() callback (which is simply the old
smp_sig_channel()), but no disconnection will occur.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Now that we have per-adapter SMP data thanks to the root SMP L2CAP
channel we can take advantage of it and attach the AES crypto context
(only used for SMP) to it. This means that the smp_irk_matches() and
smp_generate_rpa() function can be converted to internally handle the
AES context.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch creates the initial SMP L2CAP channels and a skeleton for
their callbacks. There is one per-adapter channel created upon adapter
registration, and then one channel per-connection created through the
new_connection callback. The channels are registered with the reserved
CID 0x1f for now in order to not conflict with existing SMP
functionality. Once everything is in place the value can be changed to
what it should be, i.e. L2CAP_CID_SMP.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
In preparation for converting SMP to use l2cap_chan it's useful to add a
few more callback helpers so that smp.c won't need to define all of its
own.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The LE ATT socket uses a special trick where it temporarily sets
BT_CONFIG state for the duration of a security level elevation. In order
to not require special hacks for going back to BT_CONNECTED state in the
l2cap_core.c code the most reasonable place to resume the state is the
resume callback. This patch adds a new flag to track the pending
security level change and ensures that the state is set back to
BT_CONNECTED in the resume callback in case the flag is set.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Similar to our hci_update_background_scan() function we can simplify a
lot of code by creating a unified helper function for doing page scan
updates. This patch adds such a function to hci_core.c and updates all
the relevant places to use it.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
There are several situations where we're interested in knowing whether
we're currently in the process of powering off an adapter. This patch
adds a convenience function for the purpose and makes it public since
we'll soon need to access it from hci_event.c as well.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Pull SElinux fixes from Paul Moore:
"Two small patches to fix a couple of build warnings in SELinux and
NetLabel. The patches are obvious enough that I don't think any
additional explanation is necessary, but it basically boils down to
the usual: I was stupid, and these patches fix some of the stupid.
Both patches were posted earlier this week to the SELinux list, and
that is where they sat as I didn't think there were noteworthy enough
to go upstream at this point in time, but DaveM would rather see them
upstream now so who am I to argue. As the patches are both very
small"
* 'stable-3.17' of git://git.infradead.org/users/pcmoore/selinux:
selinux: remove unused variabled in the netport, netnode, and netif caches
netlabel: fix the netlbl_catmap_setlong() dummy function
When I added the netlbl_catmap_setlong() function I mistakenly forgot
to mark the associated dummy function as an inline.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Moore <pmoore@redhat.com>
sock_tx_timestamp() should not ignore initial *tx_flags value, as TCP
stack can store SKBTX_SHARED_FRAG in it.
Also first argument (struct sock *) can be const.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 4ed2d765df ("net-timestamp: TCP timestamping")
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking updates from David Miller:
"Highlights:
1) Steady transitioning of the BPF instructure to a generic spot so
all kernel subsystems can make use of it, from Alexei Starovoitov.
2) SFC driver supports busy polling, from Alexandre Rames.
3) Take advantage of hash table in UDP multicast delivery, from David
Held.
4) Lighten locking, in particular by getting rid of the LRU lists, in
inet frag handling. From Florian Westphal.
5) Add support for various RFC6458 control messages in SCTP, from
Geir Ola Vaagland.
6) Allow to filter bridge forwarding database dumps by device, from
Jamal Hadi Salim.
7) virtio-net also now supports busy polling, from Jason Wang.
8) Some low level optimization tweaks in pktgen from Jesper Dangaard
Brouer.
9) Add support for ipv6 address generation modes, so that userland
can have some input into the process. From Jiri Pirko.
10) Consolidate common TCP connection request code in ipv4 and ipv6,
from Octavian Purdila.
11) New ARP packet logger in netfilter, from Pablo Neira Ayuso.
12) Generic resizable RCU hash table, with intial users in netlink and
nftables. From Thomas Graf.
13) Maintain a name assignment type so that userspace can see where a
network device name came from (enumerated by kernel, assigned
explicitly by userspace, etc.) From Tom Gundersen.
14) Automatic flow label generation on transmit in ipv6, from Tom
Herbert.
15) New packet timestamping facilities from Willem de Bruijn, meant to
assist in measuring latencies going into/out-of the packet
scheduler, latency from TCP data transmission to ACK, etc"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1536 commits)
cxgb4 : Disable recursive mailbox commands when enabling vi
net: reduce USB network driver config options.
tg3: Modify tg3_tso_bug() to handle multiple TX rings
amd-xgbe: Perform phy connect/disconnect at dev open/stop
amd-xgbe: Use dma_set_mask_and_coherent to set DMA mask
net: sun4i-emac: fix memory leak on bad packet
sctp: fix possible seqlock seadlock in sctp_packet_transmit()
Revert "net: phy: Set the driver when registering an MDIO bus device"
cxgb4vf: Turn off SGE RX/TX Callback Timers and interrupts in PCI shutdown routine
team: Simplify return path of team_newlink
bridge: Update outdated comment on promiscuous mode
net-timestamp: ACK timestamp for bytestreams
net-timestamp: TCP timestamping
net-timestamp: SCHED timestamp on entering packet scheduler
net-timestamp: add key to disambiguate concurrent datagrams
net-timestamp: move timestamp flags out of sk_flags
net-timestamp: extend SCM_TIMESTAMPING ancillary data struct
cxgb4i : Move stray CPL definitions to cxgb4 driver
tcp: reduce spurious retransmits due to transient SACK reneging
qlcnic: Initialize dcbnl_ops before register_netdev
...
Pull security subsystem updates from James Morris:
"In this release:
- PKCS#7 parser for the key management subsystem from David Howells
- appoint Kees Cook as seccomp maintainer
- bugfixes and general maintenance across the subsystem"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (94 commits)
X.509: Need to export x509_request_asymmetric_key()
netlabel: shorter names for the NetLabel catmap funcs/structs
netlabel: fix the catmap walking functions
netlabel: fix the horribly broken catmap functions
netlabel: fix a problem when setting bits below the previously lowest bit
PKCS#7: X.509 certificate issuer and subject are mandatory fields in the ASN.1
tpm: simplify code by using %*phN specifier
tpm: Provide a generic means to override the chip returned timeouts
tpm: missing tpm_chip_put in tpm_get_random()
tpm: Properly clean sysfs entries in error path
tpm: Add missing tpm_do_selftest to ST33 I2C driver
PKCS#7: Use x509_request_asymmetric_key()
Revert "selinux: fix the default socket labeling in sock_graft()"
X.509: x509_request_asymmetric_keys() doesn't need string length arguments
PKCS#7: fix sparse non static symbol warning
KEYS: revert encrypted key change
ima: add support for measuring and appraising firmware
firmware_class: perform new LSM checks
security: introduce kernel_fw_from_file hook
PKCS#7: Missing inclusion of linux/err.h
...
Conflicts:
drivers/net/Makefile
net/ipv6/sysctl_net_ipv6.c
Two ipv6_table_template[] additions overlap, so the index
of the ipv6_table[x] assignments needed to be adjusted.
In the drivers/net/Makefile case, we've gotten rid of the
garbage whereby we had to list every single USB networking
driver in the top-level Makefile, there is just one
"USB_NETWORKING" that guards everything.
Signed-off-by: David S. Miller <davem@davemloft.net>
Datagrams timestamped on transmission can coexist in the kernel stack
and be reordered in packet scheduling. When reading looped datagrams
from the socket error queue it is not always possible to unique
correlate looped data with original send() call (for application
level retransmits). Even if possible, it may be expensive and complex,
requiring packet inspection.
Introduce a data-independent ID mechanism to associate timestamps with
send calls. Pass an ID alongside the timestamp in field ee_data of
sock_extended_err.
The ID is a simple 32 bit unsigned int that is associated with the
socket and incremented on each send() call for which software tx
timestamp generation is enabled.
The feature is enabled only if SOF_TIMESTAMPING_OPT_ID is set, to
avoid changing ee_data for existing applications that expect it 0.
The counter is reset each time the flag is reenabled. Reenabling
does not change the ID of already submitted data. It is possible
to receive out of order IDs if the timestamp stream is not quiesced
first.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sk_flags is reaching its limit. New timestamping options will not fit.
Move all of them into a new field sk->sk_tsflags.
Added benefit is that this removes boilerplate code to convert between
SOF_TIMESTAMPING_.. and SOCK_TIMESTAMPING_.. in getsockopt/setsockopt.
SOCK_TIMESTAMPING_RX_SOFTWARE is also used to toggle the receive
timestamp logic (netstamp_needed). That can be simplified and this
last key removed, but will leave that for a separate patch.
Signed-off-by: Willem de Bruijn <willemb@google.com>
----
The u16 in sock can be moved into a 16-bit hole below sk_gso_max_segs,
though that scatters tstamp fields throughout the struct.
Signed-off-by: David S. Miller <davem@davemloft.net>
Applications that request kernel tx timestamps with SO_TIMESTAMPING
read timestamps as recvmsg() ancillary data. The response is defined
implicitly as timespec[3].
1) define struct scm_timestamping explicitly and
2) add support for new tstamp types. On tx, scm_timestamping always
accompanies a sock_extended_err. Define previously unused field
ee_info to signal the type of ts[0]. Introduce SCM_TSTAMP_SND to
define the existing behavior.
The reception path is not modified. On rx, no struct similar to
sock_extended_err is passed along with SCM_TIMESTAMPING.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit reduces spurious retransmits due to apparent SACK reneging
by only reacting to SACK reneging that persists for a short delay.
When a sequence space hole at snd_una is filled, some TCP receivers
send a series of ACKs as they apparently scan their out-of-order queue
and cumulatively ACK all the packets that have now been consecutiveyly
received. This is essentially misbehavior B in "Misbehaviors in TCP
SACK generation" ACM SIGCOMM Computer Communication Review, April
2011, so we suspect that this is from several common OSes (Windows
2000, Windows Server 2003, Windows XP). However, this issue has also
been seen in other cases, e.g. the netdev thread "TCP being hoodwinked
into spurious retransmissions by lack of timestamps?" from March 2014,
where the receiver was thought to be a BSD box.
Since snd_una would temporarily be adjacent to a previously SACKed
range in these scenarios, this receiver behavior triggered the Linux
SACK reneging code path in the sender. This led the sender to clear
the SACK scoreboard, enter CA_Loss, and spuriously retransmit
(potentially) every packet from the entire write queue at line rate
just a few milliseconds before the ACK for each packet arrives at the
sender.
To avoid such situations, now when a sender sees apparent reneging it
does not yet retransmit, but rather adjusts the RTO timer to give the
receiver a little time (max(RTT/2, 10ms)) to send us some more ACKs
that will restore sanity to the SACK scoreboard. If the reneging
persists until this RTO then, as before, we clear the SACK scoreboard
and enter CA_Loss.
A 10ms delay tolerates a receiver sending such a stream of ACKs at
56Kbit/sec. And to allow for receivers with slower or more congested
paths, we wait for at least RTT/2.
We validated the resulting max(RTT/2, 10ms) delay formula with a mix
of North American and South American Google web server traffic, and
found that for ACKs displaying transient reneging:
(1) 90% of inter-ACK delays were less than 10ms
(2) 99% of inter-ACK delays were less than RTT/2
In tests on Google web servers this commit reduced reneging events by
75%-90% (as measured by the TcpExtTCPSACKReneging counter), without
any measurable impact on latency for user HTTP and SPDY requests.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
net/6lowpan/iphc.c
Minor conflicts in iphc.c were changes overlapping with some
style cleanups.
John W. Linville says:
====================
Please pull this last(?) batch of wireless change intended for the
3.17 stream...
For the NFC bits, Samuel says:
"This is a rather quiet one, we have:
- A new driver from ST Microelectronics for their NCI ST21NFCB,
including device tree support.
- p2p support for the ST21NFCA driver
- A few fixes an enhancements for the NFC digital laye"
For the Atheros bits, Kalle says:
"Michal and Janusz did some important RX aggregation fixes, basically we
were missing RX reordering altogether. The 10.1 firmware doesn't support
Ad-Hoc mode and Michal fixed ath10k so that it doesn't advertise Ad-Hoc
support with that firmware. Also he implemented a workaround for a KVM
issue."
For the Bluetooth bits, Gustavo and Johan say:
"To quote Gustavo from his previous request:
'Some last minute fixes for -next. We have a fix for a use after free in
RFCOMM, another fix to an issue with ADV_DIRECT_IND and one for ADV_IND with
auto-connection handling. Last, we added support for reading the codec and
MWS setting for controllers that support these features.'
Additionally there are fixes to LE scanning, an update to conform to the 4.1
core specification as well as fixes for tracking the page scan state. All
of these fixes are important for 3.17."
And,
"We've got:
- 6lowpan fixes/cleanups
- A couple crash fixes, one for the Marvell HCI driver and another in LE SMP.
- Fix for an incorrect connected state check
- Fix for the bondable requirement during pairing (an issue which had
crept in because of using "pairable" when in fact the actual meaning
was "bondable" (these have different meanings in Bluetooth)"
Along with those are some late-breaking hardware support patches in
brcmfmac and b43 as well as a stray ath9k patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Use kmem_cache to allocate/free inet_frag_queue objects since they're
all the same size per inet_frags user and are alloced/freed in high volumes
thus making it a perfect case for kmem_cache.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the flags to an enum definion, swap FIRST_IN/LAST_IN to be in increasing
order and add comments explaining each flag and the inet_frag_queue struct
members.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The last_in field has been used to store various flags different from
first/last frag in so give it a more descriptive name: flags.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Historically the NetLabel LSM secattr catmap functions and data
structures have had very long names which makes a mess of the NetLabel
code and anyone who uses NetLabel. This patch renames the catmap
functions and structures from "*_secattr_catmap_*" to just "*_catmap_*"
which improves things greatly.
There are no substantial code or logic changes in this patch.
Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
The NetLabel secattr catmap functions, and the SELinux import/export
glue routines, were broken in many horrible ways and the SELinux glue
code fiddled with the NetLabel catmap structures in ways that we
probably shouldn't allow. At some point this "worked", but that was
likely due to a bit of dumb luck and sub-par testing (both inflicted
by yours truly). This patch corrects these problems by basically
gutting the code in favor of something less obtuse and restoring the
NetLabel abstractions in the SELinux catmap glue code.
Everything is working now, and if it decides to break itself in the
future this code will be much easier to debug than the code it
replaces.
One noteworthy side effect of the changes is that it is no longer
necessary to allocate a NetLabel catmap before calling one of the
NetLabel APIs to set a bit in the catmap. NetLabel will automatically
allocate the catmap nodes when needed, resulting in less allocations
when the lowest bit is greater than 255 and less code in the LSMs.
Cc: stable@vger.kernel.org
Reported-by: Christian Evans <frodox@zoho.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
The NetLabel category (catmap) functions have a problem in that they
assume categories will be set in an increasing manner, e.g. the next
category set will always be larger than the last. Unfortunately, this
is not a valid assumption and could result in problems when attempting
to set categories less than the startbit in the lowest catmap node.
In some cases kernel panics and other nasties can result.
This patch corrects the problem by checking for this and allocating a
new catmap node instance and placing it at the front of the list.
Cc: stable@vger.kernel.org
Reported-by: Christian Evans <frodox@zoho.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
The SCTP socket extensions API document describes the v4mapping option as
follows:
8.1.15. Set/Clear IPv4 Mapped Addresses (SCTP_I_WANT_MAPPED_V4_ADDR)
This socket option is a Boolean flag which turns on or off the
mapping of IPv4 addresses. If this option is turned on, then IPv4
addresses will be mapped to V6 representation. If this option is
turned off, then no mapping will be done of V4 addresses and a user
will receive both PF_INET6 and PF_INET type addresses on the socket.
See [RFC3542] for more details on mapped V6 addresses.
This description isn't really in line with what the code does though.
Introduce addr_to_user (renamed addr_v4map), which should be called
before any sockaddr is passed back to user space. The new function
places the sockaddr into the correct format depending on the
SCTP_I_WANT_MAPPED_V4_ADDR option.
Audit all places that touched v4mapped and either sanely construct
a v4 or v6 address then call addr_to_user, or drop the
unnecessary v4mapped check entirely.
Audit all places that call addr_to_user and verify they are on a sycall
return path.
Add a custom getname that formats the address properly.
Several bugs are addressed:
- SCTP_I_WANT_MAPPED_V4_ADDR=0 often returned garbage for
addresses to user space
- The addr_len returned from recvmsg was not correct when
returning AF_INET on a v6 socket
- flowlabel and scope_id were not zerod when promoting
a v4 to v6
- Some syscalls like bind and connect behaved differently
depending on v4mapped
Tested bind, getpeername, getsockname, connect, and recvmsg for proper
behaviour in v4mapped = 1 and 0 cases.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains netfilter updates for net-next, they are:
1) Add the reject expression for the nf_tables bridge family, this
allows us to send explicit reject (TCP RST / ICMP dest unrech) to
the packets matching a rule.
2) Simplify and consolidate the nf_tables set dumping logic. This uses
netlink control->data to filter out depending on the request.
3) Perform garbage collection in xt_hashlimit using a workqueue instead
of a timer, which is problematic when many entries are in place in
the tables, from Eric Dumazet.
4) Remove leftover code from the removed ulog target support, from
Paul Bolle.
5) Dump unmodified flags in the netfilter packet accounting when resetting
counters, so userspace knows that a counter was in overquota situation,
from Alexey Perevalov.
6) Fix wrong usage of the bitwise functions in nfnetlink_acct, also from
Alexey.
7) Fix a crash when adding new set element with an empty NFTA_SET_ELEM_LIST
attribute.
This patchset also includes a couple of cleanups for xt_LED from
Duan Jiong and for nf_conntrack_ipv4 (using coccinelle) from
Himangi Saraogi.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ipv4 tunnels created with "local any remote $ip" didn't work properly since
7d442fab0 (ipv4: Cache dst in tunnels). 99% of packets sent via those tunnels
had src addr = 0.0.0.0. That was because only dst_entry was cached, although
fl4.saddr has to be cached too. Every time ip_tunnel_xmit used cached dst_entry
(tunnel_rtable_get returned non-NULL), fl4.saddr was initialized with
tnl_params->saddr (= 0 in our case), and wasn't changed until iptunnel_xmit().
This patch adds saddr to ip_tunnel->dst_cache, fixing this issue.
Reported-by: Sergey Popov <pinkbyte@gentoo.org>
Signed-off-by: Dmitry Popov <ixaphire@qrator.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This setting maps to the HCI_BONDABLE flag which tracks whether we're
bondable or not. Therefore, rename the mgmt setting and respective
command accordingly.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The HCI_PAIRABLE flag isn't actually controlling whether we're pairable
but whether we're bondable. Therefore, rename it accordingly.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This define is unused since commit
96cb3eb7a1 ("6lowpan: fix fragmentation on
sending side"). It is a worst case scenario for payload calculation.
Since commit 96cb3eb7a1 we calculation the
payload to use the optimal size.
This define is also necessary for ieee802154 6lowpan only and the file
include/net/6lowpan.h should contain generic 6lowpan things only.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch removes the own implementation to check of link-layer,
broadcast and any address type and use the IPv6 api for that.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
We create a proc dir for each network device, this will cause
conflicts when the devices have name "all" or "default".
Rather than emitting an ugly kernel warning, we could just
fail earlier by checking the device name.
Reported-by: Stephane Chazelas <stephane.chazelas@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SO_TIMESTAMPING API defines three types of timestamps: software,
hardware in raw format (hwtstamp) and hardware converted to system
format (syststamp). The last has been deprecated in favor of combining
hwtstamp with a PTP clock driver. There are no active users in the
kernel.
The option was device driver dependent. If set, but without hardware
support, the correct behavior is to return zero in the relevant field
in the SCM_TIMESTAMPING ancillary message. Without device drivers
implementing the option, this field is effectively always zero.
Remove the internal plumbing to dissuage new drivers from implementing
the feature. Keep the SOF_TIMESTAMPING_SYS_HARDWARE flag, however, to
avoid breaking existing applications that request the timestamp.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the NFC pull request for 3.17.
This is a rather quiet one, we have:
- A new driver from ST Microelectronics for their NCI ST21NFCB,
including device tree support.
- p2p support for the ST21NFCA driver
- A few fixes an enhancements for the NFC digital layer
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJT0CJhAAoJEIqAPN1PVmxKtnAP/jOu1cP/EsegdByqmEumjSGQ
mBv7wC+/r91wV6Hny8Ij8uxyN+/QfxF75nwPjTrPSApo7mHOlF8FeY0/EktxJazo
c/NIAntNwREIbpkc68CIQNrYr9YFkKEM+7lCV1ImALUb/CPfiH7Fx7vGdhkCKqkc
B8etkLyeJtl9EqSZM7GI2YrEbPzPEWLk2ydVQ9BccxvN0I8rc29DnD+DNR5sL6Wa
7bEmZF5J/GnVErvnS8uHDGgerpzlFAj7MONrsxADZCHPie0F87T3wXAGwKlaBY6R
nOde0YCS749JxN1c2AdmgIadwo5XqjeSbjQ1g8L8HJTWY8Tl9Vw8GoQFN9qhHkSM
gkOya77n0R8SZWoJzo3BFBMpsncVG2cJIsnwpRBIWMUzg76mGe7Fzl21KCXD8xyy
xvy8Ar3QKPDTu6uvLNEPk9+cfl/JxQAoLNL30eeZGBDBSg/g2ptNiBYNLXvVoEtU
B/xyTdmA1SXQnOKGKNxjFCo+WZDXSoTrWeml/uvBhprVAj6YS3K/imc9EiL4zcD7
72iNGZbIZRfw91x7VbbQ5Nb8PEyYsLef8ztUFM9HgliNgIDMHaQHoXYwmD4264uO
RGTETdHYQb0ltX3HsBiIgTNF+Y1sJUVP3TyEhjSk58TpCy/S6v9YjcT9RYmNpuPk
YftdxfapAKV0EJhcQc1r
=wR0y
-----END PGP SIGNATURE-----
Merge tag 'nfc-next-3.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next
Samuel Ortiz <sameo@linux.intel.com> says:
"NFC: 3.17 pull request
This is the NFC pull request for 3.17.
This is a rather quiet one, we have:
- A new driver from ST Microelectronics for their NCI ST21NFCB,
including device tree support.
- p2p support for the ST21NFCA driver
- A few fixes an enhancements for the NFC digital layer"
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In "Counting Packets Sent Between Arbitrary Internet Hosts", Jeffrey and
Jedidiah describe ways exploiting linux IP identifier generation to
infer whether two machines are exchanging packets.
With commit 73f156a6e8 ("inetpeer: get rid of ip_id_count"), we
changed IP id generation, but this does not really prevent this
side-channel technique.
This patch adds a random amount of perturbation so that IP identifiers
for a given destination [1] are no longer monotonically increasing after
an idle period.
Note that prandom_u32_max(1) returns 0, so if generator is used at most
once per jiffy, this patch inserts no hole in the ID suite and do not
increase collision probability.
This is jiffies based, so in the worst case (HZ=1000), the id can
rollover after ~65 seconds of idle time, which should be fine.
We also change the hash used in __ip_select_ident() to not only hash
on daddr, but also saddr and protocol, so that ICMP probes can not be
used to infer information for other protocols.
For IPv6, adds saddr into the hash as well, but not nexthdr.
If I ping the patched target, we can see ID are now hard to predict.
21:57:11.008086 IP (...)
A > target: ICMP echo request, seq 1, length 64
21:57:11.010752 IP (... id 2081 ...)
target > A: ICMP echo reply, seq 1, length 64
21:57:12.013133 IP (...)
A > target: ICMP echo request, seq 2, length 64
21:57:12.015737 IP (... id 3039 ...)
target > A: ICMP echo reply, seq 2, length 64
21:57:13.016580 IP (...)
A > target: ICMP echo request, seq 3, length 64
21:57:13.019251 IP (... id 3437 ...)
target > A: ICMP echo reply, seq 3, length 64
[1] TCP sessions uses a per flow ID generator not changed by this patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jeffrey Knockel <jeffk@cs.unm.edu>
Reported-by: Jedidiah R. Crandall <crandall@cs.unm.edu>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Hannes Frederic Sowa <hannes@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville says:
====================
pull request: wireless-next 2014-07-25
Please pull this batch of updates intended for the 3.17 stream!
For the mac80211 bits, Johannes says:
"We have a lot of TDLS patches, among them a fix that should make hwsim
tests happy again. The rest, this time, is mostly small fixes."
For the Bluetooth bits, Gustavo says:
"Some more patches for 3.17. The most important change here is the move of
the 6lowpan code to net/6lowpan. It has been agreed with Davem that this
change will go through the bluetooth tree. The rest are mostly clean up and
fixes."
and,
"Here follows some more patches for 3.17. These are mostly fixes to what
we've sent to you before for next merge window."
For the iwlwifi bits, Emmanuel says:
"I have the usual amount of BT Coex stuff. Arik continues to work
on TDLS and Ariej contributes a few things for HS2.0. I added a few
more things to the firmware debugging infrastructure. Eran fixes a
small bug - pretty normal content."
And for the Atheros bits, Kalle says:
"For ath6kl me and Jessica added support for ar6004 hw3.0, our latest
version of ar6004.
For ath10k Janusz added a printout so that it's easier to check what
ath10k kconfig options are enabled. He also added a debugfs file to
configure maximum amsdu and ampdu values. Also we had few fixes as
usual."
On top of that is the usual large batch of various driver updates --
brcmfmac, mwifiex, the TI drivers, and wil6210 all get some action.
Rafał has also been very busy with b43 and related updates.
Also, I pulled the wireless tree into this in order to resolve a
merge conflict...
P.S. The change to fs/compat_ioctl.c reflects a name change in a
Bluetooth header file...
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Change formal parameter name to not shadow the global jiffies.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rehash is rare operation, don't force readers to take
the read-side rwlock.
Instead, we only have to detect the (rare) case where
the secret was altered while we are trying to insert
a new inetfrag queue into the table.
If it was changed, drop the bucket lock and recompute
the hash to get the 'new' chain bucket that we have to
insert into.
Joint work with Nikolay Aleksandrov.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
merge functionality into the eviction workqueue.
Instead of rebuilding every n seconds, take advantage of the upper
hash chain length limit.
If we hit it, mark table for rebuild and schedule workqueue.
To prevent frequent rebuilds when we're completely overloaded,
don't rebuild more than once every 5 seconds.
ipfrag_secret_interval sysctl is now obsolete and has been marked as
deprecated, it still can be changed so scripts won't be broken but it
won't have any effect. A comment is left above each unused secret_timer
variable to avoid confusion.
Joint work with Nikolay Aleksandrov.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 'nqueues' counter is protected by the lru list lock,
once thats removed this needs to be converted to atomic
counter. Given this isn't used for anything except for
reporting it to userspace via /proc, just remove it.
We still report the memory currently used by fragment
reassembly queues.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the high_thresh limit is reached we try to toss the 'oldest'
incomplete fragment queues until memory limits are below the low_thresh
value. This happens in softirq/packet processing context.
This has two drawbacks:
1) processors might evict a queue that was about to be completed
by another cpu, because they will compete wrt. resource usage and
resource reclaim.
2) LRU list maintenance is expensive.
But when constantly overloaded, even the 'least recently used' element is
recent, so removing 'lru' queue first is not 'fairer' than removing any
other fragment queue.
This moves eviction out of the fast path:
When the low threshold is reached, a work queue is scheduled
which then iterates over the table and removes the queues that exceed
the memory limits of the namespace. It sets a new flag called
INET_FRAG_EVICTED on the evicted queues so the proper counters will get
incremented when the queue is forcefully expired.
When the high threshold is reached, no more fragment queues are
created until we're below the limit again.
The LRU list is now unused and will be removed in a followup patch.
Joint work with Nikolay Aleksandrov.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
First step to move eviction handling into a work queue.
We lose two spots that accounted evicted fragments in MIB counters.
Accounting will be restored since the upcoming work-queue evictor
invokes the frag queue timer callbacks instead.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Store the default values for minimum and maximum advertising interval
with all the other controller defaults. These vaules are sent to the
adapter whenever advertising is (re)enabled.
Signed-off-by: Georg Lukas <georg@op-co.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The ulog targets were recently killed. A few references to the Kconfig
macros CONFIG_IP_NF_TARGET_ULOG and CONFIG_BRIDGE_EBT_ULOG were left
untouched. Kill these too.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When adding remote devices to the kernel using the Add Device management
command, these devices are explicitly allowed to connect. This kind of
incoming connections are possible even when the controller itself is
not connectable.
For BR/EDR this distinction is pretty simple since there is only one
type of incoming connections. With LE this is not that simple anymore
since there are ADV_IND and ADV_DIRECT_IND advertising events.
The ADV_DIRECT_IND advertising events are send for incoming (slave
initiated) connections only. And this is the only thing the kernel
should allow when adding devices using action 0x01. This meaning
of incoming connections is coming from BR/EDR and needs to be
mapped to LE the same way.
Supporting the auto-connection of devices using ADV_IND advertising
events is an important feature as well. However it does not map to
incoming connections. So introduce a new action 0x02 that allows
the kernel to connect to devices using ADV_DIRECT_IND and in addition
ADV_IND advertising reports.
This difference is represented by the new HCI_AUTO_CONN_DIRECT value
for only connecting to ADV_DIRECT_IND. For connection to ADV_IND and
ADV_DIRECT_IND the old value HCI_AUTO_CONN_ALWAYS is used.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
It hasn't been used since commit 0fd7bac(net: relax rcvbuf limits).
Signed-off-by: Sorin Dumitru <sorin@returnze.ro>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the Bluetooth controller supports Get MWS Transport Layer
Configuration command, then issue it during initialization.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
If the Bluetooth controller supports Read Local Supported Codecs
command, then issue it during initialization so that the list of
codecs is known.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The digital layer of the NFC subsystem currently
supports a 'tg_listen_mdaa' driver hook that supports
devices that can do mode detection and automatic
anticollision. However, there are some devices that
can do mode detection but not automatic anitcollision
so add the 'tg_listen_md' hook to support those devices.
In order for the digital layer to get the RF technology
detected by the device from the driver, add the
'tg_get_rf_tech' hook. It is only valid to call this
hook immediately after a successful call to 'tg_listen_md'.
CC: Thierry Escande <thierry.escande@linux.intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Remove extra blank line that was inadvertently
added by a recent commit.
CC: Thierry Escande <thierry.escande@linux.intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
stop_poll allows to stop CLF reader polling. Some other operations might be
necessary for some CLF to stop polling. For example in card mode.
Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
MSG_MORE and 'corking' a socket would require that the transmit of
a data chunk be delayed.
Rename the return value to be less specific.
Signed-off-by: David Laight <david.laight@aculab.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/infiniband/hw/cxgb4/device.c
The cxgb4 conflict was simply overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Some drivers may be performing most of Tx/Rx
aggregation on their own (e.g. in firmware)
including AddBa/DelBa negotiations but may
otherwise require Rx reordering assistance.
The patch exports 2 new functions for establishing
Rx aggregation sessions in assumption device
driver has taken care of the necessary
negotiations.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
[fix endian bug]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains updates for your net-next tree,
they are:
1) Use kvfree() helper function from x_tables, from Eric Dumazet.
2) Remove extra timer from the conntrack ecache extension, use a
workqueue instead to redeliver lost events to userspace instead,
from Florian Westphal.
3) Removal of the ulog targets for ebtables and iptables. The nflog
infrastructure superseded this almost 9 years ago, time to get rid
of this code.
4) Replace the list of loggers by an array now that we can only have
two possible non-overlapping logger flavours, ie. kernel ring buffer
and netlink logging.
5) Move Eric Dumazet's log buffer code to nf_log to reuse it from
all of the supported per-family loggers.
6) Consolidate nf_log_packet() as an unified interface for packet logging.
After this patch, if the struct nf_loginfo is available, it explicitly
selects the logger that is used.
7) Move ip and ip6 logging code from xt_LOG to the corresponding
per-family loggers. Thus, x_tables and nf_tables share the same code
for packet logging.
8) Add generic ARP packet logger, which is used by nf_tables. The
format aims to be consistent with the output of xt_LOG.
9) Add generic bridge packet logger. Again, this is used by nf_tables
and it routes the packets to the real family loggers. As a result,
we get consistent logging format for the bridge family. The ebt_log
logging code has been intentionally left in place not to break
backward compatibility since the logging output differs from xt_LOG.
10) Update nft_log to explicitly request the required family logger when
needed.
11) Finish nft_log so it supports arp, ip, ip6, bridge and inet families.
Allowing selection between netlink and kernel buffer ring logging.
12) Several fixes coming after the netfilter core logging changes spotted
by robots.
13) Use IS_ENABLED() macros whenever possible in the netfilter tree,
from Duan Jiong.
14) Removal of a couple of unnecessary branch before kfree, from Fabian
Frederick.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new "NFC_DIGITAL_FRAMING_*" calls to the digital
layer so the driver can make the necessary adjustments
when performing anticollision while in target mode.
The driver must ensure that the effect of these calls
happens after the following response has been sent but
before reception of the next request begins.
Acked-by: Thierry Escande <thierry.escande@linux.intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
v2: fixed issue with checking return of dcbnl_rtnl_ops->getapp()
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Even though our side requests authentication, the original action that
caused it may be remotely triggered, such as an incoming L2CAP or RFCOMM
connect request. To track this information introduce a new hci_conn flag
called HCI_CONN_AUTH_INITIATOR.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
We're interested in whether an authentication request is because of a
remote or local action. So far hci_conn_security() has been used both
for incoming and outgoing actions (e.g. RFCOMM or L2CAP connect
requests) so without some modifications it cannot know which peer is
responsible for requesting authentication.
This patch adds a new "bool initiator" parameter to hci_conn_security()
to indicate which side is responsible for the request and updates the
current users to pass this information correspondingly.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
sparse is throwing warnings when building sunrpc modules due to some
endianness shenanigans in ipv6.h. Specifically:
CHECK net/sunrpc/addr.c
include/net/ipv6.h:573:17: warning: restricted __be64 degrades to integer
include/net/ipv6.h:577:34: warning: restricted __be32 degrades to integer
include/net/ipv6.h:573:17: warning: restricted __be64 degrades to integer
include/net/ipv6.h:577:34: warning: restricted __be32 degrades to integer
Sprinkle some endianness fixups to silence them. These should all get
fixed up at compile time, so I don't think this will add any extra work
to be done at runtime.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Many multicast sources can have the same port which can result in a very
large list when hashing by port only. Hash by address and port instead
if this is the case. This makes multicast more similar to unicast.
On a 24-core machine receiving from 500 multicast sockets on the same
port, before this patch 80% of system CPU was used up by spin locking
and only ~25% of packets were successfully delivered.
With this patch, all packets are delivered and kernel overhead is ~8%
system CPU on spinlocks.
Signed-off-by: David Held <drheld@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter/nf_tables fixes
The following patchset contains nf_tables fixes, they are:
1) Fix wrong transaction handling when the table flags are not
modified.
2) Fix missing rcu read_lock section in the netlink dump path, which
is not protected by the nfnl_lock.
3) Set NLM_F_DUMP_INTR in the netlink dump path to indicate
interferences with updates.
4) Fix 64 bits chain counters when they are retrieved from a 32 bits
arch, from Eric Dumazet.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implements section 5.3.6. of RFC6458, that is, support
for 'SCTP Next Receive Information Structure' (SCTP_NXTINFO) which
is placed into ancillary data cmsghdr structure for each recvmsg()
call, if this information is already available when delivering the
current message.
This option can be enabled/disabled via setsockopt(2) on SOL_SCTP
level by setting an int value with 1/0 for SCTP_RECVNXTINFO in
user space applications as per RFC6458, section 8.1.30.
The sctp_nxtinfo structure is defined as per RFC as below ...
struct sctp_nxtinfo {
uint16_t nxt_sid;
uint16_t nxt_flags;
uint32_t nxt_ppid;
uint32_t nxt_length;
sctp_assoc_t nxt_assoc_id;
};
... and provided under cmsg_level IPPROTO_SCTP, cmsg_type
SCTP_NXTINFO, while cmsg_data[] contains struct sctp_nxtinfo.
Joint work with Daniel Borkmann.
Signed-off-by: Geir Ola Vaagland <geirola@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implements section 5.3.5. of RFC6458, that is, support
for 'SCTP Receive Information Structure' (SCTP_RCVINFO) which is
placed into ancillary data cmsghdr structure for each recvmsg()
call.
This option can be enabled/disabled via setsockopt(2) on SOL_SCTP
level by setting an int value with 1/0 for SCTP_RECVRCVINFO in user
space applications as per RFC6458, section 8.1.29.
The sctp_rcvinfo structure is defined as per RFC as below ...
struct sctp_rcvinfo {
uint16_t rcv_sid;
uint16_t rcv_ssn;
uint16_t rcv_flags;
<-- 2 bytes hole -->
uint32_t rcv_ppid;
uint32_t rcv_tsn;
uint32_t rcv_cumtsn;
uint32_t rcv_context;
sctp_assoc_t rcv_assoc_id;
};
... and provided under cmsg_level IPPROTO_SCTP, cmsg_type
SCTP_RCVINFO, while cmsg_data[] contains struct sctp_rcvinfo.
An sctp_rcvinfo item always corresponds to the data in msg_iov.
Joint work with Daniel Borkmann.
Signed-off-by: Geir Ola Vaagland <geirola@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implements section 5.3.4. of RFC6458, that is, support
for 'SCTP Send Information Structure' (SCTP_SNDINFO) which can be
placed into ancillary data cmsghdr structure for sendmsg() calls.
The sctp_sndinfo structure is defined as per RFC as below ...
struct sctp_sndinfo {
uint16_t snd_sid;
uint16_t snd_flags;
uint32_t snd_ppid;
uint32_t snd_context;
sctp_assoc_t snd_assoc_id;
};
... and supplied under cmsg_level IPPROTO_SCTP, cmsg_type
SCTP_SNDINFO, while cmsg_data[] contains struct sctp_sndinfo.
An sctp_sndinfo item always corresponds to the data in msg_iov.
Joint work with Daniel Borkmann.
Signed-off-by: Geir Ola Vaagland <geirola@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most (probably all) controllers can only deal with a single slave LE
connection at a time. This patch adds a counter for such connections so
that the number can be quickly looked up without iterating the
connections list.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
We need to be able to track slave vs master LE connections in
hci_conn_hash, and to be able to do that we need to know the role of the
connection by the time hci_conn_add_has() is called. This means in
practice the hci_conn_add() call that creates the hci_conn_object.
This patch adds a new role parameter to hci_conn_add() function to give
the object its initial role value, and updates the callers to pass the
appropriate role to it. Since the function now takes care of
initializing both conn->role and conn->out values we can remove some
other unnecessary assignments.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
To make the code more understandable it makes sense to use the new HCI
defines for connection role instead of a "bool master" parameter. This
makes it immediately clear when looking at the function calls what the
last parameter is describing.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Having a dedicated u8 role variable in the hci_conn struct greatly
simplifies tracking of the role, since this is the native way that it's
represented on the HCI level.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
All HCI commands and events, including LE ones, use 0x00 for master role
and 0x01 for slave role. It makes therefore sense to add generic defines
for these instead of the current LE_CONN_ROLE_MASTER. Having clean
defines will also make it possible to provide simpler internal APIs.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Since Yuchung's 9b44190dc1 (tcp: refactor F-RTO), tcp_enter_cwr is always
called with set_ssthresh = 1. Thus, we can remove this argument from
tcp_enter_cwr. Further, as we remove this one, tcp_init_cwnd_reduction
is then always called with set_ssthresh = true, and so we can get rid of
this argument as well.
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This passes down NET_NAME_USER (or NET_NAME_ENUM) to alloc_netdev(),
for any device created over rtnetlink.
v9: restore reverse-christmas-tree order of local variables
Signed-off-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added udp_tunnel.c which can contain some common functions for UDP
tunnels. The first function in this is udp_sock_create which is used
to open the listener port for a UDP tunnel.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code in neigh_sysctl_register() relies on a specific layout of
struct neigh_table, namely that the 'gc_*' variables are directly
following the 'parms' member in a specific order. The code, though,
expresses this in the most ugly way.
Get rid of the ugly casts and use the 'tbl' pointer to get a handle to
the table. This way we can refer to the 'gc_*' variables directly.
Similarly seen in the grsecurity patch, written by Brad Spengler.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Brad Spengler <spender@grsecurity.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use generic u64_stats_sync infrastructure to get proper 64bit stats,
even on 32bit arches, at no extra cost for 64bit arches.
Without this fix, 32bit arches can have some wrong counters at the time
the carry is propagated into upper word.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
An updater may interfer with the dumping of any of the object lists.
Fix this by using a per-net generation counter and use the
nl_dump_check_consistent() interface so the NLM_F_DUMP_INTR flag is set
to notify userspace that it has to restart the dump since an updater
has interfered.
This patch also replaces the existing consistency checking code in the
rule dumping path since it is broken. Basically, the value that the
dump callback returns is not propagated to userspace via
netlink_dump_start().
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
John W. Linville says:
====================
Please pull this batch of updates intended for the 3.17 stream...
This is primarily a Bluetooth pull. Gustavo says:
"A lot of patches to 3.17. The bulk of changes here are for LE support.
The 6loWPAN over Bluetooth now has it own module, we also have support for
background auto-connection and passive scanning, Bluetooth device address
provisioning, support for reading Bluetooth clock values and LE connection
parameters plus many many fixes."
The balance is just a pull of the wireless.git tree, to avoid some
pending merge problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The spinlock protecting the L2CAP ident number can be converted into
a mutex since the whole processing is run in a workqueue. So instead
of using a spinlock, just use a mutex here.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The support for LE encryption is optional. When encryption is not
supported then also do not enable the encryption related events.
This moves the event mask setting to the third initialization
stage to ensure that the LE features are available.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
This patch introduces a possibility for userspace to set various (so far
two) modes of generating addresses. This is useful for example for
NetworkManager because it can set the mode to NONE and take care of link
local addresses itself. That allow it to have the interface up,
monitoring carrier but still don't have any addresses on it.
One more use-case by Dan Williams:
<quote>
WWAN devices often have their LL address provided by the firmware of the
device, which sometimes refuses to respond to incorrect LL addresses
when doing DHCPv6 or IPv6 ND. The kernel cannot generate the correct LL
address for two reasons:
1) WWAN pseudo-ethernet interfaces often construct a fake MAC address,
or read a meaningless MAC address from the firmware. Thus the EUI64 and
the IPv6LL address the kernel assigns will be wrong. The real LL
address is often retrieved from the firmware with AT or proprietary
commands.
2) WWAN PPP interfaces receive their LL address from IPV6CP, not from
kernel assignments. Only after IPV6CP has completed do we know the LL
address of the PPP interface and its peer. But the kernel has already
assigned an incorrect LL address to the interface.
So being able to suppress the kernel LL address generation and assign
the one retrieved from the firmware is less complicated and more robust.
</quote>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no external user of the SCO timeout constants and thus
move them into net/bluetooth/sco.c where they are actuallu used.
In addition just remove SCO_CONN_IDLE_TIMEOUT since it is unused.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The SCO_DEFAULT_FLUSH_TO constant has been defined, but it is not
used anywhere and so just remove it.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
There exists no external user of struct sco_conn and thus move
it into the one place that is actually using it.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
There exists no external user of struct sco_pinfo and sco_pi and
thus move it into the one place that is actually using it.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The list of L2CAP fixed channels increased with newer versions of the
specification. This just updates the constants for it.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The internals of the HCI request framework should not be leaking to
its users. Move them all into net/bluetooth/hci_core.c and provide
a simple hci_req_pending helper function for the one user outside
the framework.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
There exists no external user of struct hci_pinfo and hci_pi and thus
move it into the one place that is actually using it.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
There is only single location using struct hci_sec_filter and with
that there is no point in putting this declaration into a global
header file. So move it right next to its user and make the code
a lot more simpler.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
All the HCI sockets and ioctl based definitions have been in a global
header file that also includes all the HCI protocol structures. To
make this a bit cleaner, move them into its own file.
This also adjusts fs/compat_ioctl.c to only include this new file
and not all the protocol structures that are not needed.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The Set Connectable/Discoverable mgmt handlers use a hci_request with a
proper callback to handle the HCI command sending. It makes therefore
little sense to have this extra function to be called from hci_event.c
for command failures.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Since the HCISETSCAN ioctl is the only non-mgmt user we care about for
setting the right discoverable state we can simply do the necessary
updates in the ioctl handler function instead. This then allows the
removal of the mgmt_discoverable function and should simplify that state
handling considerably.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The mgmt_connectable function has been used to ensure that the right
actions to HCI_CONNECTABLE are taken when the HCI_Write_Scan_Enable
command is triggered by something else than mgmt. The only other user
that we really care about is the HCISETSCAN ioctl code, so we can
actually more simply perform the needed changes there instead.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
When the white list is in use the code would not update the
HCI_CONNECTABLE flag if it gets changed through the ioctl code (e.g.
hciconfig hci0 pscan). Since the flag is important for properly
accepting incoming connections add code to fix it up if necessary and
emit a New Settings mgmt event.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch extends the Add/Remove device commands by letting user space
pass BR/EDR addresses to them. The resulting entries get stored in a new
hdev->whitelist list. The idea is that we can now selectively accept
connections from devices in the list even though HCI_CONNECTABLE is not
set (the actual implementation of this is coming in a subsequent patch).
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
We already have several lists with struct bdaddr_list entries, and there
will be more in the future. Since the operations for adding, removing,
looking up and clearing entries in these lists are exactly the same it
doesn't make sense to define new functions for every single list. This
patch unifies the functions by passing the list_head to them instead of
a hci_dev pointer.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The Authenticated Payload Timeout Expired event is valid for
controllers with BR/EDR Secure Connections support, but also for
LE only controllers supporting LE Ping feature. When either of them
is available enable this event. Previous it was not enabled when
the controller was only supporting LE operation.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Commit cb1ce2ef38 ("ipv6: Implement automatic flow label generation
on transmit") introduced ip6_make_flowlabel, while commit b73c3d0e4f
("net: Save TX flow hash in sock and set in skbuf on xmit") introduced
ip6_set_txhash.
ip6_set_tx_hash() uses sk_v6_daddr which references
__sk_common.skc_v6_daddr from struct sock_common, which is gated with
IS_ENABLED(CONFIG_IPV6).
ip6_make_flowlabel() uses the ipv6 member from struct net which is
also gated with IS_ENABLED(CONFIG_IPV6).
When CONFIG_IPV6 is disabled, we will hit a build failure that looks
like this when the compiler attempts inlining these functions:
CC [M] drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.o
In file included from include/net/inet_sock.h:27:0,
from include/net/ip.h:30,
from drivers/net/ethernet/broadcom/cnic.c:37:
include/net/ipv6.h: In function 'ip6_set_txhash':
include/net/sock.h:327:33: error: 'struct sock_common' has no member named 'skc_v6_daddr'
#define sk_v6_daddr __sk_common.skc_v6_daddr
^
include/net/ipv6.h:696:49: note: in expansion of macro 'sk_v6_daddr'
keys.dst = (__force __be32)ipv6_addr_hash(&sk->sk_v6_daddr);
^
In file included from include/net/inetpeer.h:15:0,
from include/net/route.h:28,
from include/net/ip.h:31,
from drivers/net/ethernet/broadcom/cnic.c:37:
include/net/ipv6.h: In function 'ip6_make_flowlabel':
include/net/ipv6.h:706:37: error: 'struct net' has no member named 'ipv6'
if (!flowlabel && (autolabel || net->ipv6.sysctl.auto_flowlabels)) {
^
Fixes: cb1ce2ef38 ("ipv6: Implement automatic flow label generation on transmit")
Fixes: b73c3d0e4f ("net: Save TX flow hash in sock and set in skbuf on xmit")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using pointers into sctp_cmd_seq_t.cmds[] lets the compiler generate much
better code.
Use the last entry first to optimise the overflow check.
Signed-off-by: David Laight <david.laight@aculab.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Even if memset() is inlined (as on x86) using it to zero the union
generates a memory word write of zero, followed by a write of the
smaller field, and then a read of the word.
As well as being a lot of instructions the sequence is unlikely to
be optimised by the store-load forward hardware so will be slow.
Instead allocate a field of the union that is the same size as the
entire union and write a zero value to it. The compiler will then
generate the required value in a register.
Zeroing the union shouldn't be necessary, but this patch series isn't
intended to have a behavioural change.
Signed-off-by: David Laight <david.laight@aculab.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sctp_init_cmd_seq() and sctp_next_cmd() are only called from one place.
The call sequence for sctp_add_cmd_sf() is likely to be longer than
the inlined code.
With sctp_add_cmd_sf() inlined the compiler can optimise repeated calls.
Signed-off-by: David Laight <david.laight@aculab.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville says:
====================
pull request: wireless-next 2014-07-03
Please pull this first batch of wireless updates intended for the
3.17 stream...
For the mac80211 bits, Johannes says:
"The biggest thing here is probably Arik's TDLS rework, beyond that we
have smaller improvements and features like David's scanning IE thing,
Luca's queue work, some CSA work, etc. Also your PID rate control
removal, of course."
For the iwlwifi bits, Emmanuel says:
"I have here a whole bunch of various things. Andy contributes
better debug prints for dvm specific flows and a module parameter to
completely disable power save for dvm. Andrei is sharing the premises
of his work on CSA - more to come. Eran and Liad keep on working
on the new devices. I have the regular amount of BT Coex stuff and
I continue to work on the firmware error report system adding more
debug capabilities. More to come on that subject too."
On top of that, there are some cleanups to the new rsi driver, some
continuing improvements to the rtl818x drivers, and the usual bundles
of updates to ath9k, b43, mwifiex, wil6210, and a few other bits here
and there.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the real advertising state is now tracked with its own flag we can
simply set/unset the HCI_ADVERTISING flag in the
set_advertising_complete function.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Having a single HCI_ADVERTISING flag is problematic since it tries to
track both the real advertising state and the corresponding mgmt
setting. To make the logic simpler and more reliable add a new flag that
only tracks the actual advertising state that has been written to the
controller.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch adds new mac802154 hw flags for transmit power, csma and
listen before transmit (lbt). These flags indicates that the transceiver
supports these features. If the flags are set and the driver doesn't
implement the necessary functions, then ieee802154_register_device
returns -ENOSYS "Function not implemented".
This patch merges also all at86rf230 operations into one operations structure
and set the right hw flags for the at86rf230 transceivers.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Automatically generate flow labels for IPv6 packets on transmit.
The flow label is computed based on skb_get_hash. The flow label will
only automatically be set when it is zero otherwise (i.e. flow label
manager hasn't set one). This supports the transmit side functionality
of RFC 6438.
Added an IPv6 sysctl auto_flowlabels to enable/disable this behavior
system wide, and added IPV6_AUTOFLOWLABEL socket option to enable this
functionality per socket.
By default, auto flowlabels are disabled to avoid possible conflicts
with flow label manager, however if this feature proves useful we
may want to enable it by default.
It should also be noted that FreeBSD has already implemented automatic
flow labels (including the sysctl and socket option). In FreeBSD,
automatic flow labels default to enabled.
Performance impact:
Running super_netperf with 200 flows for TCP_RR and UDP_RR for
IPv6. Note that in UDP case, __skb_get_hash will be called for
every packet with explains slight regression. In the TCP case
the hash is saved in the socket so there is no regression.
Automatic flow labels disabled:
TCP_RR:
86.53% CPU utilization
127/195/322 90/95/99% latencies
1.40498e+06 tps
UDP_RR:
90.70% CPU utilization
118/168/243 90/95/99% latencies
1.50309e+06 tps
Automatic flow labels enabled:
TCP_RR:
85.90% CPU utilization
128/199/337 90/95/99% latencies
1.40051e+06
UDP_RR
92.61% CPU utilization
115/164/236 90/95/99% latencies
1.4687e+06
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In vxlan and OVS vport-vxlan call common function to get source port
for a UDP tunnel. Removed vxlan_src_port since the functionality is
now in udp_flow_src_port.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds udp_flow_src_port function which is intended to be
a common function that UDP tunnel implementations call to set the source
port. The source port is chosen so that a hash over the outer headers
(IP addresses and UDP ports) acts as suitable hash for the flow of the
encapsulated packet. In this manner, UDP encapsulation works with RSS
and ECMP based wrt the inner flow.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For a connected socket we can precompute the flow hash for setting
in skb->hash on output. This is a performance advantage over
calculating the skb->hash for every packet on the connection. The
computation is done using the common hash algorithm to be consistent
with computations done for packets of the connection in other states
where thers is no socket (e.g. time-wait, syn-recv, syn-cookies).
This patch adds sk_txhash to the sock structure. inet_set_txhash and
ip6_set_txhash functions are added which are called from points in
TCP and UDP where socket moves to established state.
skb_set_hash_from_sk is a function which sets skb->hash from the
sock txhash value. This is called in UDP and TCP transmit path when
transmitting within the context of a socket.
Tested: ran super_netperf with 200 TCP_RR streams over a vxlan
interface (in this case skb_get_hash called on every TX packet to
create a UDP source port).
Before fix:
95.02% CPU utilization
154/256/505 90/95/99% latencies
1.13042e+06 tps
Time in functions:
0.28% skb_flow_dissect
0.21% __skb_get_hash
After fix:
94.95% CPU utilization
156/254/485 90/95/99% latencies
1.15447e+06
Neither __skb_get_hash nor skb_flow_dissect appear in perf
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the hash computation located in __skb_get_hash to be a separate
function which takes flow_keys as input. This will allow flow hash
computation in other contexts where we only have addresses and ports.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Always store in snt_synack the time at which the server received the
first client SYN and attempted to send the first SYNACK.
Recent commit aa27fc501 ("tcp: tcp_v[46]_conn_request: fix snt_synack
initialization") resolved an inconsistency between IPv4 and IPv6 in
the initialization of snt_synack. This commit brings back the idea
from 843f4a55e (tcp: use tcp_v4_send_synack on first SYN-ACK), which
was going for the original behavior of snt_synack from the commit
where it was added in 9ad7c049f0 ("tcp: RFC2988bis + taking RTT
sample from 3WHS for the passive open side") in v3.1.
In addition to being simpler (and probably a tiny bit faster),
unconditionally storing the time of the first SYNACK attempt has been
useful because it allows calculating a performance metric quantifying
how long it took to establish a passive TCP connection.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Cc: Octavian Purdila <octavian.purdila@intel.com>
Cc: Jerry Chu <hkchu@google.com>
Acked-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we have both LE scanning and advertising simultaneously enabled we
need a way to tell hci_connect_le() in which role to initiate a
connection. This patch adds a new parameter to the function to give it
the necessary information. For auto-connect and mgmt_pair_device we
always use master role, whereas for L2CAP users (in practice sockets) we
use slave role whenever HCI_ADVERTISING is set and master role
otherwise.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The auth_type value which gets assigned to hci_conn->auth_type is
something that's only used for BR/EDR connections and is of no value for
LE connections. It makes therefore little sense to pass it to the
hci_connect_le() function. This patch removes the parameter from the
function.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
When we establish connections as a consequence of receiving an
advertising report it makes no sense to wait the normal 20 second LE
connection timeout. This patch modifies the hci_connect_le function to
take an extra timeout value and uses a lower 2 second timeout for the
auto-connection case. This timeout is intentionally chosen to be just a
bit higher than the 1.28 second timeout that High Duty Cycle Advertising
uses.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This adds support for changing the public device address. This feature
is required by controllers that do not provide a public address and
have HCI_QUIRK_INVALID_BDADDR set.
Even if a controller has a public device address, this is useful when
an embedded system wants to use its own value. As long as the driver
provides the set_bdaddr callback, this allows changing the device
address before powering on the controller.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
When the external configuration triggers the switch to a configured
controller, it means the setup needs to be run. Controllers that start
out unconfigured have only run limited set of HCI commands. This is
not enough for complete operation and thus run the setup procedure
before announcing the new controller index.
This introduces HCI_CONFIG flag as companion to HCI_SETUP flag. The
HCI_SETUP flag is only used once for the initial setup procedure. And
during that procedure hdev->setup driver callback is called. With the
new HCI_CONFIG the switch from unconfigured to configured state is
triggering the same setup procedure just without hdev->setup. This
is required since bringing a controller back to unconfigured state
from configured state is possible.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
When calling Device Remove with BDADDR_ANY we should in a similar way
emit Device Removed events as we do when removing a single device. Since
we have to iterate the list and call device_removed() the dedicated
hci_conn_params_clear_enabled() is not really useful anymore. This patch
removes the helper function and does the event emission and list item
removal in a single loop.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
During the setup phase of a controller, the Bluetooth address will be
read and to have that original address available for later use, store
it as setup address.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
When one or more of the missing configuration options change, then send
this even to all the other management interface clients.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The Set External Configuration management command allows for switching
between configured and unconfigured start if HCI_QURIK_EXTERNAL_CONFIG
is set by the transport driver.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
When a controller requires external configuration, then setting this
quirk will allow indicating this.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
When a Bluetooth controller does not have a valid public Bluetooth
address, then allow the driver to indicate this. If the quirk is
set, the Bluetooth core will switch to unconfigured state first
and will allow userspace to configure the address before starting
the full initialization of the controller.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The current existing device quirks are not documented. So instead of
spreading bits and pieces somewhere in the code, add proper comments
on where these quirks can be used and what behavior they change.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
The public address configuration option is value 0x02 since the generic
external configuration is value 0x01. So adjust this accordingly and
also add the value 0x01 to the list.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
In some circumstances we need to look up entries in pend_le_conns and in
other in pend_le_reports. This patch converts the existing lookup
function for pend_le_conns to something that can be used for both lists.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Since there are no more users of this function we can simply go ahead
and remove it.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
When powering off (hci_dev_do_close) we should clear both the
pend_le_reports and pend_le_conns types of entries. When powering on
respectively we should populate both lists. This patch converts the
hci_pend_le_conns_clear() function into hci_pend_le_actions_clear()
(which can now be static) and converts the restart_le_auto_conns()
function into restart_le_actions().
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Now that there are no-longer any users of the hci_pend_le_conn_del()
function we can simply go ahead and remove it.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
To simplify manipulation and lookup of hci_conn_params entries of the
type HCI_AUTO_CONN_REPORT it makes sense to store them in their own
list. The new action list_head in hci_conn_params is used for this
purpose.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
In preparation to store also HCI_AUTO_CONN_REPORT entries in a list it
makes sense to convert the existing pend_le_conn list head of
hci_conn_params into a more generically named "action". This makes sense
because a parameter entry will never participate in more than one action
list.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The Read Controller Configuration Information command allows retrieving
details about possible configurations option. The supported options are
returned and also the missing options (if any).
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Since the connection parameters are always a basis for adding entries to
hdev->pend_le_conns (so far of type bdaddr_list) it's simpler and more
efficient to have the parameters themselves be the entries in the
pend_le_conns list. We do this by adding another list_head to the
hci_conn_params struct.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
To be able to make the right choice of whether to start passive scanning
or to send out a mgmt_device_found event we need to know if there are
any devices in the le_conn_params list with the auto_connect value set
to HCI_AUTO_CONN_REPORT. This patch adds a counter for this kind of
devices.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This command allows to get the list of currently known controller that
are in unconfigured state.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
When a controller in an unconfigured state gets removed, then send
Unconfigured Index Removed events.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
When a controller is in unconfigured state it is currently hidden
from the management interface. This change now announces the new
controller with an Unconfigured Index Added event and allows clients
to easily detect the controller.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
With the new unconfigured controller state it is possible to provide a
fully functional HCI transport, but disable the higher level operations
that would normally happen. This way userspace can try to configure the
controller before releases the unconfigured state.
The internal state is represented by HCI_UNCONFIGURED. This replaces the
HCI_QUIRK_RAW_DEVICE quirk as internal state representation. This is now
a real state and drivers can use the quirk to actually trigger this
state. In the future this will allow a more fine grained switching from
unconfigured state to configured state for controller inititialization.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
There are more places that can take advantage of is_identity_address()
besides hci_core.c. This patch moves the function to hci_core.h and
gives it the appropriate hci_ prefix.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The calling functions of mgmt_new_conn_param have more information about
the parameters, such as whether the kernel is tracking them or not. It
makes therefore sense to have them pass an initial store_hint value to
the mgmt_new_conn_param function.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The caller of hci_le_conn_update is directly interested in knowing what
the best value is for the store_hint parameter of the corresponding
mgmt event. Since hci_le_conn_update knows whether there were stored
parameters that were updated or not we can have it return an initial
store_hint value to the caller.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch implements the new Load Connection Parameters mgmt command
that's intended to load the desired connection parameters for LE
devices.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The 0x00 action value of mgmt means "scan and report" but do not
connect. This is different from HCI_AUTO_CONN_DISABLED so we need a new
value for it.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
In some circumstances we'll need to either clear only the enabled
parameters or only the disabled ones. This patch adds convenience
functions for this purpose.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
We'll soon have specific clear functions for clearing enabled or
disabled entries, so rename the function that removes everything to
clear_all().
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Some embedded controllers allow the programming of a public address
and this adds vendor support for supporting OEM confguration of such
addresses.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
This patch introduces a new Mgmt event called "New Connection Parameter".
This event indicates to userspace the connection parameters values the
remote device requested.
The user may store these values and load them into kernel. This way, next
time a connection is established to that device, the kernel will use those
parameters values instead of the default ones.
This event is sent when the remote device requests new connection
parameters through connection parameter update procedure. This event is
not sent for slave connections.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The Bluetooth 4.1 introduces a new LE meta event called "LE Remote
Connection Parameter Request" event. In order to the controller
sends this event to host, we should enable it during controller
initialization.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>