Commit Graph

7738 Commits

Author SHA1 Message Date
Johan Hedberg
bc6d2d0418 Bluetooth: Remove unneeded mgmt_discoverable function
Since the HCISETSCAN ioctl is the only non-mgmt user we care about for
setting the right discoverable state we can simply do the necessary
updates in the ioctl handler function instead. This then allows the
removal of the mgmt_discoverable function and should simplify that state
handling considerably.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-10 11:14:20 +02:00
Johan Hedberg
031547d868 Bluetooth: Remove unneeded mgmt_connectable function
The mgmt_connectable function has been used to ensure that the right
actions to HCI_CONNECTABLE are taken when the HCI_Write_Scan_Enable
command is triggered by something else than mgmt. The only other user
that we really care about is the HCISETSCAN ioctl code, so we can
actually more simply perform the needed changes there instead.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-10 11:14:20 +02:00
Johan Hedberg
91a668b056 Bluetooth: Fix setting HCI_CONNECTABLE from ioctl code
When the white list is in use the code would not update the
HCI_CONNECTABLE flag if it gets changed through the ioctl code (e.g.
hciconfig hci0 pscan). Since the flag is important for properly
accepting incoming connections add code to fix it up if necessary and
emit a New Settings mgmt event.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-09 12:30:18 +02:00
Johan Hedberg
6659358efe Bluetooth: Introduce a whitelist for BR/EDR devices
This patch extends the Add/Remove device commands by letting user space
pass BR/EDR addresses to them. The resulting entries get stored in a new
hdev->whitelist list. The idea is that we can now selectively accept
connections from devices in the list even though HCI_CONNECTABLE is not
set (the actual implementation of this is coming in a subsequent patch).

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-09 12:25:27 +02:00
Johan Hedberg
dcc36c16c2 Bluetooth: Unify helpers for bdaddr_list manipulations
We already have several lists with struct bdaddr_list entries, and there
will be more in the future. Since the operations for adding, removing,
looking up and clearing entries in these lists are exactly the same it
doesn't make sense to define new functions for every single list. This
patch unifies the functions by passing the list_head to them instead of
a hci_dev pointer.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-09 12:25:26 +02:00
Marcel Holtmann
cd7ca0ec5e Bluetooth: Fix enabling Authenticated Payload Timeout Expired event
The Authenticated Payload Timeout Expired event is valid for
controllers with BR/EDR Secure Connections support, but also for
LE only controllers supporting LE Ping feature. When either of them
is available enable this event. Previous it was not enabled when
the controller was only supporting LE operation.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-09 11:19:15 +03:00
Florian Fainelli
a37934fc0d net: provide stubs for ip6_set_txhash and ip6_make_flowlabel
Commit cb1ce2ef38 ("ipv6: Implement automatic flow label generation
on transmit") introduced ip6_make_flowlabel, while commit b73c3d0e4f
("net: Save TX flow hash in sock and set in skbuf on xmit") introduced
ip6_set_txhash.

ip6_set_tx_hash() uses sk_v6_daddr which references
__sk_common.skc_v6_daddr from struct sock_common, which is gated with
IS_ENABLED(CONFIG_IPV6).

ip6_make_flowlabel() uses the ipv6 member from struct net which is
also gated with IS_ENABLED(CONFIG_IPV6).

When CONFIG_IPV6 is disabled, we will hit a build failure that looks
like this when the compiler attempts inlining these functions:

  CC [M]  drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.o
In file included from include/net/inet_sock.h:27:0,
                 from include/net/ip.h:30,
                 from drivers/net/ethernet/broadcom/cnic.c:37:
include/net/ipv6.h: In function 'ip6_set_txhash':
include/net/sock.h:327:33: error: 'struct sock_common' has no member named 'skc_v6_daddr'
 #define sk_v6_daddr  __sk_common.skc_v6_daddr
                                 ^
include/net/ipv6.h:696:49: note: in expansion of macro 'sk_v6_daddr'
  keys.dst = (__force __be32)ipv6_addr_hash(&sk->sk_v6_daddr);
                                                 ^
In file included from include/net/inetpeer.h:15:0,
                 from include/net/route.h:28,
                 from include/net/ip.h:31,
                 from drivers/net/ethernet/broadcom/cnic.c:37:
include/net/ipv6.h: In function 'ip6_make_flowlabel':
include/net/ipv6.h:706:37: error: 'struct net' has no member named 'ipv6'
  if (!flowlabel && (autolabel || net->ipv6.sysctl.auto_flowlabels)) {
                                     ^

Fixes: cb1ce2ef38 ("ipv6: Implement automatic flow label generation on transmit")
Fixes: b73c3d0e4f ("net: Save TX flow hash in sock and set in skbuf on xmit")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08 20:43:35 -07:00
David Laight
d1a3fe26e9 net: sctp: Use pointers (not array indexes) to access sctp_cmd_seq_t.cmds[].
Using pointers into sctp_cmd_seq_t.cmds[] lets the compiler generate much
better code.
Use the last entry first to optimise the overflow check.

Signed-off-by: David Laight <david.laight@aculab.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08 14:39:00 -07:00
David Laight
b9420e1c87 net: sctp: Optimise the way 'sctp_arg_t' values are initialised.
Even if memset() is inlined (as on x86) using it to zero the union
generates a memory word write of zero, followed by a write of the
smaller field, and then a read of the word.
As well as being a lot of instructions the sequence is unlikely to
be optimised by the store-load forward hardware so will be slow.

Instead allocate a field of the union that is the same size as the
entire union and write a zero value to it. The compiler will then
generate the required value in a register.

Zeroing the union shouldn't be necessary, but this patch series isn't
intended to have a behavioural change.

Signed-off-by: David Laight <david.laight@aculab.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08 14:39:00 -07:00
David Laight
be1f4f48ce net: sctp: Inline the functions from command.c
sctp_init_cmd_seq() and sctp_next_cmd() are only called from one place.
The call sequence for sctp_add_cmd_sf() is likely to be longer than
the inlined code.
With sctp_add_cmd_sf() inlined the compiler can optimise repeated calls.

Signed-off-by: David Laight <david.laight@aculab.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08 14:38:48 -07:00
David S. Miller
72948cdcbb Merge git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
pull request: wireless-next 2014-07-03

Please pull this first batch of wireless updates intended for the
3.17 stream...

For the mac80211 bits, Johannes says:

"The biggest thing here is probably Arik's TDLS rework, beyond that we
have smaller improvements and features like David's scanning IE thing,
Luca's queue work, some CSA work, etc. Also your PID rate control
removal, of course."

For the iwlwifi bits, Emmanuel says:

"I have here a whole bunch of various things. Andy contributes
better debug prints for dvm specific flows and a module parameter to
completely disable power save for dvm. Andrei is sharing the premises
of his work on CSA - more to come. Eran and Liad keep on working
on the new devices. I have the regular amount of BT Coex stuff and
I continue to work on the firmware error report system adding more
debug capabilities. More to come on that subject too."

On top of that, there are some cleanups to the new rsi driver, some
continuing improvements to the rtl818x drivers, and the usual bundles
of updates to ath9k, b43, mwifiex, wil6210, and a few other bits here
and there.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08 14:20:31 -07:00
Johan Hedberg
c93bd15033 Bluetooth: Remove unnecessary mgmt_advertising function
Since the real advertising state is now tracked with its own flag we can
simply set/unset the HCI_ADVERTISING flag in the
set_advertising_complete function.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-08 14:22:06 +02:00
Johan Hedberg
66c417c1ee Bluetooth: Add flag to track the real advertising state
Having a single HCI_ADVERTISING flag is problematic since it tries to
track both the real advertising state and the corresponding mgmt
setting. To make the logic simpler and more reliable add a new flag that
only tracks the actual advertising state that has been written to the
controller.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-08 14:22:05 +02:00
Alexander Aring
640985ec2f mac802154: at86rf230: add hw flags and merge ops
This patch adds new mac802154 hw flags for transmit power, csma and
listen before transmit (lbt). These flags indicates that the transceiver
supports these features. If the flags are set and the driver doesn't
implement the necessary functions, then ieee802154_register_device
returns -ENOSYS "Function not implemented".

This patch merges also all at86rf230 operations into one operations structure
and set the right hw flags for the at86rf230 transceivers.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 21:29:24 -07:00
Tom Herbert
cb1ce2ef38 ipv6: Implement automatic flow label generation on transmit
Automatically generate flow labels for IPv6 packets on transmit.
The flow label is computed based on skb_get_hash. The flow label will
only automatically be set when it is zero otherwise (i.e. flow label
manager hasn't set one). This supports the transmit side functionality
of RFC 6438.

Added an IPv6 sysctl auto_flowlabels to enable/disable this behavior
system wide, and added IPV6_AUTOFLOWLABEL socket option to enable this
functionality per socket.

By default, auto flowlabels are disabled to avoid possible conflicts
with flow label manager, however if this feature proves useful we
may want to enable it by default.

It should also be noted that FreeBSD has already implemented automatic
flow labels (including the sysctl and socket option). In FreeBSD,
automatic flow labels default to enabled.

Performance impact:

Running super_netperf with 200 flows for TCP_RR and UDP_RR for
IPv6. Note that in UDP case, __skb_get_hash will be called for
every packet with explains slight regression. In the TCP case
the hash is saved in the socket so there is no regression.

Automatic flow labels disabled:

  TCP_RR:
    86.53% CPU utilization
    127/195/322 90/95/99% latencies
    1.40498e+06 tps

  UDP_RR:
    90.70% CPU utilization
    118/168/243 90/95/99% latencies
    1.50309e+06 tps

Automatic flow labels enabled:

  TCP_RR:
    85.90% CPU utilization
    128/199/337 90/95/99% latencies
    1.40051e+06

  UDP_RR
    92.61% CPU utilization
    115/164/236 90/95/99% latencies
    1.4687e+06

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 21:14:21 -07:00
Tom Herbert
535fb8d006 vxlan: Call udp_flow_src_port
In vxlan and OVS vport-vxlan call common function to get source port
for a UDP tunnel. Removed vxlan_src_port since the functionality is
now in udp_flow_src_port.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 21:14:21 -07:00
Tom Herbert
b8f1a55639 udp: Add function to make source port for UDP tunnels
This patch adds udp_flow_src_port function which is intended to be
a common function that UDP tunnel implementations call to set the source
port. The source port is chosen so that a hash over the outer headers
(IP addresses and UDP ports) acts as suitable hash for the flow of the
encapsulated packet. In this manner, UDP encapsulation works with RSS
and ECMP based wrt the inner flow.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 21:14:21 -07:00
Tom Herbert
b73c3d0e4f net: Save TX flow hash in sock and set in skbuf on xmit
For a connected socket we can precompute the flow hash for setting
in skb->hash on output. This is a performance advantage over
calculating the skb->hash for every packet on the connection. The
computation is done using the common hash algorithm to be consistent
with computations done for packets of the connection in other states
where thers is no socket (e.g. time-wait, syn-recv, syn-cookies).

This patch adds sk_txhash to the sock structure. inet_set_txhash and
ip6_set_txhash functions are added which are called from points in
TCP and UDP where socket moves to established state.

skb_set_hash_from_sk is a function which sets skb->hash from the
sock txhash value. This is called in UDP and TCP transmit path when
transmitting within the context of a socket.

Tested: ran super_netperf with 200 TCP_RR streams over a vxlan
interface (in this case skb_get_hash called on every TX packet to
create a UDP source port).

Before fix:

  95.02% CPU utilization
  154/256/505 90/95/99% latencies
  1.13042e+06 tps

  Time in functions:
    0.28% skb_flow_dissect
    0.21% __skb_get_hash

After fix:

  94.95% CPU utilization
  156/254/485 90/95/99% latencies
  1.15447e+06

  Neither __skb_get_hash nor skb_flow_dissect appear in perf

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 21:14:21 -07:00
Tom Herbert
5ed20a68cd flow_dissector: Abstract out hash computation
Move the hash computation located in __skb_get_hash to be a separate
function which takes flow_keys as input. This will allow flow hash
computation in other contexts where we only have addresses and ports.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 21:14:20 -07:00
Neal Cardwell
86c6a2c75a tcp: switch snt_synack back to measuring transmit time of first SYNACK
Always store in snt_synack the time at which the server received the
first client SYN and attempted to send the first SYNACK.

Recent commit aa27fc501 ("tcp: tcp_v[46]_conn_request: fix snt_synack
initialization") resolved an inconsistency between IPv4 and IPv6 in
the initialization of snt_synack. This commit brings back the idea
from 843f4a55e (tcp: use tcp_v4_send_synack on first SYN-ACK), which
was going for the original behavior of snt_synack from the commit
where it was added in 9ad7c049f0 ("tcp: RFC2988bis + taking RTT
sample from 3WHS for the passive open side") in v3.1.

In addition to being simpler (and probably a tiny bit faster),
unconditionally storing the time of the first SYNACK attempt has been
useful because it allows calculating a performance metric quantifying
how long it took to establish a passive TCP connection.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Cc: Octavian Purdila <octavian.purdila@intel.com>
Cc: Jerry Chu <hkchu@google.com>
Acked-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07 19:26:37 -07:00
Johan Hedberg
cdd6275e51 Bluetooth: Pass desired connection role to hci_connect_le()
If we have both LE scanning and advertising simultaneously enabled we
need a way to tell hci_connect_le() in which role to initiate a
connection. This patch adds a new parameter to the function to give it
the necessary information. For auto-connect and mgmt_pair_device we
always use master role, whereas for L2CAP users (in practice sockets) we
use slave role whenever HCI_ADVERTISING is set and master role
otherwise.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-07 15:18:08 +02:00
Johan Hedberg
d93375a82d Bluetooth: Remove auth_type parameter from hci_connect_le()
The auth_type value which gets assigned to hci_conn->auth_type is
something that's only used for BR/EDR connections and is of no value for
LE connections. It makes therefore little sense to pass it to the
hci_connect_le() function. This patch removes the parameter from the
function.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-07 15:18:07 +02:00
Johan Hedberg
09ae260ba4 Bluetooth: Use lower timeout for LE auto-connections
When we establish connections as a consequence of receiving an
advertising report it makes no sense to wait the normal 20 second LE
connection timeout. This patch modifies the hci_connect_le function to
take an extra timeout value and uses a lower 2 second timeout for the
auto-connection case. This timeout is intentionally chosen to be just a
bit higher than the 1.28 second timeout that High Duty Cycle Advertising
uses.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-06 14:46:15 +03:00
Marcel Holtmann
9713c17b08 Bluetooth: Add support for changing the public device address
This adds support for changing the public device address. This feature
is required by controllers that do not provide a public address and
have HCI_QUIRK_INVALID_BDADDR set.

Even if a controller has a public device address, this is useful when
an embedded system wants to use its own value. As long as the driver
provides the set_bdaddr callback, this allows changing the device
address before powering on the controller.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-06 13:42:20 +03:00
Marcel Holtmann
d603b76b0c Bluetooth: Run controller setup after external configuration
When the external configuration triggers the switch to a configured
controller, it means the setup needs to be run. Controllers that start
out unconfigured have only run limited set of HCI commands. This is
not enough for complete operation and thus run the setup procedure
before announcing the new controller index.

This introduces HCI_CONFIG flag as companion to HCI_SETUP flag. The
HCI_SETUP flag is only used once for the initial setup procedure. And
during that procedure hdev->setup driver callback is called. With the
new HCI_CONFIG the switch from unconfigured to configured state is
triggering the same setup procedure just without hdev->setup. This
is required since bringing a controller back to unconfigured state
from configured state is possible.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-06 13:41:51 +03:00
Johan Hedberg
19de0825cd Bluetooth: Fix sending Device Removed when clearing all parameters
When calling Device Remove with BDADDR_ANY we should in a similar way
emit Device Removed events as we do when removing a single device. Since
we have to iterate the list and call device_removed() the dedicated
hci_conn_params_clear_enabled() is not really useful anymore. This patch
removes the helper function and does the event emission and list item
removal in a single loop.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-06 12:32:26 +02:00
Marcel Holtmann
e30d3f5fef Bluetooth: Store Bluetooth address from controller setup
During the setup phase of a controller, the Bluetooth address will be
read and to have that original address available for later use, store
it as setup address.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-05 15:48:28 +03:00
Marcel Holtmann
f4537c04d3 Bluetooth: Add support for New Configuration Options management event
When one or more of the missing configuration options change, then send
this even to all the other management interface clients.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-04 21:12:00 +03:00
Marcel Holtmann
dbece37a32 Bluetooth: Add support for Set External Configuration management command
The Set External Configuration management command allows for switching
between configured and unconfigured start if HCI_QURIK_EXTERNAL_CONFIG
is set by the transport driver.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-04 21:10:30 +03:00
Marcel Holtmann
eb1904f49d Bluetooth: Add quirk for external configuration requirement
When a controller requires external configuration, then setting this
quirk will allow indicating this.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-04 21:08:15 +03:00
Marcel Holtmann
89bc22d23f Bluetooth: Add quirk for invalid controller address setting
When a Bluetooth controller does not have a valid public Bluetooth
address, then allow the driver to indicate this. If the quirk is
set, the Bluetooth core will switch to unconfigured state first
and will allow userspace to configure the address before starting
the full initialization of the controller.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-04 18:09:32 +03:00
Marcel Holtmann
118305b50a Bluetooth: Document the existing device quirks
The current existing device quirks are not documented. So instead of
spreading bits and pieces somewhere in the code, add proper comments
on where these quirks can be used and what behavior they change.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-04 17:30:48 +03:00
Marcel Holtmann
46ebeb26cd Bluetooth: Fix constant for public address configuration
The public address configuration option is value 0x02 since the generic
external configuration is value 0x01. So adjust this accordingly and
also add the value 0x01 to the list.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-04 14:59:09 +03:00
Johan Hedberg
501f882741 Bluetooth: Make hci_pend_le_conn_lookup more general purposed
In some circumstances we need to look up entries in pend_le_conns and in
other in pend_le_reports. This patch converts the existing lookup
function for pend_le_conns to something that can be used for both lists.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-04 11:58:10 +02:00
Johan Hedberg
d9b3ad7df1 Bluetooth: Remove unused hci_pend_le_conn_add function
Since there are no more users of this function we can simply go ahead
and remove it.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-04 11:58:09 +02:00
Johan Hedberg
d7347f3cc2 Bluetooth: Fix clearing and restarting all LE actions on power cycle
When powering off (hci_dev_do_close) we should clear both the
pend_le_reports and pend_le_conns types of entries. When powering on
respectively we should populate both lists. This patch converts the
hci_pend_le_conns_clear() function into hci_pend_le_actions_clear()
(which can now be static) and converts the restart_le_auto_conns()
function into restart_le_actions().

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-04 11:58:09 +02:00
Johan Hedberg
ae44e5d19e Bluetooth: Remove unused hci_pend_le_conn_del() function
Now that there are no-longer any users of the hci_pend_le_conn_del()
function we can simply go ahead and remove it.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-04 11:58:09 +02:00
Johan Hedberg
66f8455aea Bluetooth: Convert pend_le_reports into a list
To simplify manipulation and lookup of hci_conn_params entries of the
type HCI_AUTO_CONN_REPORT it makes sense to store them in their own
list. The new action list_head in hci_conn_params is used for this
purpose.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-04 11:58:08 +02:00
Johan Hedberg
93450c7544 Bluetooth: Convert pend_le_conn list to a generic action list
In preparation to store also HCI_AUTO_CONN_REPORT entries in a list it
makes sense to convert the existing pend_le_conn list head of
hci_conn_params into a more generically named "action". This makes sense
because a parameter entry will never participate in more than one action
list.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-04 11:58:08 +02:00
Marcel Holtmann
9fc3bfb681 Bluetooth: Add support for controller configuration info command
The Read Controller Configuration Information command allows retrieving
details about possible configurations option. The supported options are
returned and also the missing options (if any).

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-04 08:50:19 +03:00
Johan Hedberg
912b42ef05 Bluetooth: Use hci_conn_params in pend_le_conns
Since the connection parameters are always a basis for adding entries to
hdev->pend_le_conns (so far of type bdaddr_list) it's simpler and more
efficient to have the parameters themselves be the entries in the
pend_le_conns list. We do this by adding another list_head to the
hci_conn_params struct.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 18:45:08 +02:00
Johan Hedberg
851efca838 Bluetooth: Track number of added devices with HCI_AUTO_CONN_REPORT
To be able to make the right choice of whether to start passive scanning
or to send out a mgmt_device_found event we need to know if there are
any devices in the le_conn_params list with the auto_connect value set
to HCI_AUTO_CONN_REPORT. This patch adds a counter for this kind of
devices.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:58 +02:00
Marcel Holtmann
73d1df2a7a Bluetooth: Add support for Read Unconfigured Index List command
This command allows to get the list of currently known controller that
are in unconfigured state.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:58 +02:00
Marcel Holtmann
edd3896bc4 Bluetooth: Add support for Unconfigured Index Removed events
When a controller in an unconfigured state gets removed, then send
Unconfigured Index Removed events.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:58 +02:00
Marcel Holtmann
0602a8adc3 Bluetooth: Add support for Unconfigured Index Added events
When a controller is in unconfigured state it is currently hidden
from the management interface. This change now announces the new
controller with an Unconfigured Index Added event and allows clients
to easily detect the controller.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:58 +02:00
Marcel Holtmann
4a964404c0 Bluetooth: Introduce unconfigured controller state
With the new unconfigured controller state it is possible to provide a
fully functional HCI transport, but disable the higher level operations
that would normally happen. This way userspace can try to configure the
controller before releases the unconfigured state.

The internal state is represented by HCI_UNCONFIGURED. This replaces the
HCI_QUIRK_RAW_DEVICE quirk as internal state representation. This is now
a real state and drivers can use the quirk to actually trigger this
state. In the future this will allow a more fine grained switching from
unconfigured state to configured state for controller inititialization.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:58 +02:00
Johan Hedberg
c46245b3ef Bluetooth: Make is_identity_address a global function
There are more places that can take advantage of is_identity_address()
besides hci_core.c. This patch moves the function to hci_core.h and
gives it the appropriate hci_ prefix.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:57 +02:00
Johan Hedberg
f4869e2adb Bluetooth: Pass store hint to mgmt_new_conn_param
The calling functions of mgmt_new_conn_param have more information about
the parameters, such as whether the kernel is tracking them or not. It
makes therefore sense to have them pass an initial store_hint value to
the mgmt_new_conn_param function.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:57 +02:00
Johan Hedberg
7d6ca6939c Bluetooth: Make hci_le_conn_update return the store hint
The caller of hci_le_conn_update is directly interested in knowing what
the best value is for the store_hint parameter of the corresponding
mgmt event. Since hci_le_conn_update knows whether there were stored
parameters that were updated or not we can have it return an initial
store_hint value to the caller.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:57 +02:00
Johan Hedberg
a26f3dcff2 Bluetooth: Add Load Connection Parameters command
This patch implements the new Load Connection Parameters mgmt command
that's intended to load the desired connection parameters for LE
devices.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:57 +02:00
Johan Hedberg
a3451d279f Bluetooth: Add new auto_conn value matching mgmt action 0x00
The 0x00 action value of mgmt means "scan and report" but do not
connect. This is different from HCI_AUTO_CONN_DISABLED so we need a new
value for it.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:57 +02:00
Johan Hedberg
55af49a8fe Bluetooth: Add specific connection parameter clear functions
In some circumstances we'll need to either clear only the enabled
parameters or only the disabled ones. This patch adds convenience
functions for this purpose.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:56 +02:00
Johan Hedberg
373110c5d3 Bluetooth: Rename hci_conn_params_clear to hci_conn_params_clear_all
We'll soon have specific clear functions for clearing enabled or
disabled entries, so rename the function that removes everything to
clear_all().

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:56 +02:00
Marcel Holtmann
24c457e270 Bluetooth: Add support for hdev->set_bdaddr callback handling
Some embedded controllers allow the programming of a public address
and this adds vendor support for supporting OEM confguration of such
addresses.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:55 +02:00
Andre Guedes
ffb5a827d5 Bluetooth: Introduce "New Connection Parameter" Event
This patch introduces a new Mgmt event called "New Connection Parameter".
This event indicates to userspace the connection parameters values the
remote device requested.

The user may store these values and load them into kernel. This way, next
time a connection is established to that device, the kernel will use those
parameters values instead of the default ones.

This event is sent when the remote device requests new connection
parameters through connection parameter update procedure. This event is
not sent for slave connections.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:55 +02:00
Andre Guedes
662bc2e63d Bluetooth: Enable new LE meta event
The Bluetooth 4.1 introduces a new LE meta event called "LE Remote
Connection Parameter Request" event. In order to the controller
sends this event to host, we should enable it during controller
initialization.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:55 +02:00
Andre Guedes
8e75b46a4f Bluetooth: Connection Parameter Update Procedure
This patch adds support for LE Connection Parameters Request Link
Layer control procedure introduced in Core spec 4.1. This procedure
allows a Peripheral or Central to update the Link Layer connection
parameters of an established connection.

Regarding the acceptance of connection parameters, the LL procedure
follows the same approach of L2CAP procedure (see l2cap_conn_param_
update_req function). We accept any connection parameters values as
long as they are within the valid range.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:54 +02:00
Johan Hedberg
2a8357f239 Bluetooth: Fix redundant device (un)blocked events
For the Block/Unblock Device mgmt commands we should only emit the
Blocked/Unblocked events on any socket except for the one which received
the command. The code was previously incorrectly trying to look up a
non-existent pending command and thereby ending up not skipping the
command socket for the event.

We can simplify the code a lot by simply sending the event directly from
the command handler functions. We have the reference to the command
socket available there which makes it easy to pass to the mgmt_event
function for skipping.

The only notable side-effect of this is that the old blacklisting
ioctl's no-longer cause mgmt events to be emitted, however as user space
versions using these ioctl's are not mgmt-aware this is acceptable.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:54 +02:00
Johan Hedberg
fe59a05f94 Bluetooth: Add flag to track STK encryption
There are certain subtle differences in behavior when we're encrypted
with the STK, such as allowing re-encryption even though the security
level stays the same. Because of this, add a flag to track whether we're
encrypted with an STK or not.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:54 +02:00
Marcel Holtmann
c70a7e4cc8 Bluetooth: Add support for Not Connectable flag for Device Found events
The Device Found events of the management interface should indicate if
it is possible to connect to a remote device or if it is broadcaster
only advertising. To allow this differentation the Not Connectable flag
is introduced that will be set when it is known that a device can not
be connected.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:53 +02:00
Marcel Holtmann
af58925ca6 Bluetooth: Provide flags parameter direct to mgmt_device_found
Providing the flags parameter directly to mgmt_device_found function
makes the core simpler and more readable. With this it becomes a lot
easier to add new flags in the future.

This also changes hci_inquiry_cache_update to just return that flags
needed for mgmt_device_found since that is its only use for the two
return parameters anyway.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:53 +02:00
Marcel Holtmann
d06b50ce14 Bluetooth: Remove connection interval parameters from hci_conn_params_set
The connection interval parameter of hci_conn_params_set are always used
with the controller defaults. So just let hci_conn_params_add set the
controller default and not bother resetting them to controller defaults
every time the hci_conn_params_set is called.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:53 +02:00
Marcel Holtmann
51d167c097 Bluetooth: Change hci_conn_params_add to return the parameter struct
When adding new connection parameters, it is useful to return either
the existing struct or the newly created one.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:52 +02:00
Andre Guedes
d4905f2453 Bluetooth: Connection parameters check helper
This patch renames l2cap_check_conn_param() to hci_check_conn_params()
and moves it to hci_core.h so it can reused in others files. This helper
will be reused in the next patch.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:52 +02:00
Marcel Holtmann
bf5b3c8be0 Bluetooth: Provide function to create and set connection parameters
In some cases it is useful to not overwrite connection parametes and
instead just create default ones if they don't exist. This function
does exactly that. hci_conn_params_add will allow to create new
default connection parameters. hci_conn_params_set will set the
values and also create new parameters if they don't exist.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:51 +02:00
Marcel Holtmann
04fb7d9066 Bluetooth: Provide defaults for LE connection latency and timeout
Store the connection latency and supervision timeout default values
with all the other controller defaults. And when needed use them
for new connections.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:51 +02:00
Marcel Holtmann
8afef092a1 Bluetooth: Add Device Added and Device Removed management events
When devices are added or removed, then make sure that events are send
out to all other clients so that the list of devices can be easily
tracked. This is especially important when external clients are
adding or removing devices within the auto-connection list.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:50 +02:00
Marcel Holtmann
2faade53e6 Bluetooth: Add support for Add/Remove Device management commands
This allows adding or removing devices from the background scanning
list the kernel maintains. Device flagged for auto-connection will
be automatically connected if they are found.

The passive scanning required for auto-connection will be started
and stopped on demand.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:50 +02:00
Marcel Holtmann
f044eb0524 Bluetooth: Store latency and supervision timeout in connection params
When the slave updates the connection parameters, store also the
connection latency and supervision timeout information in the
internal list of connection parameters for known devices.

Having these values available allowes the auto-connection
procedure to use the correct values from the beginning without
having to request an update on every connection establishment.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:50 +02:00
Johan Hedberg
958684263d Bluetooth: Add support for Get Clock Info mgmt command
This patch implements support for the Get Clock Information mgmt
command. This is done by performing one or two HCI_Read_Clock commands
and creating the response from the stored values in the hci_dev and
hci_conn structs.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:49 +02:00
Johan Hedberg
33f3572103 Bluetooth: Add tracking of local and piconet clock values
This patch adds support for storing the local and piconet clock values
from the HCI_Read_Clock command response to the hci_dev and hci_conn
structs. This will be later used in another patch to implement support
for the Get Clock Info mgmt command.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:48 +02:00
Marcel Holtmann
df935429be Bluetooth: Send HCI_Read_Clock_Offset before disconnecting
When the connection is in master role and it is going to be
disconnected based on the disconnection timeout, then send
the HCI_Read_Clock_Offset command in an attempt to update the
clock offset value in the inquiry cache.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:48 +02:00
Johan Hedberg
b10e8017bd Bluetooth: Remove unnecessary hcon->smp_conn variable
The smp_conn member of struct hci_conn was simply a pointer to the
l2cap_conn object. Since we already have hcon->l2cap_data that points to
the same thing there's no need to have this second variable. This patch
removes it and changes the single place that was using it to use
hcon->l2cap_data instead.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:47 +02:00
Andre Guedes
dbbfa2ab7a Bluetooth: Use macro instead of hard-coded value
This patch replaces the hard-coded value in hci_bdaddr_is_rpa() helper
by the corresponding macro ADDR_LE_DEV_RANDOM.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:46 +02:00
Johan Hedberg
4dae27983e Bluetooth: Convert hci_conn->link_mode into flags
Since the link_mode member of the hci_conn struct is a bit field and we
already have a flags member as well it makes sense to merge these two
together. This patch moves all used link_mode bits into corresponding
flags. To keep backwards compatibility with user space we still need to
provide a get_link_mode() helper function for the ioctl's that expect a
link_mode style value.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:46 +02:00
Johan Hedberg
3769972bad Bluetooth: Add a new HCI_USE_DEBUG_KEYS flag
To pave the way for actively using debug keys for pairing this patch
adds a new HCI_USE_DEBUG_KEYS flag for the purpose. When the flag is set
we issue a HCI_Write_SSP_Debug mode whenever HCI_Write_SSP_Mode(0x01)
has been issued as well as before issuing a HCI_Write_SSP_Mode(0x00)
command.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:46 +02:00
Johan Hedberg
af6a9c3213 Bluetooth: Convert hcon->flush_key to a proper flag
There's no point in having boolean variables in the hci_conn struct
since it already has a flags member. This patch converts the flush_key
member into a proper flag.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:46 +02:00
Johan Hedberg
0663b297f1 Bluetooth: Rename HCI_DEBUG_KEYS to HCI_KEEP_DEBUG_KEYS
We're planning to add a flag to actively use debug keys in addition to
simply just accepting them, which makes the current generically named
DEBUG_KEYS flag a bit confusing. Since the flag in practice affects
whether the kernel keeps debug keys around or not rename it to
HCI_KEEP_DEBUG_KEYS.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:45 +02:00
Johan Hedberg
7652ff6aea Bluetooth: Move mgmt event sending out from hci_add_link_key()
There are two callers of hci_add_link_key(). The first one is the HCI
Link Key Notification event and the second one the mgmt code that
receives a list of link keys from user space. Previously we've had the
hci_add_link_key() function being responsible for also emitting a mgmt
signal but for the latter use case this should not happen. Because of
this a rather awkward new_key paramter has been passed to the function.

This patch moves the mgmt event sending out from the hci_add_link_key()
function, thereby making the code a bit more understandable.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:45 +02:00
Johan Hedberg
567fa2aa3d Bluetooth: Update hci_add_link_key() to return pointer to key
By returning the added (or updated) key we pave the way for further
refactoring (in subsequent patches) that allows moving the mgmt event
sending out from this function (and thereby removal of the awkward
new_key parameter).

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:45 +02:00
Marcel Holtmann
1855d92dce Bluetooth: Track LE connection parameter update event
When the LE controller changes its connection parameters, it will send
a connection parameter update event. Make sure that the new set of
parameters are stored in hci_conn struct and thus will properly update
the previous values retrieved from the connection complete event.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:44 +02:00
Marcel Holtmann
e04fde60ef Bluetooth: Store current LE connection parameters in hci_conn struct
The LE connection parameters are needed later on to be able to decide
if it is required to trigger connection update procedures. So when the
connection has been established successfully, store the current used
parameters in hci_conn struct.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:44 +02:00
Jukka Rissanen
6b8d4a6a03 Bluetooth: 6LoWPAN: Use connected oriented channel instead of fixed one
Create a CoC dynamically instead of one fixed channel for communication
to peer devices.

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:44 +02:00
Jukka Rissanen
0498878b18 Bluetooth: Provide L2CAP ops callback for memcpy_fromiovec
The highly optimized TX path for L2CAP channels and its fragmentation
within the HCI ACL packets requires to copy data from user provided
IO vectors and also kernel provided memory buffers.

This patch allows channel clients to provide a memcpy_fromiovec callback
to keep this optimized behavior, but adapt it to kernel vs user memory
for the TX path. For all kernel internal L2CAP channels, a default
implementation is provided that can be referenced.

In case of A2MP, this fixes a long-standing issue with wrongly accessing
kernel memory as user memory.

This patch originally by Marcel Holtmann.

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:43 +02:00
Marcel Holtmann
111902f723 Bluetooth: Use separate dbg_flags to special debugfs options
All the special settings configured via debugfs are either developer
only options or temporary solutions. To not clutter the standard flags,
move them to their own dbg_flags entry.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:43 +02:00
Johan Hedberg
ddf4ba078f Bluetooth: Remove unused LTK authentication defines
These defines were probably put in to track authenticated vs
unauthenticated LTKs, however since the LTK struct has a separate
boolean authenticated member these were never used.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:42 +02:00
Johan Hedberg
2ceba53936 Bluetooth: Remove HCI prefix from SMP LTK defines
The LTK type has really nothing to do with HCI so it makes more sense to
have these in smp.h than hci.h. This patch moves the defines to smp.h
and removes the HCI_ prefix in the same go.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:42 +02:00
Johan Hedberg
7d5843b7b7 Bluetooth: Remove unnecessary SMP STK define
We never store the "master" type of STKs since we request encryption
directly with them so we only need one STK type (the one that's
looked-up on the slave side). Simply remove the unnecessary define and
rename the _SLAVE one to the shorter form.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-07-03 17:42:42 +02:00
Marcel Holtmann
65cc2b49db Bluetooth: Use struct delayed_work for HCI command timeout
Since the whole HCI command, event and data packet processing has been
migrated to use workqueues instead of tasklets, it makes sense to use
struct delayed_work instead of struct timer_list for the timeout
handling. This patch converts the hdev->cmd_timer to use workqueue
as well.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:42 +02:00
Marcel Holtmann
d9fbd02be5 Bluetooth: Use explicit header and body length for L2CAP SKB allocation
When allocating the L2CAP SKB for transmission, provide the upper layers
with a clear distinction on what is the header and what is the body
portion of the SKB.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:42 +02:00
Marcel Holtmann
0775899158 Bluetooth: Shrink size of struct l2cap_ctrl fields
The struct l2cap_ctrl fields are wasting an unsigned int when all the
bits can fit into an __u8 field.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:41 +02:00
Marcel Holtmann
8d46321c4f Bluetooth: Assign L2CAP socket priority when allocating SKB
The SKB for L2CAP sockets are all allocated in a central callback
in the socket support. Instead of having to pass around the socket
priority all the time, assign it to skb->priority when actually
allocating the SKB.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:41 +02:00
Marcel Holtmann
67f86a45bb Bluetooth: Use const for struct l2cap_ops field
The struct l2cap_ops field should not allow any modifications and thus
it is better declared as const.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-07-03 17:42:41 +02:00
Daniel Borkmann
8f61059a96 net: sctp: improve timer slack calculation for transport HBs
RFC4960, section 8.3 says:

  On an idle destination address that is allowed to heartbeat,
  it is recommended that a HEARTBEAT chunk is sent once per RTO
  of that destination address plus the protocol parameter
  'HB.interval', with jittering of +/- 50% of the RTO value,
  and exponential backoff of the RTO if the previous HEARTBEAT
  is unanswered.

Currently, we calculate jitter via sctp_jitter() function first,
and then add its result to the current RTO for the new timeout:

  TMO = RTO + (RAND() % RTO) - (RTO / 2)
              `------------------------^-=> sctp_jitter()

Instead, we can just simplify all this by directly calculating:

  TMO = (RTO / 2) + (RAND() % RTO)

With the help of prandom_u32_max(), we don't need to open code
our own global PRNG, but can instead just make use of the per
CPU implementation of prandom with better quality numbers. Also,
we can now spare us the conditional for divide by zero check
since no div or mod operation needs to be used. Note that
prandom_u32_max() won't emit the same result as a mod operation,
but we really don't care here as we only want to have a random
number scaled into RTO interval.

Note, exponential RTO backoff is handeled elsewhere, namely in
sctp_do_8_2_transport_strike().

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-02 18:44:07 -07:00
Alexander Aring
48bc03433c ieee802154: reassembly: fix possible buffer overflow
The max_dsize attribute in ctl_table for lowpan_frags_ns_ctl_table is
configured with integer accessing methods. This patch change the
max_dsize attribute to int to avoid a possible buffer overflow.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-02 18:34:25 -07:00
Eric Dumazet
5925a0555b net: fix sparse warning in sk_dst_set()
sk_dst_cache has __rcu annotation, so we need a cast to avoid
following sparse error :

include/net/sock.h:1774:19: warning: incorrect type in initializer (different address spaces)
include/net/sock.h:1774:19:    expected struct dst_entry [noderef] <asn:4>*__ret
include/net/sock.h:1774:19:    got struct dst_entry *dst

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 7f50236153 ("ipv4: irq safe sk_dst_[re]set() and ipv4_sk_update_pmtu() fix")
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-02 17:03:06 -07:00
Eric Dumazet
9fe516ba3f inet: move ipv6only in sock_common
When an UDP application switches from AF_INET to AF_INET6 sockets, we
have a small performance degradation for IPv4 communications because of
extra cache line misses to access ipv6only information.

This can also be noticed for TCP listeners, as ipv6_only_sock() is also
used from __inet_lookup_listener()->compute_score()

This is magnified when SO_REUSEPORT is used.

Move ipv6only into struct sock_common so that it is available at
no extra cost in lookups.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-01 23:46:21 -07:00
Eric Dumazet
7f50236153 ipv4: irq safe sk_dst_[re]set() and ipv4_sk_update_pmtu() fix
We have two different ways to handle changes to sk->sk_dst

First way (used by TCP) assumes socket lock is owned by caller, and use
no extra lock : __sk_dst_set() & __sk_dst_reset()

Another way (used by UDP) uses sk_dst_lock because socket lock is not
always taken. Note that sk_dst_lock is not softirq safe.

These ways are not inter changeable for a given socket type.

ipv4_sk_update_pmtu(), added in linux-3.8, added a race, as it used
the socket lock as synchronization, but users might be UDP sockets.

Instead of converting sk_dst_lock to a softirq safe version, use xchg()
as we did for sk_rx_dst in commit e47eb5dfb2 ("udp: ipv4: do not use
sk_dst_lock from softirq context")

In a follow up patch, we probably can remove sk_dst_lock, as it is
only used in IPv6.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Fixes: 9cb3a50c5f ("ipv4: Invalidate the socket cached route on pmtu events if possible")
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-30 23:40:58 -07:00
Octavian Purdila
1fb6f159fd tcp: add tcp_conn_request
Create tcp_conn_request and remove most of the code from
tcp_v4_conn_request and tcp_v6_conn_request.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:37 -07:00
Octavian Purdila
695da14eb0 tcp: add queue_add_hash to tcp_request_sock_ops
Add queue_add_hash member to tcp_request_sock_ops so that we can later
unify tcp_v4_conn_request and tcp_v6_conn_request.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:36 -07:00
Octavian Purdila
2aec4a297b tcp: add mss_clamp to tcp_request_sock_ops
Add mss_clamp member to tcp_request_sock_ops so that we can later
unify tcp_v4_conn_request and tcp_v6_conn_request.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:36 -07:00
Octavian Purdila
5db92c9949 tcp: unify tcp_v4_rtx_synack and tcp_v6_rtx_synack
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:36 -07:00
Octavian Purdila
d6274bd8d6 tcp: add send_synack method to tcp_request_sock_ops
Create a new tcp_request_sock_ops method to unify the IPv4/IPv6
signature for tcp_v[46]_send_synack. This allows us to later unify
tcp_v4_rtx_synack with tcp_v6_rtx_synack and tcp_v4_conn_request with
tcp_v4_conn_request.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:36 -07:00
Octavian Purdila
936b8bdb53 tcp: add init_seq method to tcp_request_sock_ops
More work in preparation of unifying tcp_v4_conn_request and
tcp_v6_conn_request: indirect the init sequence calls via the
tcp_request_sock_ops.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:36 -07:00
Octavian Purdila
d94e0417ad tcp: add route_req method to tcp_request_sock_ops
Create wrappers with same signature for the IPv4/IPv6 request routing
calls and use these wrappers (via route_req method from
tcp_request_sock_ops) in tcp_v4_conn_request and tcp_v6_conn_request
with the purpose of unifying the two functions in a later patch.

We can later drop the wrapper functions and modify inet_csk_route_req
and inet6_cks_route_req to use the same signature.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:36 -07:00
Octavian Purdila
fb7b37a7f3 tcp: add init_cookie_seq method to tcp_request_sock_ops
Move the specific IPv4/IPv6 cookie sequence initialization to a new
method in tcp_request_sock_ops in preparation for unifying
tcp_v4_conn_request and tcp_v6_conn_request.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:35 -07:00
Octavian Purdila
16bea70aa7 tcp: add init_req method to tcp_request_sock_ops
Move the specific IPv4/IPv6 intializations to a new method in
tcp_request_sock_ops in preparation for unifying tcp_v4_conn_request
and tcp_v6_conn_request.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:35 -07:00
Octavian Purdila
476eab8251 net: remove inet6_reqsk_alloc
Since pktops is only used for IPv6 only and opts is used for IPv4
only, we can move these fields into a union and this allows us to drop
the inet6_reqsk_alloc function as after this change it becomes
equivalent with inet_reqsk_alloc.

This patch also fixes a kmemcheck issue in the IPv6 stack: the flags
field was not annotated after a request_sock was allocated.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:35 -07:00
Octavian Purdila
57b47553f6 tcp: cookie_v4_init_sequence: skb should be const
Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-27 15:53:35 -07:00
Pablo Neira Ayuso
960649d192 netfilter: bridge: add generic packet logger
This adds the generic plain text packet loggger for bridged packets.
It routes the logging message to the real protocol packet logger.
I decided not to refactor the ebt_log code for two reasons:

1) The ebt_log output is not consistent with the IPv4 and IPv6
   Netfilter packet loggers. The output is different for no good
   reason and it adds redundant code to handle packet logging.

2) To avoid breaking backward compatibility for applications
   outthere that are parsing the specific ebt_log output, the ebt_log
   output has been left as is. So only nftables will use the new
   consistent logging format for logged bridged packets.

More decisions coming in this patch:

1) This also removes ebt_log as default logger for bridged packets.
   Thus, nf_log_packet() routes packet to this new packet logger
   instead. This doesn't break backward compatibility since
   nf_log_packet() is not used to log packets in plain text format
   from anywhere in the ebtables/netfilter bridge code.

2) The new bridge packet logger also performs a lazy request to
   register the real IPv4, ARP and IPv6 netfilter packet loggers.
   If the real protocol logger is no available (not compiled or the
   module is not available in the system, not packet logging happens.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-27 13:20:47 +02:00
Pablo Neira Ayuso
fab4085f4e netfilter: log: nf_log_packet() as real unified interface
Before this patch, the nf_loginfo parameter specified the logging
configuration in case the specified default logger was loaded. This
patch updates the semantics of the nf_loginfo parameter in
nf_log_packet() which now indicates the logger that you explicitly
want to use.

Thus, nf_log_packet() is exposed as an unified interface which
internally routes the log message to the corresponding logger type
by family.

The module dependencies are expressed by the new nf_logger_find_get()
and nf_logger_put() functions which bump the logger module refcount.
Thus, you can not remove logger modules that are used by rules anymore.

Another important effect of this change is that the family specific
module is only loaded when required. Therefore, xt_LOG and nft_log
will just trigger the autoload of the nf_log_{ip,ip6} modules
according to the family.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-27 13:20:13 +02:00
Pablo Neira Ayuso
83e96d443b netfilter: log: split family specific code to nf_log_{ip,ip6,common}.c files
The plain text logging is currently embedded into the xt_LOG target.
In order to be able to use the plain text logging from nft_log, as a
first step, this patch moves the family specific code to the following
files and Kconfig symbols:

1) net/ipv4/netfilter/nf_log_ip.c: CONFIG_NF_LOG_IPV4
2) net/ipv6/netfilter/nf_log_ip6.c: CONFIG_NF_LOG_IPV6
3) net/netfilter/nf_log_common.c: CONFIG_NF_LOG_COMMON

These new modules will be required by xt_LOG and nft_log. This patch
is based on original patch from Arturo Borrero Gonzalez.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-27 13:19:59 +02:00
David S. Miller
9b8d90b963 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-06-25 22:40:43 -07:00
Eric Dumazet
f886497212 ipv4: fix dst race in sk_dst_get()
When IP route cache had been removed in linux-3.6, we broke assumption
that dst entries were all freed after rcu grace period. DST_NOCACHE
dst were supposed to be freed from dst_release(). But it appears
we want to keep such dst around, either in UDP sockets or tunnels.

In sk_dst_get() we need to make sure dst refcount is not 0
before incrementing it, or else we might end up freeing a dst
twice.

DST_NOCACHE set on a dst does not mean this dst can not be attached
to a socket or a tunnel.

Then, before actual freeing, we need to observe a rcu grace period
to make sure all other cpus can catch the fact the dst is no longer
usable.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dormando <dormando@rydia.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-25 17:41:44 -07:00
Pablo Neira Ayuso
27fd8d90c9 netfilter: nf_log: move log buffering to core logging
This patch moves Eric Dumazet's log buffer implementation from the
xt_log.h header file to the core net/netfilter/nf_log.c. This also
includes the renaming of the structure and functions to avoid possible
undesired namespace clashes.

This change allows us to use it from the arp and bridge packet logging
implementation in follow up patches.
2014-06-25 19:28:43 +02:00
Pablo Neira Ayuso
5962815a6a netfilter: nf_log: use an array of loggers instead of list
Now that legacy ulog targets are not available anymore in the tree, we
can have up to two possible loggers:

1) The plain text logging via kernel logging ring.
2) The nfnetlink_log infrastructure which delivers log messages
   to userspace.

This patch replaces the list of loggers by an array of two pointers
per family for each possible logger and it also introduces a new field
to the nf_logger structure which indicates the position in the logger
array (based on the logger type).

This prepares a follow up patch that consolidates the nf_log_packet()
interface by allowing to specify the logger as parameter.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-25 19:28:43 +02:00
Florian Westphal
9500507c61 netfilter: conntrack: remove timer from ecache extension
This brings the (per-conntrack) ecache extension back to 24 bytes in size
(was 152 byte on x86_64 with lockdep on).

When event delivery fails, re-delivery is attempted via work queue.

Redelivery is attempted at least every 0.1 seconds, but can happen
more frequently if userspace is not congested.

The nf_ct_release_dying_list() function is removed.
With this patch, ownership of the to-be-redelivered conntracks
(on-dying-list-with-DYING-bit not yet set) is with the work queue,
which will release the references once event is out.

Joint work with Pablo Neira Ayuso.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-25 19:15:38 +02:00
Michal Kazior
97dc94f1d9 cfg80211: remove channel_switch combination check
Driver is now responsible for veryfing if the
switch is possible.

Since this is inherently tricky driver may decide
to disconnect an interface later with
cfg80211_stop_iface().

This doesn't mean driver can accept everything. It
should do it's best to verify requests and reject
them as soon as possible.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-06-25 18:06:20 +02:00
Michal Kazior
5bcae31d9c mac80211: implement multi-vif in-place reservations
Multi-vif in-place reservations happen when
it is impossible to allocate more channel contexts
as indicated by interface combinations.

Such reservations are not finalized until all
assigned interfaces are ready.

This still doesn't handle all possible cases
(i.e. degradation of number of channels) properly.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-06-25 18:06:20 +02:00
David Spinadel
633e271326 mac80211: split sched scan IEs
Split sched scan IEs to band specific and not band specific
blocks. Common IEs blocks may be sent to the FW once per command,
instead of per band.

This allows optimization of size of the command, which may be
required by some drivers (eg. iwlmvm with newer firmware version).

As this changes the mac80211 API, update all drivers to use the
new version correctly, even if they don't (yet) make use of the
split data.

Signed-off-by: David Spinadel <david.spinadel@intel.com>
Reviewed-by: Alexander Bondar <alexander.bondar@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-06-25 09:10:43 +02:00
David Spinadel
c56ef67250 mac80211: support more than one band in scan request
Some drivers (such as iwlmvm) can handle multiple bands in a single
HW scan request. Add a HW flag to indicate that the driver support
this. To hold the required data, create a separate structure for
HW scan request that holds cfg scan request and data about
different parts of the scan IEs.

As this changes the mac80211 API, update all drivers using it to
use the correct new function type/argument.

Signed-off-by: David Spinadel <david.spinadel@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-06-25 09:10:42 +02:00
Govindarajulu Varadarajan
e0f31d8498 flow_keys: Record IP layer protocol in skb_flow_dissect()
skb_flow_dissect() dissects only transport header type in ip_proto. It dose not
give any information about IPv4 or IPv6.

This patch adds new member, n_proto, to struct flow_keys. Which records the
IP layer type. i.e IPv4 or IPv6.

This can be used in netdev->ndo_rx_flow_steer driver function to dissect flow.

Adding new member to flow_keys increases the struct size by around 4 bytes.
This causes BUILD_BUG_ON(sizeof(qcb->data) < sz); to fail in
qdisc_cb_private_validate()

So increase data size by 4

Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-23 14:32:19 -07:00
Arik Nemtsov
ee10f2c779 mac80211: protect TDLS discovery session
After sending a TDLS discovery-request, we expect a reply to arrive on
the AP's channel. We must stay on the channel (no PSM, scan, etc.), since
a TDLS setup-response is a direct packet not buffered by the AP.
Add a new mac80211 driver callback to allow discovery session protection.

Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-06-23 14:28:19 +02:00
Arik Nemtsov
c887f0d3a0 mac80211: add API to request TDLS operation from userspace
Write a mac80211 to the cfg80211 API for requesting a userspace TDLS
operation. Define TDLS specific reason codes that can be used here.

Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-06-23 14:28:17 +02:00
Arik Nemtsov
31fa97c5de cfg80211: pass TDLS initiator in tdls_mgmt operations
The TDLS initiator is set once during link setup. If determines the
address ordering in the link identifier IE.

Fix dependent drivers - mwifiex and mac80211.

Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-06-23 14:24:55 +02:00
Johannes Berg
b7ffbd7ef6 cfg80211: make ethtool the driver's responsibility
Currently, cfg80211 tries to implement ethtool, but that doesn't
really scale well, with all the different operations. Make the
lower-level driver responsible for it, which currently only has
an effect on mac80211. It will similarly not scale well at that
level though, since mac80211 also has many drivers.

To cleanly implement this in mac80211, introduce a new file and
move some code to appropriate places.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-06-23 11:05:33 +02:00
Octavian Purdila
e0f802fbca tcp: move ir_mark initialization to tcp_openreq_init
ir_mark initialization is done for both TCP v4 and v6, move it in the
common tcp_openreq_init function.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-17 15:30:54 -07:00
Pablo Neira Ayuso
a0a7379e16 netfilter: nf_tables: use u32 for chain use counter
Since 4fefee5 ("netfilter: nf_tables: allow to delete several objects
from a batch"), every new rule bumps the chain use counter. However,
this is limited to 16 bits, which means that it will overrun after
2^16 rules.

Use a u32 chain counter and check for overflows (just like we do for
table objects).

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-16 13:07:44 +02:00
Tom Herbert
bbdff225ed udp: call __skb_checksum_complete when doing full checksum
In __udp_lib_checksum_complete check if checksum is being done over all
the data (len is equal to skb->len) and if it is call
__skb_checksum_complete instead of __skb_checksum_complete_head. This
allows checksum to be saved in checksum complete.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-15 01:00:49 -07:00
Linus Torvalds
f9da455b93 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov.

 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J
    Benniston.

 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn
    Mork.

 4) BPF now has a "random" opcode, from Chema Gonzalez.

 5) Add more BPF documentation and improve test framework, from Daniel
    Borkmann.

 6) Support TCP fastopen over ipv6, from Daniel Lee.

 7) Add software TSO helper functions and use them to support software
    TSO in mvneta and mv643xx_eth drivers.  From Ezequiel Garcia.

 8) Support software TSO in fec driver too, from Nimrod Andy.

 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli.

10) Handle broadcasts more gracefully over macvlan when there are large
    numbers of interfaces configured, from Herbert Xu.

11) Allow more control over fwmark used for non-socket based responses,
    from Lorenzo Colitti.

12) Do TCP congestion window limiting based upon measurements, from Neal
    Cardwell.

13) Support busy polling in SCTP, from Neal Horman.

14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru.

15) Bridge promisc mode handling improvements from Vlad Yasevich.

16) Don't use inetpeer entries to implement ID generation any more, it
    performs poorly, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
  rtnetlink: fix userspace API breakage for iproute2 < v3.9.0
  tcp: fixing TLP's FIN recovery
  net: fec: Add software TSO support
  net: fec: Add Scatter/gather support
  net: fec: Increase buffer descriptor entry number
  net: fec: Factorize feature setting
  net: fec: Enable IP header hardware checksum
  net: fec: Factorize the .xmit transmit function
  bridge: fix compile error when compiling without IPv6 support
  bridge: fix smatch warning / potential null pointer dereference
  via-rhine: fix full-duplex with autoneg disable
  bnx2x: Enlarge the dorq threshold for VFs
  bnx2x: Check for UNDI in uncommon branch
  bnx2x: Fix 1G-baseT link
  bnx2x: Fix link for KR with swapped polarity lane
  sctp: Fix sk_ack_backlog wrap-around problem
  net/core: Add VF link state control policy
  net/fsl: xgmac_mdio is dependent on OF_MDIO
  net/fsl: Make xgmac_mdio read error message useful
  net_sched: drr: warn when qdisc is not work conserving
  ...
2014-06-12 14:27:40 -07:00
Florian Westphal
6e765a009a net_sched: drr: warn when qdisc is not work conserving
The DRR scheduler requires that items on the active list are work
conserving, i.e. do not hold on to skbs for throttling purposes, etc.
Attaching e.g. tbf renders DRR useless because all other classes on the
active list are delayed as well.

So, warn users that this configuration won't work as expected; we
already do this in couple of other qdiscs, see e.g.

commit b00355db3f
('pkt_sched: sch_hfsc: sch_htb: Add non-work-conserving warning handler')

The 'const' change is needed to avoid compiler warning ("discards 'const'
qualifier from pointer target type").

tested with:
drr_hier() {
        parent=$1
        classes=$2
        for i in  $(seq 1 $classes); do
                classid=$parent$(printf %x $i)
                tc class add dev eth0 parent $parent classid $classid drr
		tc qdisc add dev eth0 parent $classid tbf rate 64kbit burst 256kbit limit 64kbit
        done
}
tc qdisc add dev eth0 root handle 1: drr
drr_hier 1: 32
tc filter add dev eth0 protocol all pref 1 parent 1: handle 1 flow hash keys dst perturb 1 divisor 32

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 15:50:59 -07:00
Daniel Borkmann
e575235fc6 net: sctp: migrate most recently used transport to ktime
Be more precise in transport path selection and use ktime
helpers instead of jiffies to compare and pick the better
primary and secondary recently used transports. This also
avoids any side-effects during a possible roll-over, and
could lead to better path decision-making.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 12:23:17 -07:00
Octavian Purdila
6cc55e096f tcp: add gfp parameter to tcp_fragment
tcp_fragment can be called from process context (from tso_fragment).
Add a new gfp parameter to allow it to preserve atomic memory if
possible.

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Reviewed-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10 22:30:58 -07:00
Linus Torvalds
3f17ea6dea Merge branch 'next' (accumulated 3.16 merge window patches) into master
Now that 3.15 is released, this merges the 'next' branch into 'master',
bringing us to the normal situation where my 'master' branch is the
merge window.

* accumulated work in next: (6809 commits)
  ufs: sb mutex merge + mutex_destroy
  powerpc: update comments for generic idle conversion
  cris: update comments for generic idle conversion
  idle: remove cpu_idle() forward declarations
  nbd: zero from and len fields in NBD_CMD_DISCONNECT.
  mm: convert some level-less printks to pr_*
  MAINTAINERS: adi-buildroot-devel is moderated
  MAINTAINERS: add linux-api for review of API/ABI changes
  mm/kmemleak-test.c: use pr_fmt for logging
  fs/dlm/debug_fs.c: replace seq_printf by seq_puts
  fs/dlm/lockspace.c: convert simple_str to kstr
  fs/dlm/config.c: convert simple_str to kstr
  mm: mark remap_file_pages() syscall as deprecated
  mm: memcontrol: remove unnecessary memcg argument from soft limit functions
  mm: memcontrol: clean up memcg zoneinfo lookup
  mm/memblock.c: call kmemleak directly from memblock_(alloc|free)
  mm/mempool.c: update the kmemleak stack trace for mempool allocations
  lib/radix-tree.c: update the kmemleak stack trace for radix tree allocations
  mm: introduce kmemleak_update_trace()
  mm/kmemleak.c: use %u to print ->checksum
  ...
2014-06-08 11:31:16 -07:00
Tom Herbert
359a0ea987 vxlan: Add support for UDP checksums (v4 sending, v6 zero csums)
Added VXLAN link configuration for sending UDP checksums, and allowing
TX and RX of UDP6 checksums.

Also, call common iptunnel_handle_offloads and added GSO support for
checksums.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-04 22:46:39 -07:00
Tom Herbert
4749c09c37 gre: Call gso_make_checksum
Call gso_make_checksum. This should have the benefit of using a
checksum that may have been previously computed for the packet.

This also adds NETIF_F_GSO_GRE_CSUM to differentiate devices that
offload GRE GSO with and without the GRE checksum offloaed.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-04 22:46:38 -07:00
Tom Herbert
af5fcba7f3 udp: Generic functions to set checksum
Added udp_set_csum and udp6_set_csum functions to set UDP checksums
in packets. These are for simple UDP packets such as those that might
be created in UDP tunnels.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-04 22:46:38 -07:00
Linus Torvalds
1aacb90eaa Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial into next
Pull trivial tree changes from Jiri Kosina:
 "Usual pile of patches from trivial tree that make the world go round"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
  staging: go7007: remove reference to CONFIG_KMOD
  aic7xxx: Remove obsolete preprocessor define
  of: dma: doc fixes
  doc: fix incorrect formula to calculate CommitLimit value
  doc: Note need of bc in the kernel build from 3.10 onwards
  mm: Fix printk typo in dmapool.c
  modpost: Fix comment typo "Modules.symvers"
  Kconfig.debug: Grammar s/addition/additional/
  wimax: Spelling s/than/that/, wording s/destinatary/recipient/
  aic7xxx: Spelling s/termnation/termination/
  arm64: mm: Remove superfluous "the" in comment
  of: Spelling s/anonymouns/anonymous/
  dma: imx-sdma: Spelling s/determnine/determine/
  ath10k: Improve grammar in comments
  ath6kl: Spelling s/determnine/determine/
  of: Improve grammar for of_alias_get_id() documentation
  drm/exynos: Spelling s/contro/control/
  radio-bcm2048.c: fix wrong overflow check
  doc: printk-formats: do not mention casts for u64/s64
  doc: spelling error changes
  ...
2014-06-04 08:50:34 -07:00
David S. Miller
c99f7abf0e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	include/net/inetpeer.h
	net/ipv6/output_core.c

Changes in net were fixing bugs in code removed in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-03 23:32:12 -07:00
Linus Torvalds
776edb5931 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next
Pull core locking updates from Ingo Molnar:
 "The main changes in this cycle were:

   - reduced/streamlined smp_mb__*() interface that allows more usecases
     and makes the existing ones less buggy, especially in rarer
     architectures

   - add rwsem implementation comments

   - bump up lockdep limits"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  rwsem: Add comments to explain the meaning of the rwsem's count field
  lockdep: Increase static allocations
  arch: Mass conversion of smp_mb__*()
  arch,doc: Convert smp_mb__*()
  arch,xtensa: Convert smp_mb__*()
  arch,x86: Convert smp_mb__*()
  arch,tile: Convert smp_mb__*()
  arch,sparc: Convert smp_mb__*()
  arch,sh: Convert smp_mb__*()
  arch,score: Convert smp_mb__*()
  arch,s390: Convert smp_mb__*()
  arch,powerpc: Convert smp_mb__*()
  arch,parisc: Convert smp_mb__*()
  arch,openrisc: Convert smp_mb__*()
  arch,mn10300: Convert smp_mb__*()
  arch,mips: Convert smp_mb__*()
  arch,metag: Convert smp_mb__*()
  arch,m68k: Convert smp_mb__*()
  arch,m32r: Convert smp_mb__*()
  arch,ia64: Convert smp_mb__*()
  ...
2014-06-03 12:57:53 -07:00
Eric Dumazet
39c36094d7 net: fix inet_getid() and ipv6_select_ident() bugs
I noticed we were sending wrong IPv4 ID in TCP flows when MTU discovery
is disabled.
Note how GSO/TSO packets do not have monotonically incrementing ID.

06:37:41.575531 IP (id 14227, proto: TCP (6), length: 4396)
06:37:41.575534 IP (id 14272, proto: TCP (6), length: 65212)
06:37:41.575544 IP (id 14312, proto: TCP (6), length: 57972)
06:37:41.575678 IP (id 14317, proto: TCP (6), length: 7292)
06:37:41.575683 IP (id 14361, proto: TCP (6), length: 63764)

It appears I introduced this bug in linux-3.1.

inet_getid() must return the old value of peer->ip_id_count,
not the new one.

Lets revert this part, and remove the prevention of
a null identification field in IPv6 Fragment Extension Header,
which is dubious and not even done properly.

Fixes: 87c48fa3b4 ("ipv6: make fragment identifications less predictable")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-02 14:09:28 -07:00
David S. Miller
31595de219 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
pull request: wireless-next 2014-06-02

Please pull this remaining batch of updates intended for the 3.16 stream...

For the mac80211 bits, Johannes says:

"The remainder for -next right now is mostly fixes, and a handful of
small new things like some CSA infrastructure, the regdb script mW/dBm
conversion change and sending wiphy notifications."

For the bluetooth bits, Gustavo says:

"Some more patches for 3.16. There is nothing really special here, just a
bunch of clean ups, fixes plus some small improvements. Please pull."

For the nfc bits, Samuel says:

"We have:

- Felica (Type3) tags support for trf7970a
- Type 4b tags support for port100
- st21nfca DTS typo fix
- A few sparse warning fixes"

For the atheros bits, Kalle says:

"Ben added support for setting antenna configurations. Michal improved
warm reset so that we would not need to fall back to cold reset that
often, an issue where ath10k stripped protected flag while in monitor
mode and made module initialisation asynchronous to fix the problems
with firmware loading when the driver is linked to the kernel.

Luca removed unused channel_switch_beacon callbacks both from ath9k and
ath10k. Marek fixed Protected Management Frames (PMF) when using Action
Frames. Also we had other small fixes everywhere in the driver."

Along with that, there are a handful of updates to a variety
of drivers.  This includes updates to at76c50x-usb, ath9k, b43,
brcmfmac, mwifiex, rsi, rtlwifi, and wil6210.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-02 11:17:35 -07:00
Eric Dumazet
73f156a6e8 inetpeer: get rid of ip_id_count
Ideally, we would need to generate IP ID using a per destination IP
generator.

linux kernels used inet_peer cache for this purpose, but this had a huge
cost on servers disabling MTU discovery.

1) each inet_peer struct consumes 192 bytes

2) inetpeer cache uses a binary tree of inet_peer structs,
   with a nominal size of ~66000 elements under load.

3) lookups in this tree are hitting a lot of cache lines, as tree depth
   is about 20.

4) If server deals with many tcp flows, we have a high probability of
   not finding the inet_peer, allocating a fresh one, inserting it in
   the tree with same initial ip_id_count, (cf secure_ip_id())

5) We garbage collect inet_peer aggressively.

IP ID generation do not have to be 'perfect'

Goal is trying to avoid duplicates in a short period of time,
so that reassembly units have a chance to complete reassembly of
fragments belonging to one message before receiving other fragments
with a recycled ID.

We simply use an array of generators, and a Jenkin hash using the dst IP
as a key.

ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it
belongs (it is only used from this file)

secure_ip_id() and secure_ipv6_id() no longer are needed.

Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
unnecessary decrement/increment of the number of segments.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-02 11:00:41 -07:00
John W. Linville
fcb2c0d6cf Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2014-06-02 11:20:17 -04:00
David S. Miller
90d0e08e57 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

This small patchset contains three accumulated Netfilter/IPVS updates,
they are:

1) Refactorize common NAT code by encapsulating it into a helper
   function, similarly to what we do in other conntrack extensions,
   from Florian Westphal.

2) A minor format string mismatch fix for IPVS, from Masanari Iida.

3) Add quota support to the netfilter accounting infrastructure, now
   you can add quotas to accounting objects via the nfnetlink interface
   and use them from iptables. You can also listen to quota
   notifications from userspace. This enhancement from Mathieu Poirier.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-30 17:54:47 -07:00
John W. Linville
a5eb1aeb25 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Conflicts:
	drivers/bluetooth/btusb.c
2014-05-29 13:03:47 -04:00
John W. Linville
737be10d8c Merge git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next 2014-05-29 12:55:38 -04:00
John W. Linville
9db7cb6901 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2014-05-27 13:51:31 -04:00
Luciano Coelho
1a5f0c13d1 mac80211: add a single-transaction driver op to switch contexts
In some cases, when the driver is already using all the channel
contexts it can handle at once, we have to do an in-place switch
(ie. we cannot afford using an extra context temporarily for the
transaction).  But some drivers may not support switching the channel
context assigned to a vif on the fly (ie. without unassigning and
assigning it) while others may only work if the context is changed on
the fly, without unassigning it first.

To allow these different scenarios, add a new driver operation that
let's the driver decide how to handle an in-place switch.

Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-26 11:04:41 +02:00
David S. Miller
54e5c4def0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/bonding/bond_alb.c
	drivers/net/ethernet/altera/altera_msgdma.c
	drivers/net/ethernet/altera/altera_sgdma.c
	net/ipv6/xfrm6_output.c

Several cases of overlapping changes.

The xfrm6_output.c has a bug fix which overlaps the renaming
of skb->local_df to skb->ignore_df.

In the Altera TSE driver cases, the register access cleanups
in net-next overlapped with bug fixes done in net.

Similarly a bug fix to send ALB packets in the bonding driver using
the right source address overlaps with cleanups in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-24 00:32:30 -04:00
Tom Herbert
28448b8045 net: Split sk_no_check into sk_no_check_{rx,tx}
Define separate fields in the sock structure for configuring disabling
checksums in both TX and RX-- sk_no_check_tx and sk_no_check_rx.
The SO_NO_CHECK socket option only affects sk_no_check_tx. Also,
removed UDP_CSUM_* defines since they are no longer necessary.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-23 16:28:53 -04:00
Tom Herbert
b26ba202e0 net: Eliminate no_check from protosw
It doesn't seem like an protocols are setting anything other
than the default, and allowing to arbitrarily disable checksums
for a whole protocol seems dangerous. This can be done on a per
socket basis.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-23 16:28:53 -04:00
Johan Hedberg
d7b2545023 Bluetooth: Clearly distinguish mgmt LTK type from authenticated property
On the mgmt level we have a key type parameter which currently accepts
two possible values: 0x00 for unauthenticated and 0x01 for
authenticated. However, in the internal struct smp_ltk representation we
have an explicit "authenticated" boolean value.

To make this distinction clear, add defines for the possible mgmt values
and do conversion to and from the internal authenticated value.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-05-23 11:24:04 -07:00
David S. Miller
65db611a5c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2014-05-22

This is the last ipsec pull request before I leave for
a three weeks vacation tomorrow. David, can you please
take urgent ipsec patches directly into net/net-next
during this time?

I'll continue to run the ipsec/ipsec-next trees as soon
as I'm back.

1) Simplify the xfrm audit handling, from Tetsuo Handa.

2) Codingstyle cleanup for xfrm_output, from abian Frederick.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-22 16:00:00 -04:00
Ezequiel Garcia
e876f208af net: Add a software TSO helper API
Although the implementation probably needs a lot of work, this initial API
allows to implement software TSO in mvneta and mv643xx_eth drivers in a not
so intrusive way.

Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-22 14:57:15 -04:00
John W. Linville
40a10fd740 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next 2014-05-22 13:58:36 -04:00
John W. Linville
99abe65ff1 NFC: 3.16: First pull request
This is the NFC pull request for 3.16. We have:
 
 - STMicroeectronics st21nfca support. The st21nfca is an HCI chipset and
   thus relies on the HCI stack. This submission provides support for tag
   redaer/writer mode (including Type 5) and device tree bindings.
 
 - PM runtime support and a bunch of bug fixes for TI's trf7970a.
 
 - Device tree support for NXP's pn544. Legacy platform data support is
   obviously kept intact.
 
 - NFC Tag type 4B support to the NFC Digital stack.
 
 - SOCK_RAW type support to the raw NFC socket, and allow NCI
   sniffing from that. This can be extended to report HCI frames and also
   proprietarry ones like e.g. the pn533 ones.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJTepRlAAoJEIqAPN1PVmxKnF0P/RvfrZs6CbGNJC+dkEbk90p1
 nsngy4+4MmPwJYVzObnLz4Br0k1kmFKiOKske6drjMpgzDWeuQelw3B7bd3FYfxD
 YkQsc5RC984xrDoDH5pn8mA6VJqmn7whrmcibTYAixrDqTvo8gw6uja4ryAnSdZm
 n7cRbh/A5F/sa7O4mPA0bCTdp4jAS/vOP9rGFDOth/b5yJVs99XmC+AZp/Ad9BUx
 +/osWGmBV5jshtX7aPTSxIQB4BUaP/lP1DW8yF5whKDjsHC9QyJcAtw9HfZ4tv2h
 YNteZZ8yjM+rSjnDw/LvDc2Gp8Z8P1GYf8D3QN3cWhw1ZvXi7CnqKjEnm41sbfaH
 L5esIfsRBUdmk6Ika7zALqmOQFI3PzH+ag96punl29qb2gyBDRSnXKVLirv3xxFG
 h7vYtQL43Rosn/4pSilRbYReRwyKbSCxW3un/tUJy0Faafs6q+9oMC2aWbIfTT6l
 40n4H9EmzYy2OaaXSFckiIIYYgVDAji8GLXTf+dPHb+NrH3QQOR3m27WzHc4rmYk
 kUrv0lKoFswA+VLlIcJTrSKNF21FDjwuImzIWiPz6Fx/+rWJ0b4GlQyIynD72LpR
 2LkUhTrxuRuRtxVCtvTdkPlL6Bdp3HO7t4qZ0EirgnpmGK6NScBgABoqFJSbz9uS
 UUvZbHVIjLrDU9zzoyz8
 =cSl+
 -----END PGP SIGNATURE-----

Merge tag 'nfc-next-3.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next

Samuel Ortiz <sameo@linux.intel.com> says:

"NFC: 3.16: First pull request

This is the NFC pull request for 3.16. We have:

- STMicroeectronics st21nfca support. The st21nfca is an HCI chipset and
  thus relies on the HCI stack. This submission provides support for tag
  redaer/writer mode (including Type 5) and device tree bindings.

- PM runtime support and a bunch of bug fixes for TI's trf7970a.

- Device tree support for NXP's pn544. Legacy platform data support is
  obviously kept intact.

- NFC Tag type 4B support to the NFC Digital stack.

- SOCK_RAW type support to the raw NFC socket, and allow NCI
  sniffing from that. This can be extended to report HCI frames and also
  proprietarry ones like e.g. the pn533 ones."

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-05-22 13:56:46 -04:00
David S. Miller
8af750d739 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nftables
Pablo Neira Ayuso says:

====================
Netfilter/nftables updates for net-next

The following patchset contains Netfilter/nftables updates for net-next,
most relevantly they are:

1) Add set element update notification via netlink, from Arturo Borrero.

2) Put all object updates in one single message batch that is sent to
   kernel-space. Before this patch only rules where included in the batch.
   This series also introduces the generic transaction infrastructure so
   updates to all objects (tables, chains, rules and sets) are applied in
   an all-or-nothing fashion, these series from me.

3) Defer release of objects via call_rcu to reduce the time required to
   commit changes. The assumption is that all objects are destroyed in
   reverse order to ensure that dependencies betweem them are fulfilled
   (ie. rules and sets are destroyed first, then chains, and finally
   tables).

4) Allow to match by bridge port name, from Tomasz Bursztyka. This series
   include two patches to prepare this new feature.

5) Implement the proper set selection based on the characteristics of the
   data. The new infrastructure also allows you to specify your preferences
   in terms of memory and computational complexity so the underlying set
   type is also selected according to your needs, from Patrick McHardy.

6) Several cleanup patches for nft expressions, including one minor possible
   compilation breakage due to missing mark support, also from Patrick.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-22 12:06:23 -04:00
Neal Cardwell
ca8a226343 tcp: make cwnd-limited checks measurement-based, and gentler
Experience with the recent e114a710aa ("tcp: fix cwnd limited
checking to improve congestion control") has shown that there are
common cases where that commit can cause cwnd to be much larger than
necessary. This leads to TSO autosizing cooking skbs that are too
large, among other things.

The main problems seemed to be:

(1) That commit attempted to predict the future behavior of the
connection by looking at the write queue (if TSO or TSQ limit
sending). That prediction sometimes overestimated future outstanding
packets.

(2) That commit always allowed cwnd to grow to twice the number of
outstanding packets (even in congestion avoidance, where this is not
needed).

This commit improves both of these, by:

(1) Switching to a measurement-based approach where we explicitly
track the largest number of packets in flight during the past window
("max_packets_out"), and remember whether we were cwnd-limited at the
moment we finished sending that flight.

(2) Only allowing cwnd to grow to twice the number of outstanding
packets ("max_packets_out") in slow start. In congestion avoidance
mode we now only allow cwnd to grow if it was fully utilized.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-22 12:04:49 -04:00
Emmanuel Grumbach
67af981153 cfg80211: allow RSSI compensation
Channels in 2.4GHz band overlap, this means that if we
send a probe request on channel 1 and then move to channel
2, we will hear the probe response on channel 2. In this
case, the RSSI will be lower than if we had heard it on
the channel on which it was sent (1 in this case).

The firmware / low level driver can parse the channel in
the DS IE or HT IE and compensate the RSSI so that it will
still have a valid value even if we heard the frame on an
adjacent channel. This can be done up to a certain offset.

Add this offset as a configuration for the low level driver.
A low level driver that can compensate the low RSSI in this
case should assign the maximal offset for which the RSSI
value is still valid.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-22 09:58:49 +02:00
Antonio Quartulli
7406353d43 cfg80211: implement cfg80211_get_station cfg80211 API
Implement and export the new cfg80211_get_station() API.
This utility can be used by other kernel modules to obtain
detailed information about a given wireless station.

It will be in particular useful to batman-adv which will
implement a wireless rate based metric.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-21 09:15:17 +02:00
Antonio Quartulli
cca674d47e mac80211: export the expected throughput
Add get_expected_throughput() API to mac80211 so that each
driver can implement its own version based on the RC
algorithm they are using (might be using an HW RC algo).
The API returns a value expressed in Kbps.

Also, add the new get_expected_throughput() member
to the rate_control_ops structure in order to be
able to query the RC algorithm (this patch provides an
implementation of this API for both minstrel and
minstrel_ht).

The related member in the station_info object is now
filled accordingly when dumping a station.

Cc: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-21 09:15:16 +02:00
Antonio Quartulli
867d849fc8 cfg80211: export expected throughput through get_station()
Users may need information about the expected throughput
towards a given peer.
This value is supposed to consider the size overhead
generated by the 802.11 header.

This value is exported in kbps through the get_station() API
by including it into the station_info object.
Moreover, it is sent to user space when replying to the
nl80211 GET_STATION command.

This information will be useful to the batman-adv module
which will use it for its new metric computation.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-20 15:13:32 +02:00
Hiren Tandel
57be1f3f3e NFC: Add RAW socket type support for SOCKPROTO_RAW
This allows for a more generic NFC sniffing by using SOCKPROTO_RAW
SOCK_RAW to read RAW NFC frames. This is for sniffing anything but LLCP
(HCI, NCI, etc...).

Signed-off-by: Hiren Tandel <hirent@marvell.com>
Signed-off-by: Rahul Tank <rahult@marvell.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-05-20 00:06:04 +02:00
Johannes Berg
922bd80fc3 cfg80211: constify wowlan/coalesce mask/pattern pointers
This requires changing the nl80211 parsing code a bit to use
intermediate pointers for the allocation, but clarifies the
API towards the drivers.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-19 18:06:50 +02:00
Johannes Berg
c1e5f4714d cfg80211: constify more pointers in the cfg80211 API
This also propagates through the drivers.

The orinoco driver uses the cfg80211 API structs for internal
bookkeeping, and so needs a (void *) cast that removes the
const - but that's OK because it allocates those pointers.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-19 17:53:16 +02:00
Johannes Berg
3b3a0162fa cfg80211: constify MAC addresses in cfg80211 ops
This propagates through all the drivers and mac80211.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-19 17:34:42 +02:00
Luciano Coelho
8d77ec8562 mac80211: fix csa_counter_offs argument name in docbook
The csa_counter_offs was erroneously described as csa_offs in
the docbook section.

This fixes two warnings when making htmldocs (at least):

Warning(include/net/mac80211.h:3428): No description found for parameter 'csa_counter_offs[IEEE80211_MAX_CSA_COUNTERS_NUM]'
Warning(include/net/mac80211.h:3428): Excess struct/union/enum/typedef member 'csa_offs' description in 'ieee80211_mutable_offsets'

Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-19 14:39:34 +02:00
Luciano Coelho
c2e4323b33 cfg80211: add documentation for max_num_csa_counters
Move the comment in the structure to a description of the
max_num_csa_counters field in the docbook area.

This fixes a warning when building htmldocs (at least):

 Warning(include/net/cfg80211.h:3064): No description found for parameter 'max_num_csa_counters'

Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-19 14:38:53 +02:00
Pablo Neira Ayuso
c7c32e72cb netfilter: nf_tables: defer all object release via rcu
Now that all objects are released in the reverse order via the
transaction infrastructure, we can enqueue the release via
call_rcu to save one synchronize_rcu. For small rule-sets loaded
via nft -f, it now takes around 50ms less here.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:13 +02:00
Pablo Neira Ayuso
128ad3322b netfilter: nf_tables: remove skb and nlh from context structure
Instead of caching the original skbuff that contains the netlink
messages, this stores the netlink message sequence number, the
netlink portID and the report flag. This helps to prepare the
introduction of the object release via call_rcu.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:13 +02:00
Pablo Neira Ayuso
60319eb1ca netfilter: nf_tables: use new transaction infrastructure to handle elements
Leave the set content in consistent state if we fail to load the
batch. Use the new generic transaction infrastructure to achieve
this.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:12 +02:00
Pablo Neira Ayuso
55dd6f9307 netfilter: nf_tables: use new transaction infrastructure to handle table
This patch speeds up rule-set updates and it also provides a way
to revert updates and leave things in consistent state in case that
the batch needs to be aborted.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:12 +02:00
Pablo Neira Ayuso
91c7b38dc9 netfilter: nf_tables: use new transaction infrastructure to handle chain
This patch speeds up rule-set updates and it also introduces a way to
revert chain updates if the batch is aborted. The idea is to store the
changes in the transaction to apply that in the commit step.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:11 +02:00
Pablo Neira Ayuso
958bee14d0 netfilter: nf_tables: use new transaction infrastructure to handle sets
This patch reworks the nf_tables API so set updates are included in
the same batch that contains rule updates. This speeds up rule-set
updates since we skip a dialog of four messages between kernel and
user-space (two on each direction), from:

 1) create the set and send netlink message to the kernel
 2) process the response from the kernel that contains the allocated name.
 3) add the set elements and send netlink message to the kernel.
 4) process the response from the kernel (to check for errors).

To:

 1) add the set to the batch.
 2) add the set elements to the batch.
 3) add the rule that points to the set.
 4) send batch to the kernel.

This also introduces an internal set ID (NFTA_SET_ID) that is unique
in the batch so set elements and rules can refer to new sets.

Backward compatibility has been only retained in userspace, this
means that new nft versions can talk to the kernel both in the new
and the old fashion.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso
b380e5c733 netfilter: nf_tables: add message type to transactions
The patch adds message type to the transaction to simplify the
commit the and abort routines. Yet another step forward in the
generalisation of the transaction infrastructure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso
1081d11b08 netfilter: nf_tables: generalise transaction infrastructure
This patch generalises the existing rule transaction infrastructure
so it can be used to handle set, table and chain object transactions
as well. The transaction provides a data area that stores private
information depending on the transaction type.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso
7c95f6d866 netfilter: nf_tables: deconstify table and chain in context structure
The new transaction infrastructure updates the family, table and chain
objects in the context structure, so let's deconstify them. While at it,
move the context structure initialization routine to the top of the
source file as it will be also used from the table and chain routines.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:09 +02:00
Phoebe Buckheister
f0f77dc6be ieee802154, mac802154: implement devkey record option
The 802.15.4-2011 standard states that for each key, a list of devices
that use this key shall be kept. Previous patches have only considered
two options:

 * a device "uses" (or may use) all keys, rendering the list useless
 * a device is restricted to a certain set of keys

Another option would be that a device *may* use all keys, but need not
do so, and we are interested in the actual set of keys the device uses.
Recording keys used by any given device may have a noticable performance
impact and might not be needed as often. The common case, in which a
device will not switch keys too often, should still perform well.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:23:42 -04:00
Phoebe Buckheister
29e023746a mac802154: add llsec configuration functions
Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:23:41 -04:00
Phoebe Buckheister
af9eed5bbf ieee802154: add dgram sockopts for security control
Allow datagram sockets to override the security settings of the device
they send from on a per-socket basis. Requires CAP_NET_ADMIN or
CAP_NET_RAW, since raw sockets can send arbitrary packets anyway.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:23:41 -04:00
Phoebe Buckheister
f30be4d53c mac802154: integrate llsec with wpan devices
Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:23:41 -04:00
Phoebe Buckheister
dc20759f28 ieee802154: add types for link-layer security
The added structures match 802.15.4-2011 link-layer security PIBs as
closely as is reasonable. Some lists required by the standard were
modeled as bitmaps (frame_types and command_frame_ids in *llsec_key,
802.15.4-2011 7.5/Table 61), since using lists for those seems a bit
excessive and not particularly useful. The DeviceDescriptorHandleList
was inverted and is here a per-device list, since operations on this
list are likely to have both a key and a device at hand, and per-device
lists of keys are shorter than per-key lists of devices.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 17:23:40 -04:00
Andrzej Kaczmarek
d0455ed996 Bluetooth: Store max TX power level for connection
This patch adds support to store local maximum TX power level for
connection when reply for HCI_Read_Transmit_Power_Level is received.

Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-05-15 21:48:07 -07:00
Andrzej Kaczmarek
dd9838087b Bluetooth: Add support to get connection information
This patch adds support for Get Connection Information mgmt command
which can be used to query for information about connection, i.e. RSSI
and local TX power level.

In general values cached in hci_conn are returned as long as they are
considered valid, i.e. do not exceed age limit set in hdev. This limit
is calculated as random value between min/max values to avoid client
trying to guess when to poll for updated information.

Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-05-15 21:48:06 -07:00
Andrzej Kaczmarek
31ad169148 Bluetooth: Add conn info lifetime parameters to debugfs
This patch adds conn_info_min_age and conn_info_max_age parameters to
debugfs which determine lifetime of connection information. Actual
lifetime will be random value between min and max age.

Default values for min and max age are 1000ms and 3000ms respectively.

Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-05-15 21:48:05 -07:00
Duan Jiong
be7a010d6f ipv6: update Destination Cache entries when gateway turn into host
RFC 4861 states in 7.2.5:

	The IsRouter flag in the cache entry MUST be set based on the
         Router flag in the received advertisement.  In those cases
         where the IsRouter flag changes from TRUE to FALSE as a result
         of this update, the node MUST remove that router from the
         Default Router List and update the Destination Cache entries
         for all destinations using that neighbor as a router as
         specified in Section 7.3.3.  This is needed to detect when a
         node that is used as a router stops forwarding packets due to
         being configured as a host.

Currently, when dealing with NA Message which IsRouter flag changes from
TRUE to FALSE, the kernel only removes router from the Default Router List,
and don't update the Destination Cache entries.

Now in order to update those Destination Cache entries, i introduce
function rt6_clean_tohost().

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-15 23:26:27 -04:00
Phoebe Buckheister
32edc40ae6 ieee802154: change _cb handling slightly
The current mac_cb handling of ieee802154 is rather awkward and limited.
Decompose the single flags field into multiple fields with the meanings
of each subfield of the flags field to make future extensions (for
example, link-layer security) easier. Also don't set the frame sequence
number in upper layers, since that's a thing the MAC is supposed to set
on frame transmit - we set it on header creation, but assuming that
upper layers do not blindly duplicate our headers, this is fine.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-15 15:51:42 -04:00
Phoebe Buckheister
c3a6114f31 ieee802154: add definitions for link-layer security and header functions
When dealing with 802.15.4, one often has to know the maximum payload
size for a given packet. This depends on many factors, one of which is
whether or not a security header is present in the frame. These
definitions and functions provide an easy way for any upper layer to
calculate the maximum payload size for a packet. The first obvious user
for this is 6lowpan, which duplicates this calculation and gets it
partially wrong because it ignores security headers.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-15 15:51:42 -04:00
David S. Miller
fcd77db07d net: Fix CONFIG_SYSCTL ifdef test.
> include/net/ip.h:211:5: warning: "CONFIG_SYSCTL" is not defined [-Wundef]
>  #if CONFIG_SYSCTL
>      ^

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-15 13:43:14 -04:00
Andrei Otcheretianski
1af586c911 mac80211: Handle the CSA counters correctly
Make the beacon CSA counters part of ieee80211_mutable_offsets and don't
decrement CSA counters when generating a beacon template. This permits the
driver to offload the CSA counters handling. Since mac80211 updates the probe
responses with the correct counter, the driver should sync the counter's value
with mac80211 using ieee80211_csa_update_counter function.

Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-15 15:01:00 +02:00
Andrei Otcheretianski
6ec8c332a0 mac80211: Provide ieee80211_beacon_get_template API
Add a new API ieee80211_beacon_get_template, which doesn't
affect DTIM counter and should be used if the device generates beacon
frames, and new beacon template is needed. In addition set the offsets
to TIM IE for MESH interface.

Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-15 15:01:00 +02:00
Andrei Otcheretianski
9a774c78e2 cfg80211: Support multiple CSA counters
Change the type of NL80211_ATTR_CSA_C_OFF_BEACON and
NL80211_ATTR_CSA_C_OFF_PRESP to be NLA_BINARY which allows
userspace to use beacons and probe responses with
multiple CSA counters.
This isn't breaking the API since userspace can
continue to use nla_put_u16 for this attributes, which
is equivalent to a single element u16 array.
In addition advertise max number of supported CSA counters.
This is needed when using CSA and eCSA IEs together.

Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-15 15:00:42 +02:00
Andrei Otcheretianski
34d22ce22b cfg80211: Add API to update CSA counters in mgmt frames
Add NL80211_ATTR_CSA_C_OFFSETS_TX which holds an array
of offsets to the CSA counters which should be updated
when sending a management frames with NL80211_CMD_FRAME.

This API should be used by the drivers that wish to keep the
CSA counter updated in probe responses, but do not implement
probe response offloading and so, do not use
ieee80211_proberesp_get function.

Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-15 14:52:44 +02:00
Joe Perches
c722831744 net: Use a more standard macro for INET_ADDR_COOKIE
Missing a colon on definition use is a bit odd so
change the macro for the 32 bit case to declare an
__attribute__((unused)) and __deprecated variable.

The __deprecated attribute will cause gcc to emit
an error if the variable is actually used.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-14 16:07:23 -04:00
WANG Cong
122ff243f5 ipv4: make ip_local_reserved_ports per netns
ip_local_port_range is already per netns, so should ip_local_reserved_ports
be. And since it is none by default we don't actually need it when we don't
enable CONFIG_SYSCTL.

By the way, rename inet_is_reserved_local_port() to inet_is_local_reserved_port()

Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-14 15:31:45 -04:00
Lorenzo Colitti
84f39b08d7 net: support marking accepting TCP sockets
When using mark-based routing, sockets returned from accept()
may need to be marked differently depending on the incoming
connection request.

This is the case, for example, if different socket marks identify
different networks: a listening socket may want to accept
connections from all networks, but each connection should be
marked with the network that the request came in on, so that
subsequent packets are sent on the correct network.

This patch adds a sysctl to mark TCP sockets based on the fwmark
of the incoming SYN packet. If enabled, and an unmarked socket
receives a SYN, then the SYN packet's fwmark is written to the
connection's inet_request_sock, and later written back to the
accepted socket when the connection is established.  If the
socket already has a nonzero mark, then the behaviour is the same
as it is today, i.e., the listening socket's fwmark is used.

Black-box tested using user-mode linux:

- IPv4/IPv6 SYN+ACK, FIN, etc. packets are routed based on the
  mark of the incoming SYN packet.
- The socket returned by accept() is marked with the mark of the
  incoming SYN packet.
- Tested with syncookies=1 and syncookies=2.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-13 18:35:09 -04:00
Lorenzo Colitti
e110861f86 net: add a sysctl to reflect the fwmark on replies
Kernel-originated IP packets that have no user socket associated
with them (e.g., ICMP errors and echo replies, TCP RSTs, etc.)
are emitted with a mark of zero. Add a sysctl to make them have
the same mark as the packet they are replying to.

This allows an administrator that wishes to do so to use
mark-based routing, firewalling, etc. for these replies by
marking the original packets inbound.

Tested using user-mode linux:
 - ICMP/ICMPv6 echo replies and errors.
 - TCP RST packets (IPv4 and IPv6).

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-13 18:35:08 -04:00
Yuchung Cheng
843f4a55e3 tcp: use tcp_v4_send_synack on first SYN-ACK
To avoid large code duplication in IPv6, we need to first simplify
the complicate SYN-ACK sending code in tcp_v4_conn_request().

To use tcp_v4(6)_send_synack() to send all SYN-ACKs, we need to
initialize the mini socket's receive window before trying to
create the child socket and/or building the SYN-ACK packet. So we move
that initialization from tcp_make_synack() to tcp_v4_conn_request()
as a new function tcp_openreq_init_req_rwin().

After this refactoring the SYN-ACK sending code is simpler and easier
to implement Fast Open for IPv6.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Daniel Lee <longinus00@gmail.com>
Signed-off-by: Jerry Chu <hkchu@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-13 17:53:02 -04:00
Yuchung Cheng
89278c9dc9 tcp: simplify fast open cookie processing
Consolidate various cookie checking and generation code to simplify
the fast open processing. The main goal is to reduce code duplication
in tcp_v4_conn_request() for IPv6 support.

Removes two experimental sysctl flags TFO_SERVER_ALWAYS and
TFO_SERVER_COOKIE_NOT_CHKD used primarily for developmental debugging
purposes.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Daniel Lee <longinus00@gmail.com>
Signed-off-by: Jerry Chu <hkchu@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-13 17:53:02 -04:00
Yuchung Cheng
5b7ed0892f tcp: move fastopen functions to tcp_fastopen.c
Move common TFO functions that will be used by both v4 and v6
to tcp_fastopen.c. Create a helper tcp_fastopen_queue_check().

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Daniel Lee <longinus00@gmail.com>
Signed-off-by: Jerry Chu <hkchu@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-13 17:53:02 -04:00
John W. Linville
3231d65ffe Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless 2014-05-13 15:27:44 -04:00
Felix Fietkau
8c48b50a1a cfg80211: allow restricting supported dfs regions
At the moment, the ath9k/ath10k DFS module only supports detecting ETSI
radar patterns.
Add a bitmap in the interface combinations, indicating which DFS regions
are supported by the detector. If unset, support for all regions is
assumed.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-13 15:50:06 +02:00
WANG Cong
60ff746739 net: rename local_df to ignore_df
As suggested by several people, rename local_df to ignore_df,
since it means "ignore df bit if it is set".

Cc: Maciej Żenczykowski <maze@google.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-12 14:03:41 -04:00
David S. Miller
5f013c9bc7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/altera/altera_sgdma.c
	net/netlink/af_netlink.c
	net/sched/cls_api.c
	net/sched/sch_api.c

The netlink conflict dealt with moving to netlink_capable() and
netlink_ns_capable() in the 'net' tree vs. supporting 'tc' operations
in non-init namespaces.  These were simple transformations from
netlink_capable to netlink_ns_capable.

The Altera driver conflict was simply code removal overlapping some
void pointer cast cleanups in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-12 13:19:14 -04:00
Andrzej Kaczmarek
5a134faeef Bluetooth: Store TX power level for connection
This patch adds support to store local TX power level for connection
when reply for HCI_Read_Transmit_Power_Level is received.

Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-05-09 14:16:42 -07:00
David S. Miller
1448eb5669 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:

====================
pull request: wireless 2014-05-08

This one is all from Johannes:

"Here are a few small fixes for the current cycle: radiotap TX flags were
wrong (fix by Bob), Chun-Yeow fixes an SMPS issue with mesh interfaces,
Eliad fixes a locking bug and a cfg80211 state problem and finally
Henning sent me a fix for IBSS rate information."

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-09 16:46:53 -04:00
Johannes Berg
f6837ba8c9 mac80211: handle failed restart/resume better
When the driver fails during HW restart or resume, the whole
stack goes into a very confused state with interfaces being
up while the hardware is down etc.

Address this by shutting down everything; we'll run into a
lot of warnings in the process but that's better than having
the whole stack get messed up.

Reviewed-by: Arik Nemtsov <arik@wizery.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-09 12:21:34 +02:00
Cong Wang
ba6b918ab2 ping: move ping_group_range out of CONFIG_SYSCTL
Similarly, when CONFIG_SYSCTL is not set, ping_group_range should still
work, just that no one can change it. Therefore we should move it out of
sysctl_net_ipv4.c. And, it should not share the same seqlock with
ip_local_port_range.

BTW, rename it to ->ping_group_range instead.

Cc: David S. Miller <davem@davemloft.net>
Cc: Francois Romieu <romieu@fr.zoreil.com>
Reported-by: Stefan de Konink <stefan@konink.de>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-08 22:50:47 -04:00
Cong Wang
c9d8f1a642 ipv4: move local_port_range out of CONFIG_SYSCTL
When CONFIG_SYSCTL is not set, ip_local_port_range should still work,
just that no one can change it. Therefore we should move it out of sysctl_inet.c.
Also, rename it to ->ip_local_ports instead.

Cc: David S. Miller <davem@davemloft.net>
Cc: Francois Romieu <romieu@fr.zoreil.com>
Reported-by: Stefan de Konink <stefan@konink.de>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-08 22:50:47 -04:00
John W. Linville
6153871f77 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2014-05-08 11:13:41 -04:00
Andrzej Kaczmarek
5ae76a9415 Bluetooth: Store RSSI for connection
This patch adds support to store RSSI for connection when reply for
HCI_Read_RSSI is received.

Signed-off-by: Andrzej Kaczmarek <andrzej.kaczmarek@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-05-08 08:01:57 -07:00
Luciano Coelho
c3d620362d cfg80211: fix docbook warning
When trying to generate documentation, at least xmldocs, we get the
following warning:

Warning(include/net/cfg80211.h:461): No description found for parameter 'nl80211_iftype'

Fix it by adding the iftype argument name to the
cfg80211_chandef_dfs_required() function declaration.

Reported-and-tested-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-08 11:59:35 +02:00
WANG Cong
698365fa18 net: clean up snmp stats code
commit 8f0ea0fe3a (snmp: reduce percpu needs by 50%)
reduced snmp array size to 1, so technically it doesn't have to be
an array any more. What's more, after the following commit:

	commit 933393f58f
	Date:   Thu Dec 22 11:58:51 2011 -0600

	    percpu: Remove irqsafe_cpu_xxx variants

	    We simply say that regular this_cpu use must be safe regardless of
	    preemption and interrupt state.  That has no material change for x86
	    and s390 implementations of this_cpu operations.  However, arches that
	    do not provide their own implementation for this_cpu operations will
	    now get code generated that disables interrupts instead of preemption.

probably no arch wants to have SNMP_ARRAY_SZ == 2. At least after
almost 3 years, no one complains.

So, just convert the array to a single pointer and remove snmp_mib_init()
and snmp_mib_free() as well.

Cc: Christoph Lameter <cl@linux.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-07 16:06:05 -04:00
Neal Cardwell
d28071d102 tunnel: fix RFC number in comment for INET_ECN_decapsulate()
The quoted text and figure are from RFC 6040 ("Tunnelling of Explicit
Congestion Notification").

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-07 15:30:52 -04:00
John W. Linville
cabae81103 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 2014-05-06 14:05:51 -04:00
Michal Kazior
f04c22033c cfg80211: export interface stopping function
This exports a new cfg80211_stop_iface() function.

This is intended for driver internal interface
combination management and channel switching.

Due to locking issues (it re-enters driver) the
call is asynchronous and uses cfg80211 event
list/worker.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-06 15:16:34 +02:00
Michal Kazior
59af6928d2 mac80211: fix CSA tx queue stopping
It was possible for tx queues to be stuck stopped
if AP CSA finalization failed. In that case
neither stop_ap nor do_stop woke the queues up.
This means it was impossible to perform tx at all
until driver was reloaded or a successful CSA was
performed later.

It was possible to solve this in a simpler manner
however this is more robust and future proof
(having multi-vif CSA in mind).

New sdata->csa_block_tx is introduced to keep
track of which interfaces requested tx to be
blocked for CSA. This is required because mac80211
stops all tx queues for that purpose. This means
queues must be awoken only when last tx-blocking
CSA interface is finished.

It is still possible to have tx queues stopped
after CSA failure but as soon as offending
interfaces are stopped from userspace (stop_ap or
ifdown) tx queues are woken up properly.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-06 15:10:00 +02:00
Libor Pechacek
86aae6c7b5 Bluetooth: Convert RFCOMM spinlocks into mutexes
Enabling CONFIG_DEBUG_ATOMIC_SLEEP has shown that some rfcomm functions
acquiring spinlocks call sleeping locks further in the chain.  Converting
the offending spinlocks into mutexes makes sleeping safe.

Signed-off-by: Libor Pechacek <lpechacek@suse.cz>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-05-05 19:25:06 -07:00
Tom Herbert
e4f45b7f40 net: Call skb_checksum_init in IPv6
Call skb_checksum_init instead of private functions.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-05 15:26:30 -04:00
Tom Herbert
ed70fcfcee net: Call skb_checksum_init in IPv4
Call skb_checksum_init instead of private functions.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-05 15:26:30 -04:00
Tom Herbert
07064c6e02 net: Allow csum_add to be provided in arch
csum_add is really nothing more then add-with-carry which
can be implemented efficiently in some architectures.
Allow architecture to define this protected by HAVE_ARCH_CSUM_ADD.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-05 15:26:29 -04:00
David S. Miller
2ad0649687 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
pull request: wireless-next 2014-05-02

Please pull this batch of updates intended for the 3.16 stream...

For the mac80211 bits, Johannes says:

"In this round we have a large number of small features and
improvements from people too numerous to list here. The only really
bit thing is Michał and Luca's CSA work (including changing how
interface combination verification is done)."

For the Bluetooth bits, Gustavo says:

"Here goes some patches for the -next release. There is nothing
really special for this pull request, just a bunch of refactors,
fixes and clean ups."

For the ath10k/ath6kl bits, Kalle says:

"For ath6kl Kalle fixed a bunch of checkpatch warnings.

In ath10k we had more changes, major ones being:

* fix memory allocation failures after a firmware crash (Michal)

* some rework of DFS configuration to enable it correctly in all cases
  (Michal)

* add a new firmware crash option to make it possible to crash 10.1
  firmware for testing purposes (Marek P)

* fix RTS/CTS protection in certain cases (Marek K)

* fix wrong RSSI and rate reporting in some cases (Janusz)

* fix firmware stats reporting (Chun, Ben & Bartosz)"

For the iwlwifi bits, Emmanuel says:

"I have here a bunch of unrelated things. I disabled support for
-7.ucode which means that I can removed a lot of code. Eliad has
a brand new feature: we reduce the Tx power when the link allows -
this reduces our power consumption. The regular changes in power and
scan area. One interesting thing though is the patches from Johannes,
we have now GRO which allows to increase our throughput in TCP Rx. The
main advantage is that it reduces the number of TCP Acks - these TCP
Acks are completely useless when we are using A-MPDU since the first
packet of the A-MPDU generates a TCP Ack which is made obsolete by
the next packets."

Along with that, there are a variety of updates to b43, mwifiex,
rtl8180 and wil6210 drivers and a handful of other updates here
and there.

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-05 13:36:26 -04:00
Andy King
2c4a336e0a vsock: Make transport the proto owner
Right now the core vsock module is the owner of the proto family. This
means there's nothing preventing the transport module from unloading if
there are open sockets, which results in a panic. Fix that by allowing
the transport to be the owner, which will refcount it properly.

Includes version bump to 1.0.1.0-k

Passes checkpatch this time, I swear...

Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Andy King <acking@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-05 13:13:50 -04:00
Arik Nemtsov
0c4972ccaa mac80211: set an external flag for TDLS stations
Expose a new tdls flag for the public ieee80211_sta struct.
This can be used in some rate control decisions.

Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-05 15:56:02 +02:00
Geert Uytterhoeven
325483a9bf wimax: Spelling s/than/that/, wording s/destinatary/recipient/
Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Cc: wimax@linuxwimax.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-05-05 15:37:55 +02:00
Eliad Peller
792e6aa7a1 cfg80211: add cfg80211_sched_scan_stopped_rtnl
Add locked-version for cfg80211_sched_scan_stopped.
This is used for some users that might want to
call it when rtnl is already locked.

Fixes: d43c6b6 ("mac80211: reschedule sched scan after HW restart")
Cc: stable@vger.kernel.org (3.14+)
Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-05-05 15:14:57 +02:00
Eric Dumazet
249015515f tcp: remove in_flight parameter from cong_avoid() methods
Commit e114a710aa ("tcp: fix cwnd limited checking to improve
congestion control") obsoleted in_flight parameter from
tcp_is_cwnd_limited() and its callers.

This patch does the removal as promised.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-03 19:23:07 -04:00
Eric Dumazet
e114a710aa tcp: fix cwnd limited checking to improve congestion control
Yuchung discovered tcp_is_cwnd_limited() was returning false in
slow start phase even if the application filled the socket write queue.

All congestion modules take into account tcp_is_cwnd_limited()
before increasing cwnd, so this behavior limits slow start from
probing the bandwidth at full speed.

The problem is that even if write queue is full (aka we are _not_
application limited), cwnd can be under utilized if TSO should auto
defer or TCP Small queues decided to hold packets.

So the in_flight can be kept to smaller value, and we can get to the
point tcp_is_cwnd_limited() returns false.

With TCP Small Queues and FQ/pacing, this issue is more visible.

We fix this by having tcp_cwnd_validate(), which is supposed to track
such things, take into account unsent_segs, the number of segs that we
are not sending at the moment due to TSO or TSQ, but intend to send
real soon. Then when we are cwnd-limited, remember this fact while we
are processing the window of ACKs that comes back.

For example, suppose we have a brand new connection with cwnd=10; we
are in slow start, and we send a flight of 9 packets. By the time we
have received ACKs for all 9 packets we want our cwnd to be 18.
We implement this by setting tp->lsnd_pending to 9, and
considering ourselves to be cwnd-limited while cwnd is less than
twice tp->lsnd_pending (2*9 -> 18).

This makes tcp_is_cwnd_limited() more understandable, by removing
the GSO/TSO kludge, that tried to work around the issue.

Note the in_flight parameter can be removed in a followup cleanup
patch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-02 17:54:35 -04:00
John W. Linville
406a94d7fa Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2014-05-02 13:47:50 -04:00
Lorenzo Colitti
5c98631cca net: ipv6: Introduce ip6_sk_dst_hoplimit.
This replaces 6 identical code snippets with a call to a new
static inline function.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-30 13:31:26 -04:00
Florian Fainelli
7fa857ed04 net: dsa: add ds_to_priv
DSA drivers have a trick which consists in allocating "priv_size" more
bytes to account for the DSA driver private context. Add a helper
function to access that private context instead of open-coding it in
drivers with (void *)(ds + 1).

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-30 13:31:25 -04:00
John W. Linville
f6595444c1 Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Conflicts:
	net/mac80211/chan.c
2014-04-30 12:04:27 -04:00
John W. Linville
0006433a5b Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2014-04-30 11:56:43 -04:00
Florian Westphal
f768e5bdef netfilter: add helper for adding nat extension
Reduce copy-past a bit by adding a common helper.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-29 20:56:22 +02:00
Jouni Malinen
e16821bcfb cfg80211: Dynamic channel bandwidth changes in AP mode
This extends NL80211_CMD_SET_CHANNEL to allow dynamic channel bandwidth
changes in AP mode (including P2P GO) during a lifetime of the BSS. This
can be used to implement, e.g., HT 20/40 MHz co-existence rules on the
2.4 GHz band.

Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-04-28 18:09:59 +02:00
Cong Wang
2f7ef2f879 sched, cls: check if we could overwrite actions when changing a filter
When actions are attached to a filter, they are a part of the filter
itself, so when changing a filter we should allow to overwrite the actions
inside as well.

In my specific case, when I tried to _append_ a new action to an existing
filter which already has an action, I got EEXIST since kernel refused
to overwrite the existing one in kernel.

This patch checks if we are changing the filter checking NLM_F_CREATE flag
(Sigh, filters don't use NLM_F_REPLACE...) and then passes the boolean down
to actions. This fixes the problem above.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-27 23:42:39 -04:00
Rostislav Lisovy
ea077c1cea cfg80211: Add attributes describing prohibited channel bandwidth
Since there are frequency bands (e.g. 5.9GHz) allowing channels
with only 10 or 5 MHz bandwidth, this patch adds attributes that
allow keeping track about this information.

When channel attributes are reported to user-space, make sure to
not break old tools, i.e. if the 'split wiphy dump' is enabled,
report the extra attributes (if present) describing the bandwidth
restrictions.  If the 'split wiphy dump' is not enabled,
completely omit those channels that have flags set to either
IEEE80211_CHAN_NO_10MHZ or IEEE80211_CHAN_NO_20MHZ.

Add the check for new bandwidth restriction flags in
cfg80211_chandef_usable() to comply with the restrictions.

Signed-off-by: Rostislav Lisovy <rostislav.lisovy@fel.cvut.cz>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-04-25 17:38:23 +02:00
Marek Kwaczynski
17d38fa8c2 mac80211: add option to generate CCMP IVs only for mgmt frames
Some chips can encrypt managment frames in HW, but
require generated IV in the frame. Add a key flag
that allows us to achieve this.

Signed-off-by: Marek Kwaczynski <marek.kwaczynski@tieto.com>
[use BIT(0) to fill that spot, fix indentation]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-04-25 17:26:15 +02:00
Michal Kazior
65a124dd71 cfg80211: allow drivers to iterate over matching combinations
The patch splits cfg80211_check_combinations()
into an iterator function and a simple iteration
user.

This makes it possible for drivers to asses how
many channels can use given iftype setup. This in
turn can be used for future
multi-interface/multi-channel channel switching.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-04-25 17:08:14 +02:00
Nicolas Dichtel
f01ec1c017 vxlan: add x-netns support
This patch allows to switch the netns when packet is encapsulated or
decapsulated.
The vxlan socket is openned into the i/o netns, ie into the netns where
encapsulated packets are received. The socket lookup is done into this netns to
find the corresponding vxlan tunnel. After decapsulation, the packet is
injecting into the corresponding interface which may stand to another netns.

When one of the two netns is removed, the tunnel is destroyed.

Configuration example:
ip netns add netns1
ip netns exec netns1 ip link set lo up
ip link add vxlan10 type vxlan id 10 group 239.0.0.10 dev eth0 dstport 0
ip link set vxlan10 netns netns1
ip netns exec netns1 ip addr add 192.168.0.249/24 broadcast 192.168.0.255 dev vxlan10
ip netns exec netns1 ip link set vxlan10 up

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24 16:18:26 -04:00
Eric W. Biederman
a3b299da86 net: Add variants of capable for use on on sockets
sk_net_capable - The common case, operations that are safe in a network namespace.
sk_capable - Operations that are not known to be safe in a network namespace
sk_ns_capable - The general case for special cases.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24 13:44:53 -04:00
Luis R. Rodriguez
7e65eac8e3 6lowpan: nuke net_ieee802154_lowpan() accessor when 6lowpan is disabled
Johannes noted this is not needed, all of the fragment
accessors don't need CONFIG_NET_NS. This goes test compiled with
CONFIG_BT_6LOWPAN=y and a disabled CONFIG_NET_NS.

CC: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: linux-zigbee-devel@lists.sourceforge.net
Cc: David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24 12:36:00 -04:00
Tomasz Bursztyka
aa45660c6b netfilter: nf_tables: Make meta expression core functions public
This will be useful to create network family dedicated META expression
as for NFPROTO_BRIDGE for instance.

Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-23 13:55:30 +02:00
Tetsuo Handa
2e71029e2c xfrm: Remove useless xfrm_audit struct.
Commit f1370cc4 "xfrm: Remove useless secid field from xfrm_audit." changed
"struct xfrm_audit" to have either
{ audit_get_loginuid(current) / audit_get_sessionid(current) } or
{ INVALID_UID / -1 } pair.

This means that we can represent "struct xfrm_audit" as "bool".
This patch replaces "struct xfrm_audit" argument with "bool".

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-04-23 08:21:04 +02:00
Tetsuo Handa
f1370cc4a0 xfrm: Remove useless secid field from xfrm_audit.
It seems to me that commit ab5f5e8b "[XFRM]: xfrm audit calls" is doing
something strange at xfrm_audit_helper_usrinfo().
If secid != 0 && security_secid_to_secctx(secid) != 0, the caller calls
audit_log_task_context() which basically does
secid != 0 && security_secid_to_secctx(secid) == 0 case
except that secid is obtained from current thread's context.

Oh, what happens if secid passed to xfrm_audit_helper_usrinfo() was
obtained from other thread's context? It might audit current thread's
context rather than other thread's context if security_secid_to_secctx()
in xfrm_audit_helper_usrinfo() failed for some reason.

Then, are all the caller of xfrm_audit_helper_usrinfo() passing either
secid obtained from current thread's context or secid == 0?
It seems to me that they are.

If I didn't miss something, we don't need to pass secid to
xfrm_audit_helper_usrinfo() because audit_log_task_context() will
obtain secid from current thread's context.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-04-22 10:47:53 +02:00
Mark A. Greer
51d98fa47c NFC: digital: Add macros for the ISO/IEC 14443-B Protocol
Add RF tech and framing macros for the ISO/IEC 14443-B Protocol.

Cc: Thierry Escande <thierry.escande@linux.intel.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-04-22 00:37:28 +02:00
Christophe Ricard
e240bc3612 NFC: hci: Add load_session HCI operand
load_session allows a CLF to restore the gate <-> pipe table from some
proprietary location.
The main advantage to add this function is to reduce the memory wear by
running pipe creation (and storing) only once.

Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-04-22 00:37:26 +02:00
Weiping Pan
86fd14ad1e tcp: make tcp_cwnd_application_limited() static
Make tcp_cwnd_application_limited() static and move it from tcp_input.c to
tcp_output.c

Signed-off-by: Weiping Pan <wpan@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-20 18:18:56 -04:00
Luis R. Rodriguez
17d8ecb8ff 6lowpan: include net/net_namespace.h on 6lowpan namepsace header
Don't rely on driver files or other headers having this file included.

CC: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: linux-zigbee-devel@lists.sourceforge.net
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-20 18:18:55 -04:00
Luis R. Rodriguez
599018a710 6lowpan: add helper to get 6lowpan namespace
This will simplify the new reassembly backport
with no code changes being required.

CC: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: linux-zigbee-devel@lists.sourceforge.net
Cc: David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-20 18:18:55 -04:00
Vlad Yasevich
b14878ccb7 net: sctp: cache auth_enable per endpoint
Currently, it is possible to create an SCTP socket, then switch
auth_enable via sysctl setting to 1 and crash the system on connect:

Oops[#1]:
CPU: 0 PID: 0 Comm: swapper Not tainted 3.14.1-mipsgit-20140415 #1
task: ffffffff8056ce80 ti: ffffffff8055c000 task.ti: ffffffff8055c000
[...]
Call Trace:
[<ffffffff8043c4e8>] sctp_auth_asoc_set_default_hmac+0x68/0x80
[<ffffffff8042b300>] sctp_process_init+0x5e0/0x8a4
[<ffffffff8042188c>] sctp_sf_do_5_1B_init+0x234/0x34c
[<ffffffff804228c8>] sctp_do_sm+0xb4/0x1e8
[<ffffffff80425a08>] sctp_endpoint_bh_rcv+0x1c4/0x214
[<ffffffff8043af68>] sctp_rcv+0x588/0x630
[<ffffffff8043e8e8>] sctp6_rcv+0x10/0x24
[<ffffffff803acb50>] ip6_input+0x2c0/0x440
[<ffffffff8030fc00>] __netif_receive_skb_core+0x4a8/0x564
[<ffffffff80310650>] process_backlog+0xb4/0x18c
[<ffffffff80313cbc>] net_rx_action+0x12c/0x210
[<ffffffff80034254>] __do_softirq+0x17c/0x2ac
[<ffffffff800345e0>] irq_exit+0x54/0xb0
[<ffffffff800075a4>] ret_from_irq+0x0/0x4
[<ffffffff800090ec>] rm7k_wait_irqoff+0x24/0x48
[<ffffffff8005e388>] cpu_startup_entry+0xc0/0x148
[<ffffffff805a88b0>] start_kernel+0x37c/0x398
Code: dd0900b8  000330f8  0126302d <dcc60000> 50c0fff1  0047182a  a48306a0
03e00008  00000000
---[ end trace b530b0551467f2fd ]---
Kernel panic - not syncing: Fatal exception in interrupt

What happens while auth_enable=0 in that case is, that
ep->auth_hmacs is initialized to NULL in sctp_auth_init_hmacs()
when endpoint is being created.

After that point, if an admin switches over to auth_enable=1,
the machine can crash due to NULL pointer dereference during
reception of an INIT chunk. When we enter sctp_process_init()
via sctp_sf_do_5_1B_init() in order to respond to an INIT chunk,
the INIT verification succeeds and while we walk and process
all INIT params via sctp_process_param() we find that
net->sctp.auth_enable is set, therefore do not fall through,
but invoke sctp_auth_asoc_set_default_hmac() instead, and thus,
dereference what we have set to NULL during endpoint
initialization phase.

The fix is to make auth_enable immutable by caching its value
during endpoint initialization, so that its original value is
being carried along until destruction. The bug seems to originate
from the very first days.

Fix in joint work with Daniel Borkmann.

Reported-by: Joshua Kinard <kumba@gentoo.org>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Joshua Kinard <kumba@gentoo.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-18 18:32:00 -04:00
Peter Zijlstra
4e857c58ef arch: Mass conversion of smp_mb__*()
Mostly scripted conversion of the smp_mb__* barriers.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-18 14:20:48 +02:00
Cong Wang
6a662719c9 ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif
As suggested by Julian:

	Simply, flowi4_iif must not contain 0, it does not
	look logical to ignore all ip rules with specified iif.

because in fib_rule_match() we do:

        if (rule->iifindex && (rule->iifindex != fl->flowi_iif))
                goto out;

flowi4_iif should be LOOPBACK_IFINDEX by default.

We need to move LOOPBACK_IFINDEX to include/net/flow.h:

1) It is mostly used by flowi_iif

2) Fix the following compile error if we use it in flow.h
by the patches latter:

In file included from include/linux/netfilter.h:277:0,
                 from include/net/netns/netfilter.h:5,
                 from include/net/net_namespace.h:21,
                 from include/linux/netdevice.h:43,
                 from include/linux/icmpv6.h:12,
                 from include/linux/ipv6.h:61,
                 from include/net/ipv6.h:16,
                 from include/linux/sunrpc/clnt.h:27,
                 from include/linux/nfs_fs.h:30,
                 from init/do_mounts.c:32:
include/net/flow.h: In function ‘flowi4_init_output’:
include/net/flow.h:84:32: error: ‘LOOPBACK_IFINDEX’ undeclared (first use in this function)

Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-16 15:05:11 -04:00
Eric Dumazet
aad88724c9 ipv4: add a sock pointer to dst->output() path.
In the dst->output() path for ipv4, the code assumes the skb it has to
transmit is attached to an inet socket, specifically via
ip_mc_output() : The sk_mc_loop() test triggers a WARN_ON() when the
provider of the packet is an AF_PACKET socket.

The dst->output() method gets an additional 'struct sock *sk'
parameter. This needs a cascade of changes so that this parameter can
be propagated from vxlan to final consumer.

Fixes: 8f646c922d ("vxlan: keep original skb ownership")
Reported-by: lucien xin <lucien.xin@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15 13:47:15 -04:00
Eric Dumazet
b0270e9101 ipv4: add a sock pointer to ip_queue_xmit()
ip_queue_xmit() assumes the skb it has to transmit is attached to an
inet socket. Commit 31c70d5956 ("l2tp: keep original skb ownership")
changed l2tp to not change skb ownership and thus broke this assumption.

One fix is to add a new 'struct sock *sk' parameter to ip_queue_xmit(),
so that we do not assume skb->sk points to the socket used by l2tp
tunnel.

Fixes: 31c70d5956 ("l2tp: keep original skb ownership")
Reported-by: Zhan Jianyu <nasa4836@gmail.com>
Tested-by: Zhan Jianyu <nasa4836@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-15 12:58:34 -04:00
David S. Miller
00cbc3dcd1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains three Netfilter fixes for your net tree,
they are:

* Fix missing generation sequence initialization which results in a splat
  if lockdep is enabled, it was introduced in the recent works to improve
  nf_conntrack scalability, from Andrey Vagin.

* Don't flush the GRE keymap list in nf_conntrack when the pptp helper is
  disabled otherwise this crashes due to a double release, from Andrey
  Vagin.

* Fix nf_tables cmp fast in big endian, from Patrick McHardy.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 19:00:10 -04:00
Daniel Borkmann
362d52040c Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer"
This reverts commit ef2820a735 ("net: sctp: Fix a_rwnd/rwnd management
to reflect real state of the receiver's buffer") as it introduced a
serious performance regression on SCTP over IPv4 and IPv6, though a not
as dramatic on the latter. Measurements are on 10Gbit/s with ixgbe NICs.

Current state:

[root@Lab200slot2 ~]# iperf3 --sctp -4 -c 192.168.241.3 -V -l 1452 -t 60
iperf version 3.0.1 (10 January 2014)
Linux Lab200slot2 3.14.0 #1 SMP Thu Apr 3 23:18:29 EDT 2014 x86_64
Time: Fri, 11 Apr 2014 17:56:21 GMT
Connecting to host 192.168.241.3, port 5201
      Cookie: Lab200slot2.1397238981.812898.548918
[  4] local 192.168.241.2 port 38616 connected to 192.168.241.3 port 5201
Starting Test: protocol: SCTP, 1 streams, 1452 byte blocks, omitting 0 seconds, 60 second test
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.09   sec  20.8 MBytes   161 Mbits/sec
[  4]   1.09-2.13   sec  10.8 MBytes  86.8 Mbits/sec
[  4]   2.13-3.15   sec  3.57 MBytes  29.5 Mbits/sec
[  4]   3.15-4.16   sec  4.33 MBytes  35.7 Mbits/sec
[  4]   4.16-6.21   sec  10.4 MBytes  42.7 Mbits/sec
[  4]   6.21-6.21   sec  0.00 Bytes    0.00 bits/sec
[  4]   6.21-7.35   sec  34.6 MBytes   253 Mbits/sec
[  4]   7.35-11.45  sec  22.0 MBytes  45.0 Mbits/sec
[  4]  11.45-11.45  sec  0.00 Bytes    0.00 bits/sec
[  4]  11.45-11.45  sec  0.00 Bytes    0.00 bits/sec
[  4]  11.45-11.45  sec  0.00 Bytes    0.00 bits/sec
[  4]  11.45-12.51  sec  16.0 MBytes   126 Mbits/sec
[  4]  12.51-13.59  sec  20.3 MBytes   158 Mbits/sec
[  4]  13.59-14.65  sec  13.4 MBytes   107 Mbits/sec
[  4]  14.65-16.79  sec  33.3 MBytes   130 Mbits/sec
[  4]  16.79-16.79  sec  0.00 Bytes    0.00 bits/sec
[  4]  16.79-17.82  sec  5.94 MBytes  48.7 Mbits/sec
(etc)

[root@Lab200slot2 ~]#  iperf3 --sctp -6 -c 2001:db8:0:f101::1 -V -l 1400 -t 60
iperf version 3.0.1 (10 January 2014)
Linux Lab200slot2 3.14.0 #1 SMP Thu Apr 3 23:18:29 EDT 2014 x86_64
Time: Fri, 11 Apr 2014 19:08:41 GMT
Connecting to host 2001:db8:0:f101::1, port 5201
      Cookie: Lab200slot2.1397243321.714295.2b3f7c
[  4] local 2001:db8:0:f101::2 port 55804 connected to 2001:db8:0:f101::1 port 5201
Starting Test: protocol: SCTP, 1 streams, 1400 byte blocks, omitting 0 seconds, 60 second test
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec   169 MBytes  1.42 Gbits/sec
[  4]   1.00-2.00   sec   201 MBytes  1.69 Gbits/sec
[  4]   2.00-3.00   sec   188 MBytes  1.58 Gbits/sec
[  4]   3.00-4.00   sec   174 MBytes  1.46 Gbits/sec
[  4]   4.00-5.00   sec   165 MBytes  1.39 Gbits/sec
[  4]   5.00-6.00   sec   199 MBytes  1.67 Gbits/sec
[  4]   6.00-7.00   sec   163 MBytes  1.36 Gbits/sec
[  4]   7.00-8.00   sec   174 MBytes  1.46 Gbits/sec
[  4]   8.00-9.00   sec   193 MBytes  1.62 Gbits/sec
[  4]   9.00-10.00  sec   196 MBytes  1.65 Gbits/sec
[  4]  10.00-11.00  sec   157 MBytes  1.31 Gbits/sec
[  4]  11.00-12.00  sec   175 MBytes  1.47 Gbits/sec
[  4]  12.00-13.00  sec   192 MBytes  1.61 Gbits/sec
[  4]  13.00-14.00  sec   199 MBytes  1.67 Gbits/sec
(etc)

After patch:

[root@Lab200slot2 ~]#  iperf3 --sctp -4 -c 192.168.240.3 -V -l 1452 -t 60
iperf version 3.0.1 (10 January 2014)
Linux Lab200slot2 3.14.0+ #1 SMP Mon Apr 14 12:06:40 EDT 2014 x86_64
Time: Mon, 14 Apr 2014 16:40:48 GMT
Connecting to host 192.168.240.3, port 5201
      Cookie: Lab200slot2.1397493648.413274.65e131
[  4] local 192.168.240.2 port 50548 connected to 192.168.240.3 port 5201
Starting Test: protocol: SCTP, 1 streams, 1452 byte blocks, omitting 0 seconds, 60 second test
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec   240 MBytes  2.02 Gbits/sec
[  4]   1.00-2.00   sec   239 MBytes  2.01 Gbits/sec
[  4]   2.00-3.00   sec   240 MBytes  2.01 Gbits/sec
[  4]   3.00-4.00   sec   239 MBytes  2.00 Gbits/sec
[  4]   4.00-5.00   sec   245 MBytes  2.05 Gbits/sec
[  4]   5.00-6.00   sec   240 MBytes  2.01 Gbits/sec
[  4]   6.00-7.00   sec   240 MBytes  2.02 Gbits/sec
[  4]   7.00-8.00   sec   239 MBytes  2.01 Gbits/sec

With the reverted patch applied, the SCTP/IPv4 performance is back
to normal on latest upstream for IPv4 and IPv6 and has same throughput
as 3.4.2 test kernel, steady and interval reports are smooth again.

Fixes: ef2820a735 ("net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer")
Reported-by: Peter Butler <pbutler@sonusnet.com>
Reported-by: Dongsheng Song <dongsheng.song@gmail.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Peter Butler <pbutler@sonusnet.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com>
Cc: Alexander Sverdlin <alexander.sverdlin@nsn.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 16:26:48 -04:00
Eric Dumazet
30f78d8ebf ipv6: Limit mtu to 65575 bytes
Francois reported that setting big mtu on loopback device could prevent
tcp sessions making progress.

We do not support (yet ?) IPv6 Jumbograms and cook corrupted packets.

We must limit the IPv6 MTU to (65535 + 40) bytes in theory.

Tested:

ifconfig lo mtu 70000
netperf -H ::1

Before patch : Throughput :   0.05 Mbits

After patch : Throughput : 35484 Mbits

Reported-by: Francois WELLENREITER <f.wellenreiter@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-14 12:39:59 -04:00
Patrick McHardy
b855d416dc netfilter: nf_tables: fix nft_cmp_fast failure on big endian for size < 4
nft_cmp_fast is used for equality comparisions of size <= 4. For
comparisions of size < 4 byte a mask is calculated that is applied to
both the data from userspace (during initialization) and the register
value (during runtime). Both values are stored using (in effect) memcpy
to a memory area that is then interpreted as u32 by nft_cmp_fast.

This works fine on little endian since smaller types have the same base
address, however on big endian this is not true and the smaller types
are interpreted as a big number with trailing zero bytes.

The mask therefore must not include the lower bytes, but the higher bytes
on big endian. Add a helper function that does a cpu_to_le32 to switch
the bytes on big endian. Since we're dealing with a mask of just consequitive
bits, this works out fine.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-14 10:38:02 +02:00
Linus Torvalds
454fd351f2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull yet more networking updates from David Miller:

 1) Various fixes to the new Redpine Signals wireless driver, from
    Fariya Fatima.

 2) L2TP PPP connect code takes PMTU from the wrong socket, fix from
    Dmitry Petukhov.

 3) UFO and TSO packets differ in whether they include the protocol
    header in gso_size, account for that in skb_gso_transport_seglen().
   From Florian Westphal.

 4) If VLAN untagging fails, we double free the SKB in the bridging
    output path.  From Toshiaki Makita.

 5) Several call sites of sk->sk_data_ready() were referencing an SKB
    just added to the socket receive queue in order to calculate the
    second argument via skb->len.  This is dangerous because the moment
    the skb is added to the receive queue it can be consumed in another
    context and freed up.

    It turns out also that none of the sk->sk_data_ready()
    implementations even care about this second argument.

    So just kill it off and thus fix all these use-after-free bugs as a
    side effect.

 6) Fix inverted test in tcp_v6_send_response(), from Lorenzo Colitti.

 7) pktgen needs to do locking properly for LLTX devices, from Daniel
    Borkmann.

 8) xen-netfront driver initializes TX array entries in RX loop :-) From
    Vincenzo Maffione.

 9) After refactoring, some tunnel drivers allow a tunnel to be
    configured on top itself.  Fix from Nicolas Dichtel.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits)
  vti: don't allow to add the same tunnel twice
  gre: don't allow to add the same tunnel twice
  drivers: net: xen-netfront: fix array initialization bug
  pktgen: be friendly to LLTX devices
  r8152: check RTL8152_UNPLUG
  net: sun4i-emac: add promiscuous support
  net/apne: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO
  net: ipv6: Fix oif in TCP SYN+ACK route lookup.
  drivers: net: cpsw: enable interrupts after napi enable and clearing previous interrupts
  drivers: net: cpsw: discard all packets received when interface is down
  net: Fix use after free by removing length arg from sk_data_ready callbacks.
  Drivers: net: hyperv: Address UDP checksum issues
  Drivers: net: hyperv: Negotiate suitable ndis version for offload support
  Drivers: net: hyperv: Allocate memory for all possible per-pecket information
  bridge: Fix double free and memory leak around br_allowed_ingress
  bonding: Remove debug_fs files when module init fails
  i40evf: program RSS LUT correctly
  i40evf: remove open-coded skb_cow_head
  ixgb: remove open-coded skb_cow_head
  igbvf: remove open-coded skb_cow_head
  ...
2014-04-12 17:31:22 -07:00
Linus Torvalds
582076ab16 A bunch of updates and cleanup within the transport layer,
particularly with a focus on RDMA.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 Comment: GPGTools - http://gpgtools.org
 
 iQIcBAABAgAGBQJTRXV1AAoJEDZk62b0Tg6xYWAP/0Kfp9/8Z05SQ1fnJveySw1O
 nKK//1/GBmquntqFAVHI6yJRLzPJ+Z/Y4u4a0qriwfgpPQvUJQrL77tY/VfEBUPB
 VoJG1tj7lpdLrO3p4YyPkPjymyC7YOoFNjGEstWFg7HetwnnqqZL2LB+5yJzyqjx
 y9nv1HzsrbAE7j8C4hQ1Nmds5muUb5VhnTtPhjrx4tP1sWWh8XTVJbsVDiEqx6cu
 uJXFFTbkONr9jKfv+Ki3H2pZej2yD7w4tU4lkdcGNyij/Q4Xn1iERqroW2/GT6Cl
 AXxlIKN24ASjWo4VqW0Wf8gO8vbUtHRChoiZ69DvzNTkbWmAIWSFHyQJ4cinwuyr
 UbOQZuccO59QtpNVpBvG/vjnbI54rg+VGLy+xE0vcrBDlyoptc56IAFSg8zJY5UN
 ysbyHCGME//9VZ1zeeZvkMjm8z5Enp6x4zmtnUHmufO7DVMTFUePED6U1u9WIyP5
 FFy5EboXMSh97yB8REvbIlY2MgBJWYdnyzLKFMeRpzC8fOXJqBoCX8i3Z5SkMYWJ
 1FS/pGr7ec/VX1iHXSYi9hhzTJ6o9mEmOIhaO4UcqAuK8Rk2jbCp1Lx0iDhmJtdT
 zjofGDe57ro7nOZf8A/TI5z+6BZq8KYVfZrXtYaPqC4rXOzJAox/yHlz9FmAJYgv
 ssOIKXX0ujLwqrMBVatj
 =yzy/
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs

Pull 9p changes from Eric Van Hensbergen:
 "A bunch of updates and cleanup within the transport layer,
  particularly with a focus on RDMA"

* tag 'for-linus-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  9pnet_rdma: check token type before int conversion
  9pnet: trans_fd : allocate struct p9_trans_fd and struct p9_conn together.
  9pnet: p9_client->conn field is unused. Remove it.
  9P: Get rid of REQ_STATUS_FLSH
  9pnet_rdma: add cancelled()
  9pnet_rdma: update request status during send
  9P: Add cancelled() to the transport functions.
  net: Mark function as static in 9p/client.c
  9P: Add memory barriers to protect request fields over cb/rpc threads handoff
2014-04-11 14:14:57 -07:00
David S. Miller
676d23690f net: Fix use after free by removing length arg from sk_data_ready callbacks.
Several spots in the kernel perform a sequence like:

	skb_queue_tail(&sk->s_receive_queue, skb);
	sk->sk_data_ready(sk, skb->len);

But at the moment we place the SKB onto the socket receive queue it
can be consumed and freed up.  So this skb->len access is potentially
to freed up memory.

Furthermore, the skb->len can be modified by the consumer so it is
possible that the value isn't accurate.

And finally, no actual implementation of this callback actually uses
the length argument.  And since nobody actually cared about it's
value, lots of call sites pass arbitrary values in such as '0' and
even '1'.

So just remove the length argument from the callback, that way there
is no confusion whatsoever and all of these use-after-free cases get
fixed as a side effect.

Based upon a patch by Eric Dumazet and his suggestion to audit this
issue tree-wide.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-11 16:15:36 -04:00
Rostislav Lisovy
041f607de1 mac80211: Update conf_is_ht() to work properly with 5/10MHz channels
The channels with 5/10MHz bandwidth are not HT. We have to
reflect this in conf_is_ht() function which returns whether the
particular channel is HT or not.

Signed-off-by: Rostislav Lisovy <rostislav.lisovy@fel.cvut.cz>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-04-09 10:56:04 +02:00
Kalle Valo
ce26151bc3 cfg80211: update comment about WIPHY_FLAG_CUSTOM_REGULATORY
Commit a2f73b6c5d ("cfg80211: move regulatory flags to their own variable")
renamed WIPHY_FLAG_CUSTOM_REGULATORY to REGULATORY_CUSTOM_REG, but missed to
update one comment.

Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-04-09 10:56:02 +02:00
Luciano Coelho
5d52ee8110 mac80211: allow reservation of a running chanctx
With single-channel drivers, we need to be able to change a running
chanctx if we want to use chanctx reservation.  Not all drivers may be
able to do this, so add a flag that indicates support for it.

Changing a running chanctx can also be used as an optimization in
multi-channel drivers when the context needs to be reserved for future
usage.

Introduce IEEE80211_CHANCTX_RESERVED chanctx mode to mark a channel as
reserved so nobody else can use it (since we know it's going to
change).  In the future, we may allow several vifs to use the same
reservation as long as they plan to use the chanctx on the same
future channel.

Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-04-09 10:55:56 +02:00
Luciano Coelho
73de86a389 cfg80211/mac80211: move interface counting for combination check to mac80211
Move the counting part of the interface combination check from
cfg80211 to mac80211.

This is needed to simplify locking when the driver has to perform a
combination check by itself (eg. with channel-switch).

Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-04-09 10:55:43 +02:00
Luciano Coelho
2beb6dab2d cfg80211/mac80211: refactor cfg80211_chandef_dfs_required()
Some interface types don't require DFS (such as STATION, P2P_CLIENT
etc).  In order to centralize these decisions, make
cfg80211_chandef_dfs_required() take the iftype into consideration.

Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-04-09 10:55:41 +02:00
Luciano Coelho
cb2d956dd3 cfg80211: refactor cfg80211_can_use_iftype_chan()
Separate the code that counts the interface types and channels from
the code that check the interface combinations.  The new function that
checks for combinations is exported so it can be called by the
drivers.

This is done in preparation for moving the interface combinations
checks out of cfg80211.

Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-04-09 10:55:39 +02:00
Ilan Peer
174e0cd28a cfg80211: Enable GO operation on additional channels
Allow GO operation on a channel marked with IEEE80211_CHAN_GO_CONCURRENT
iff there is an active station interface that is associated to
an AP operating on the same channel in the 2 GHz band or the same UNII band
(in the 5 GHz band). This relaxation is not allowed if the channel is
marked with IEEE80211_CHAN_RADAR.

Note that this is a permissive approach to the FCC definitions,
that require a clear assessment that the device operating the AP is
an authorized master, i.e., with radar detection and DFS capabilities.

It is assumed that such restrictions are enforced by user space.
Furthermore, it is assumed, that if the conditions that allowed for
the operation of the GO on such a channel change, i.e., the station
interface disconnected from the AP, it is the responsibility of user
space to evacuate the GO from the channel.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-04-09 10:55:34 +02:00
David Spinadel
570dbde137 cfg80211: Add indoor only and GO concurrent channel attributes
The FCC are clarifying some soft configuration requirements,
which among other include the following:

1. Indoor operation, where a device can use channels requiring indoor
   operation, subject to that it can guarantee indoor operation,
   i.e., the device is connected to AC Power or the device is under
   the control of a local master that is acting as an AP and is
   connected to AC Power.
2. Concurrent GO operation, where devices may instantiate a P2P GO
   while they are under the guidance of an authorized master. For example,
   on a channel on which a BSS is connected to an authorized master, i.e.,
   with DFS and radar detection capability in the UNII band.

See https://apps.fcc.gov/eas/comments/GetPublishedDocument.html?id=327&tn=528122

Add support for advertising Indoor-only and GO-Concurrent channel
properties.

Signed-off-by: David Spinadel <david.spinadel@intel.com>
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-04-09 10:55:32 +02:00
Emmanuel Grumbach
77be2c54c5 mac80211: add vif to flush call
This will allow the low level driver to make decision based
on the vif such as queues etc...
Since the vif might be NULL, we can't add it to the tracing
functions.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
[fix staging rtl8821ae driver]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-04-09 10:55:29 +02:00
Johannes Berg
78f22b6a3a cfg80211: allow userspace to take ownership of interfaces
When dynamically creating interfaces from userspace, e.g. for P2P usage,
such interfaces are usually owned by the process that created them, i.e.
wpa_supplicant. Should wpa_supplicant crash, such interfaces will often
cease operating properly and cause problems on restarting the process.

To avoid this problem, introduce an ownership concept for interfaces. If
an interface is owned by a netlink socket, then it will be destroyed if
the netlink socket is closed for any reason, including if the process it
belongs to crashed. This gives us a race-free way to get rid of any such
interfaces.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-04-09 10:55:28 +02:00
Linus Torvalds
ce7613db2d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull more networking updates from David Miller:

 1) If a VXLAN interface is created with no groups, we can crash on
    reception of packets.  Fix from Mike Rapoport.

 2) Missing includes in CPTS driver, from Alexei Starovoitov.

 3) Fix string validations in isdnloop driver, from YOSHIFUJI Hideaki
    and Dan Carpenter.

 4) Missing irq.h include in bnxw2x, enic, and qlcnic drivers.  From
    Josh Boyer.

 5) AF_PACKET transmit doesn't statistically count TX drops, from Daniel
    Borkmann.

 6) Byte-Queue-Limit enabled drivers aren't handled properly in
    AF_PACKET transmit path, also from Daniel Borkmann.

    Same problem exists in pktgen, and Daniel fixed it there too.

 7) Fix resource leaks in driver probe error paths of new sxgbe driver,
    from Francois Romieu.

 8) Truesize of SKBs can gradually get more and more corrupted in NAPI
    packet recycling path, fix from Eric Dumazet.

 9) Fix uniprocessor netfilter build, from Florian Westphal.  In the
    longer term we should perhaps try to find a way for ARRAY_SIZE() to
    work even with zero sized array elements.

10) Fix crash in netfilter conntrack extensions due to mis-estimation of
    required extension space.  From Andrey Vagin.

11) Since we commit table rule updates before trying to copy the
    counters back to userspace (it's the last action we perform), we
    really can't signal the user copy with an error as we are beyond the
    point from which we can unwind everything.  This causes all kinds of
    use after free crashes and other mysterious behavior.

    From Thomas Graf.

12) Restore previous behvaior of div/mod by zero in BPF filter
    processing.  From Daniel Borkmann.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits)
  net: sctp: wake up all assocs if sndbuf policy is per socket
  isdnloop: several buffer overflows
  netdev: remove potentially harmful checks
  pktgen: fix xmit test for BQL enabled devices
  net/at91_ether: avoid NULL pointer dereference
  tipc: Let tipc_release() return 0
  at86rf230: fix MAX_CSMA_RETRIES parameter
  mac802154: fix duplicate #include headers
  sxgbe: fix duplicate #include headers
  net: filter: be more defensive on div/mod by X==0
  netfilter: Can't fail and free after table replacement
  xen-netback: Trivial format string fix
  net: bcmgenet: Remove unnecessary version.h inclusion
  net: smc911x: Remove unused local variable
  bonding: Inactive slaves should keep inactive flag's value
  netfilter: nf_tables: fix wrong format in request_module()
  netfilter: nf_tables: set names cannot be larger than 15 bytes
  netfilter: nf_conntrack: reserve two bytes for nf_ct_ext->len
  netfilter: Add {ipt,ip6t}_osf aliases for xt_osf
  netfilter: x_tables: allow to use cgroup match for LOCAL_IN nf hooks
  ...
2014-04-08 12:41:23 -07:00
Andrey Vagin
223b02d923 netfilter: nf_conntrack: reserve two bytes for nf_ct_ext->len
"len" contains sizeof(nf_ct_ext) and size of extensions. In a worst
case it can contain all extensions. Bellow you can find sizes for all
types of extensions. Their sum is definitely bigger than 256.

nf_ct_ext_types[0]->len = 24
nf_ct_ext_types[1]->len = 32
nf_ct_ext_types[2]->len = 24
nf_ct_ext_types[3]->len = 32
nf_ct_ext_types[4]->len = 152
nf_ct_ext_types[5]->len = 2
nf_ct_ext_types[6]->len = 16
nf_ct_ext_types[7]->len = 8

I have seen "len" up to 280 and my host has crashes w/o this patch.

The right way to fix this problem is reducing the size of the ecache
extension (4) and Florian is going to do this, but these changes will
be quite large to be appropriate for a stable tree.

Fixes: 5b423f6a40 (netfilter: nf_conntrack: fix racy timer handling with reliable)
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-03 23:52:31 +02:00
Linus Torvalds
32d01dc7be Merge branch 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
 "A lot updates for cgroup:

   - The biggest one is cgroup's conversion to kernfs.  cgroup took
     after the long abandoned vfs-entangled sysfs implementation and
     made it even more convoluted over time.  cgroup's internal objects
     were fused with vfs objects which also brought in vfs locking and
     object lifetime rules.  Naturally, there are places where vfs rules
     don't fit and nasty hacks, such as credential switching or lock
     dance interleaving inode mutex and cgroup_mutex with object serial
     number comparison thrown in to decide whether the operation is
     actually necessary, needed to be employed.

     After conversion to kernfs, internal object lifetime and locking
     rules are mostly isolated from vfs interactions allowing shedding
     of several nasty hacks and overall simplification.  This will also
     allow implmentation of operations which may affect multiple cgroups
     which weren't possible before as it would have required nesting
     i_mutexes.

   - Various simplifications including dropping of module support,
     easier cgroup name/path handling, simplified cgroup file type
     handling and task_cg_lists optimization.

   - Prepatory changes for the planned unified hierarchy, which is still
     a patchset away from being actually operational.  The dummy
     hierarchy is updated to serve as the default unified hierarchy.
     Controllers which aren't claimed by other hierarchies are
     associated with it, which BTW was what the dummy hierarchy was for
     anyway.

   - Various fixes from Li and others.  This pull request includes some
     patches to add missing slab.h to various subsystems.  This was
     triggered xattr.h include removal from cgroup.h.  cgroup.h
     indirectly got included a lot of files which brought in xattr.h
     which brought in slab.h.

  There are several merge commits - one to pull in kernfs updates
  necessary for converting cgroup (already in upstream through
  driver-core), others for interfering changes in the fixes branch"

* 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (74 commits)
  cgroup: remove useless argument from cgroup_exit()
  cgroup: fix spurious lockdep warning in cgroup_exit()
  cgroup: Use RCU_INIT_POINTER(x, NULL) in cgroup.c
  cgroup: break kernfs active_ref protection in cgroup directory operations
  cgroup: fix cgroup_taskset walking order
  cgroup: implement CFTYPE_ONLY_ON_DFL
  cgroup: make cgrp_dfl_root mountable
  cgroup: drop const from @buffer of cftype->write_string()
  cgroup: rename cgroup_dummy_root and related names
  cgroup: move ->subsys_mask from cgroupfs_root to cgroup
  cgroup: treat cgroup_dummy_root as an equivalent hierarchy during rebinding
  cgroup: remove NULL checks from [pr_cont_]cgroup_{name|path}()
  cgroup: use cgroup_setup_root() to initialize cgroup_dummy_root
  cgroup: reorganize cgroup bootstrapping
  cgroup: relocate setting of CGRP_DEAD
  cpuset: use rcu_read_lock() to protect task_cs()
  cgroup_freezer: document freezer_fork() subtleties
  cgroup: update cgroup_transfer_tasks() to either succeed or fail
  cgroup: drop task_lock() protection around task->cgroups
  cgroup: update how a newly forked task gets associated with css_set
  ...
2014-04-03 13:05:42 -07:00
Linus Torvalds
cd6362befe Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Here is my initial pull request for the networking subsystem during
  this merge window:

   1) Support for ESN in AH (RFC 4302) from Fan Du.

   2) Add full kernel doc for ethtool command structures, from Ben
      Hutchings.

   3) Add BCM7xxx PHY driver, from Florian Fainelli.

   4) Export computed TCP rate information in netlink socket dumps, from
      Eric Dumazet.

   5) Allow IPSEC SA to be dumped partially using a filter, from Nicolas
      Dichtel.

   6) Convert many drivers to pci_enable_msix_range(), from Alexander
      Gordeev.

   7) Record SKB timestamps more efficiently, from Eric Dumazet.

   8) Switch to microsecond resolution for TCP round trip times, also
      from Eric Dumazet.

   9) Clean up and fix 6lowpan fragmentation handling by making use of
      the existing inet_frag api for it's implementation.

  10) Add TX grant mapping to xen-netback driver, from Zoltan Kiss.

  11) Auto size SKB lengths when composing netlink messages based upon
      past message sizes used, from Eric Dumazet.

  12) qdisc dumps can take a long time, add a cond_resched(), From Eric
      Dumazet.

  13) Sanitize netpoll core and drivers wrt.  SKB handling semantics.
      Get rid of never-used-in-tree netpoll RX handling.  From Eric W
      Biederman.

  14) Support inter-address-family and namespace changing in VTI tunnel
      driver(s).  From Steffen Klassert.

  15) Add Altera TSE driver, from Vince Bridgers.

  16) Optimizing csum_replace2() so that it doesn't adjust the checksum
      by checksumming the entire header, from Eric Dumazet.

  17) Expand BPF internal implementation for faster interpreting, more
      direct translations into JIT'd code, and much cleaner uses of BPF
      filtering in non-socket ocntexts.  From Daniel Borkmann and Alexei
      Starovoitov"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1976 commits)
  netpoll: Use skb_irq_freeable to make zap_completion_queue safe.
  net: Add a test to see if a skb is freeable in irq context
  qlcnic: Fix build failure due to undefined reference to `vxlan_get_rx_port'
  net: ptp: move PTP classifier in its own file
  net: sxgbe: make "core_ops" static
  net: sxgbe: fix logical vs bitwise operation
  net: sxgbe: sxgbe_mdio_register() frees the bus
  Call efx_set_channels() before efx->type->dimension_resources()
  xen-netback: disable rogue vif in kthread context
  net/mlx4: Set proper build dependancy with vxlan
  be2net: fix build dependency on VxLAN
  mac802154: make csma/cca parameters per-wpan
  mac802154: allow only one WPAN to be up at any given time
  net: filter: minor: fix kdoc in __sk_run_filter
  netlink: don't compare the nul-termination in nla_strcmp
  can: c_can: Avoid led toggling for every packet.
  can: c_can: Simplify TX interrupt cleanup
  can: c_can: Store dlc private
  can: c_can: Reduce register access
  can: c_can: Make the code readable
  ...
2014-04-02 20:53:45 -07:00
Linus Torvalds
159d8133d0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
 "Usual rocket science -- mostly documentation and comment updates"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  sparse: fix comment
  doc: fix double words
  isdn: capi: fix "CAPI_VERSION" comment
  doc: DocBook: Fix typos in xml and template file
  Bluetooth: add module name for btwilink
  driver core: unexport static function create_syslog_header
  mmc: core: typo fix in printk specifier
  ARM: spear: clean up editing mistake
  net-sysfs: fix comment typo 'CONFIG_SYFS'
  doc: Insert MODULE_ in module-signing macros
  Documentation: update URL to hfsplus Technote 1150
  gpio: update path to documentation
  ixgbe: Fix format string in ixgbe_fcoe.
  Kconfig: Remove useless "default N" lines
  user_namespace.c: Remove duplicated word in comment
  CREDITS: fix formatting
  treewide: Fix typo in Documentation/DocBook
  mm: Fix warning on make htmldocs caused by slab.c
  ata: ata-samsung_cf: cleanup in header file
  idr: remove unused prototype of idr_free()
2014-04-02 16:23:38 -07:00
Patrick McHardy
c50b960ccc netfilter: nf_tables: implement proper set selection
The current set selection simply choses the first set type that provides
the requested features, which always results in the rbtree being chosen
by virtue of being the first set in the list.

What we actually want to do is choose the implementation that can provide
the requested features and is optimal from either a performance or memory
perspective depending on the characteristics of the elements and the
preferences specified by the user.

The elements are not known when creating a set. Even if we would provide
them for anonymous (literal) sets, we'd still have standalone sets where
the elements are not known in advance. We therefore need an abstract
description of the data charcteristics.

The kernel already knows the size of the key, this patch starts by
introducing a nested set description which so far contains only the maximum
amount of elements. Based on this the set implementations are changed to
provide an estimate of the required amount of memory and the lookup
complexity class.

The set ops have a new callback ->estimate() that is invoked during set
selection. It receives a structure containing the attributes known to the
kernel and is supposed to populate a struct nft_set_estimate with the
complexity class and, in case the size is known, the complete amount of
memory required, or the amount of memory required per element otherwise.

Based on the policy specified by the user (performance/memory, defaulting
to performance) the kernel will then select the best suited implementation.

Even if the set implementation would allow to add more than the specified
maximum amount of elements, they are enforced since new implementations
might not be able to add more than maximum based on which they were
selected.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-02 21:32:57 +02:00
Phoebe Buckheister
e462ded699 mac802154: make csma/cca parameters per-wpan
Commit 9b2777d608 (ieee802154: add TX power control to wpan_phy)
and following erroneously added CSMA and CCA parameters for 802.15.4
devices as PHY parameters, while they are actually MAC parameters and
can differ for any two WPAN instances. Since it is now sensible to have
multiple WPAN devices with differing CSMA/CCA parameters, make these
parameters MAC parameters instead.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-01 16:25:51 -04:00
David S. Miller
ce22bb6122 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
pull request: wireless-next 2014-03-31

Please accept this one last round of general wireless updates for
the 3.15 merge window!

For the Bluetooth bits, Gustavo says:

"Here follow another set of patches to 3.15. This is mostly a bug fix
pull request with the exception of one commit from Marcel which adds
tracking to the current configured LE scan type parameter."

Beyond that, notable bits include some final refactoring of rtl8180
and the addition of the rtl8187se driver, fixes for a number of
problems identified by Dan Carpenter and his static analysis tools,
and a handful of other bits here and there.

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-31 16:44:55 -04:00
Wang Yufen
60ea37f7a5 ipv6: reuse rt6_need_strict
Move the whole rt6_need_strict as static inline into ip6_route.h,
so that it can be reused

Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-31 16:16:16 -04:00
John W. Linville
96da266e77 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2014-03-31 15:22:17 -04:00
Daniel Borkmann
fbc907f0b1 net: filter: move filter accounting to filter core
This patch basically does two things, i) removes the extern keyword
from the include/linux/filter.h file to be more consistent with the
rest of Joe's changes, and ii) moves filter accounting into the filter
core framework.

Filter accounting mainly done through sk_filter_{un,}charge() take
care of the case when sockets are being cloned through sk_clone_lock()
so that removal of the filter on one socket won't result in eviction
as it's still referenced by the other.

These functions actually belong to net/core/filter.c and not
include/net/sock.h as we want to keep all that in a central place.
It's also not in fast-path so uninlining them is fine and even allows
us to get rd of sk_filter_release_rcu()'s EXPORT_SYMBOL and a forward
declaration.

Joint work with Alexei Starovoitov.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-31 00:45:09 -04:00
David S. Miller
64c27237a0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/marvell/mvneta.c

The mvneta.c conflict is a case of overlapping changes,
a conversion to devm_ioremap_resource() vs. a conversion
to netdev_alloc_pcpu_stats.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-29 18:48:54 -04:00
Hannes Frederic Sowa
c15b1ccadb ipv6: move DAD and addrconf_verify processing to workqueue
addrconf_join_solict and addrconf_join_anycast may cause actions which
need rtnl locked, especially on first address creation.

A new DAD state is introduced which defers processing of the initial
DAD processing into a workqueue.

To get rtnl lock we need to push the code paths which depend on those
calls up to workqueues, specifically addrconf_verify and the DAD
processing.

(v2)
addrconf_dad_failure needs to be queued up to the workqueue, too. This
patch introduces a new DAD state and stop the DAD processing in the
workqueue (this is because of the possible ipv6_del_addr processing
which removes the solicited multicast address from the device).

addrconf_verify_lock is removed, too. After the transition it is not
needed any more.

As we are not processing in bottom half anymore we need to be a bit more
careful about disabling bottom half out when we lock spin_locks which are also
used in bh.

Relevant backtrace:
[  541.030090] RTNL: assertion failed at net/core/dev.c (4496)
[  541.031143] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O 3.10.33-1-amd64-vyatta #1
[  541.031145] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  541.031146]  ffffffff8148a9f0 000000000000002f ffffffff813c98c1 ffff88007c4451f8
[  541.031148]  0000000000000000 0000000000000000 ffffffff813d3540 ffff88007fc03d18
[  541.031150]  0000880000000006 ffff88007c445000 ffffffffa0194160 0000000000000000
[  541.031152] Call Trace:
[  541.031153]  <IRQ>  [<ffffffff8148a9f0>] ? dump_stack+0xd/0x17
[  541.031180]  [<ffffffff813c98c1>] ? __dev_set_promiscuity+0x101/0x180
[  541.031183]  [<ffffffff813d3540>] ? __hw_addr_create_ex+0x60/0xc0
[  541.031185]  [<ffffffff813cfe1a>] ? __dev_set_rx_mode+0xaa/0xc0
[  541.031189]  [<ffffffff813d3a81>] ? __dev_mc_add+0x61/0x90
[  541.031198]  [<ffffffffa01dcf9c>] ? igmp6_group_added+0xfc/0x1a0 [ipv6]
[  541.031208]  [<ffffffff8111237b>] ? kmem_cache_alloc+0xcb/0xd0
[  541.031212]  [<ffffffffa01ddcd7>] ? ipv6_dev_mc_inc+0x267/0x300 [ipv6]
[  541.031216]  [<ffffffffa01c2fae>] ? addrconf_join_solict+0x2e/0x40 [ipv6]
[  541.031219]  [<ffffffffa01ba2e9>] ? ipv6_dev_ac_inc+0x159/0x1f0 [ipv6]
[  541.031223]  [<ffffffffa01c0772>] ? addrconf_join_anycast+0x92/0xa0 [ipv6]
[  541.031226]  [<ffffffffa01c311e>] ? __ipv6_ifa_notify+0x11e/0x1e0 [ipv6]
[  541.031229]  [<ffffffffa01c3213>] ? ipv6_ifa_notify+0x33/0x50 [ipv6]
[  541.031233]  [<ffffffffa01c36c8>] ? addrconf_dad_completed+0x28/0x100 [ipv6]
[  541.031241]  [<ffffffff81075c1d>] ? task_cputime+0x2d/0x50
[  541.031244]  [<ffffffffa01c38d6>] ? addrconf_dad_timer+0x136/0x150 [ipv6]
[  541.031247]  [<ffffffffa01c37a0>] ? addrconf_dad_completed+0x100/0x100 [ipv6]
[  541.031255]  [<ffffffff8105313a>] ? call_timer_fn.isra.22+0x2a/0x90
[  541.031258]  [<ffffffffa01c37a0>] ? addrconf_dad_completed+0x100/0x100 [ipv6]

Hunks and backtrace stolen from a patch by Stephen Hemminger.

Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-28 16:54:50 -04:00
Lukasz Rymanowski
3d5a76f08b Bluetooth: Keep msec in DISCOV_LE_TIMEOUT
To be consistent, lets use msec for this timeout as well.

Note: This define value is a minimum scan time taken from BT Core spec 4.0,
Vol 3, Part C, chapter 9.2.6

Signed-off-by: Lukasz Rymanowski <lukasz.rymanowski@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-03-28 00:09:30 -07:00
Lukasz Rymanowski
b9a7a61e5c Bluetooth: Add new debugfs parameter
With this patch it is possible to control discovery interleaved
timeout value from debugfs.

It is for fine tuning of this timeout.

Signed-off-by: Lukasz Rymanowski <lukasz.rymanowski@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-03-28 00:09:30 -07:00
Lukasz Rymanowski
ae55f5982a Bluetooth: Keep msec in DISCOV_INTERLEAVED_TIMEOUT
Keep msec instead of jiffies in this define. This is needed by following
patch where we want this timeout to be exposed in debugfs.

Note: Value of this timeout comes from recommendation in BT Core Spec.4.0,
Vol 3, Part C, chapter 13.2.1.

Signed-off-by: Lukasz Rymanowski <lukasz.rymanowski@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-03-28 00:09:30 -07:00
Michal Kubeček
e5fd387ad5 ipv6: do not overwrite inetpeer metrics prematurely
If an IPv6 host route with metrics exists, an attempt to add a
new route for the same target with different metrics fails but
rewrites the metrics anyway:

12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1000
12sp0:~ # ip -6 route show
fe80::/64 dev eth0  proto kernel  metric 256
fec0::1 dev eth0  metric 1024  rto_min lock 1s
12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1500
RTNETLINK answers: File exists
12sp0:~ # ip -6 route show
fe80::/64 dev eth0  proto kernel  metric 256
fec0::1 dev eth0  metric 1024  rto_min lock 1.5s

This is caused by all IPv6 host routes using the metrics in
their inetpeer (or the shared default). This also holds for the
new route created in ip6_route_add() which shares the metrics
with the already existing route and thus ip6_route_add()
rewrites the metrics even if the new route ends up not being
used at all.

Another problem is that old metrics in inetpeer can reappear
unexpectedly for a new route, e.g.

12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1000
12sp0:~ # ip route del fec0::1
12sp0:~ # ip route add fec0::1 dev eth0
12sp0:~ # ip route change fec0::1 dev eth0 hoplimit 10
12sp0:~ # ip -6 route show
fe80::/64 dev eth0  proto kernel  metric 256
fec0::1 dev eth0  metric 1024  hoplimit 10 rto_min lock 1s

Resolve the first problem by moving the setting of metrics down
into fib6_add_rt2node() to the point we are sure we are
inserting the new route into the tree. Second problem is
addressed by introducing new flag DST_METRICS_FORCE_OVERWRITE
which is set for a new host route in ip6_route_add() and makes
ipv6_cow_metrics() always overwrite the metrics in inetpeer
(even if they are not "new"); it is reset after that.

v5: use a flag in _metrics member rather than one in flags

v4: fix a typo making a condition always true (thanks to Hannes
Frederic Sowa)

v3: rewritten based on David Miller's idea to move setting the
metrics (and allocation in non-host case) down to the point we
already know the route is to be inserted. Also rebased to
net-next as it is quite late in the cycle.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-27 15:09:07 -04:00
John W. Linville
2de21e5899 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2014-03-27 14:16:50 -04:00
Tom Herbert
61b905da33 net: Rename skb->rxhash to skb->hash
The packet hash can be considered a property of the packet, not just
on RX path.

This patch changes name of rxhash and l4_rxhash skbuff fields to be
hash and l4_hash respectively. This includes changing uses of the
field in the code which don't call the access functions.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-26 15:58:20 -04:00
Johan Hedberg
ff5cd29f5c Bluetooth: Store also RSSI for pending advertising reports
Especially in crowded environments it can become frequent that we have
to send out whatever pending event there is stored. Since user space
has its own filtering of small RSSI changes sending a 0 value will
essentially force user space to wake up the higher layers (e.g. over
D-Bus) even though the RSSI didn't actually change more than the
threshold value.

This patch adds storing also of the RSSI for pending advertising reports
so that we report an as accurate RSSI as possible when we have to send
out the stored information to user space.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-03-26 09:31:40 -07:00
Johan Hedberg
73cf71d986 Bluetooth: Fix line splitting of mgmt_device_found parameters
The line was incorrectly split between the variable type and its name.
This patch fixes the issue.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-03-26 09:31:39 -07:00
Johan Hedberg
b9a6328f2a Bluetooth: Merge ADV_IND/ADV_SCAN_IND and SCAN_RSP together
To avoid too many events being sent to user space and to help parsing of
all available remote device data it makes sense for us to wait for the
scan response and send a single merged Device Found event to user space.

This patch adds a few new variables to hci_dev to track the last
received ADV_IND/ADV_SCAN_IND, i.e. those which will cause a SCAN_REQ to
be send in the case of active scanning. When the SCAN_RSP is received
the pending data is passed together with the SCAN_RSP to the
mgmt_device_found function which takes care of merging them into a
single Device Found event.

We also need a bit of extra logic to handle situations where we don't
receive a SCAN_RSP after caching some data. In such a scenario we simply
have to send out the pending data as it is and then operate on the new
report as if there was no pending data.

We also need to send out any pending data when scanning stops as
well as ensure that the storage is empty at the start of a new active
scanning session. These both cases are covered by the update to the
hci_cc_le_set_scan_enable function in this patch.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-03-26 09:31:38 -07:00
Johan Hedberg
3c857757ef Bluetooth: Add directed advertising support through connect()
When we're in peripheral mode (HCI_ADVERTISING flag is set) the most
natural mapping of connect() is to perform directed advertising to the
peer device.

This patch does the necessary changes to enable directed advertising and
keeps the hci_conn state as BT_CONNECT in a similar way as is done for
central or BR/EDR connection initiation.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-03-26 09:31:38 -07:00
Johan Hedberg
5d2e9fadf4 Bluetooth: Add scan_rsp parameter to mgmt_device_found()
In preparation for being able to merge ADV_IND/ADV_SCAN_IND and SCAN_RSP
together into a single device found event add a second parameter to the
mgmt_device_found function. For now all callers pass NULL as this
parameters since we don't yet have storing of the last received
advertising report.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-03-26 09:31:37 -07:00
David S. Miller
04f58c8854 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	Documentation/devicetree/bindings/net/micrel-ks8851.txt
	net/core/netpoll.c

The net/core/netpoll.c conflict is a bug fix in 'net' happening
to code which is completely removed in 'net-next'.

In micrel-ks8851.txt we simply have overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-25 20:29:20 -04:00
David S. Miller
0fc3196603 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
Please pull this batch of wireless updates intended for 3.15!

For the mac80211 bits, Johannes says:

"This has a whole bunch of bugfixes for things that went into -next
previously as well as some other bugfixes I didn't want to rush into
3.14 at this point. The rest of it is some cleanups and a few small
features, the biggest of which is probably Janusz's regulatory DFS CAC
time code."

For the Bluetooth bits, Gustavo says:

"One more pull request to 3.15. This is mostly and bug fix pull request, it
contains several fixes and clean up all over the tree, plus some small new
features."

For the NFC bits, Samuel says:

"This is the NFC pull request for 3.15. With this one we have:

- Support for ISO 15693 a.k.a. NFC vicinity a.k.a. Type 5 tags. ISO
  15693 are long range (1 - 2 meters) vicinity tags/cards. The kernel
  now supports those through the NFC netlink and digital APIs.

- Support for TI's trf7970a chipset. This chipset relies on the NFC
  digital layer and the driver currently supports type 2, 4A and 5 tags.

- Support for NXP's pn544 secure firmare download. The pn544 C3 chipsets
  relies on a different firmware download protocal than the C2 one. We
  now support both and use the right one depending on the version we
  detect at runtime.

- Support for 4A tags from the NFC digital layer.

- A bunch of cleanups and minor fixes from Axel Lin and Thierry Escande."

For the iwlwifi bits, Emmanuel says:

"We were sending a host command while the mutex wasn't held. This
led to hard-to-catch races."

And...

"I have a fix for a "merge damage" which is not really a merge
damage: it enables scheduled scan which has been disabled in
wireless.git. Since you merged wireless.git into wireless-next.git,
this can now be fixed in wireless-next.git.

Besides this, Alex made a workaround for a hardware bug. This fix
allows us to consume less power in S3. Arik and Eliad continue to
work on D0i3 which is a run-time power saving feature. Eliad also
contributes a few bits to the rate scaling logic to which Eyal adds his
own contribution. Avri dives deep in the power code - newer firmware
will allow to enable power save in newer scenarios. Johannes made a few
clean-ups. I have the regular amount of BT Coex boring stuff. I disable
uAPSD since we identified firmware bugs that cause packet loss. One
thing that do stand out is the udev event that we now send when the
FW asserts. I hope it will allow us to debug the FW more easily."

Also included is one last iwlwifi pull for a build breakage fix...

For the Atheros bits, Kalle says:

"Michal now did some optimisations and was able to improve throughput by
100 Mbps on our MIPS based AP135 platform. Chun-Yeow added some
workarounds to be able to better use ad-hoc mode. Ben improved log
messages and added support for MSDU chaining. And, as usual, also some
smaller fixes."

Beyond that...

Andrea Merello continues his rtl8180 refactoring, in preparation for
a long-awaited rtl8187 driver.  We get a new driver (rsi) for the
RS9113 chip, from Fariya Fatima.  And, of course, we get the usual
round of updates for ath9k, brcmfmac, mwifiex, wil6210, etc. as well.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-25 19:25:39 -04:00
Simon Derr
7b4f307276 9pnet: p9_client->conn field is unused. Remove it.
Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2014-03-25 16:38:16 -05:00
Simon Derr
0bfd6845c0 9P: Get rid of REQ_STATUS_FLSH
This request state is mostly useless, and properly implementing it
for RDMA would require an extra lock to be taken in handle_recv()
and in rdma_cancel() to avoid this race:

    handle_recv()           rdma_cancel()
        .                     .
        .                   if req->state == SENT
    req->state = RCVD         .
        .                           req->state = FLSH

So just get rid of it.

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2014-03-25 16:38:15 -05:00
Simon Derr
afd8d65411 9P: Add cancelled() to the transport functions.
And move transport-specific code out of net/9p/client.c

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2014-03-25 16:38:11 -05:00
Dominique Martinet
2b6e72ed74 9P: Add memory barriers to protect request fields over cb/rpc threads handoff
We need barriers to guarantee this pattern works as intended:
[w] req->rc, 1		[r] req->status, 1
wmb			rmb
[w] req->status, 1	[r] req->rc

Where the wmb ensures that rc gets written before status,
and the rmb ensures that if you observe status == 1, rc is the new value.

Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2014-03-25 16:37:59 -05:00
Li RongQing
0b8c7f6f2a ipv4: remove ip_rt_dump from route.c
ip_rt_dump do nothing after IPv4 route caches removal, so we can remove it.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-24 12:45:01 -04:00
Eric Dumazet
99f0b958b1 net: optimize csum_replace2()
When changing one 16bit value by another in IP header, we can adjust
the IP checksum by doing a simple operation described in RFC 1624, as
reminded by David.

csum_partial() is a complex function on x86_64, not really suited for
small number of checksummed bytes.

I spotted csum_partial() being in the top 20 most consuming functions
(more than 1 %) in a GRO workload, which was rather unexpected.

The caller was inet_gro_complete() doing a csum_replace2() when
building the new IP header for the GRO packet.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-24 00:18:44 -04:00
Marcel Holtmann
533553f873 Bluetooth: Track current configured LE scan type parameter
The LE scan type paramter defines if active scanning or passive scanning
is in use. Track the currently set value so it can be used for decision
making from other pieces in the core.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-03-21 22:02:12 +02:00
John W. Linville
49c0ca17ee Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2014-03-21 14:02:04 -04:00
Eric Dumazet
6326231531 tcp: syncookies: do not use getnstimeofday()
While it is true that getnstimeofday() uses about 40 cycles if TSC
is available, it can use 1600 cycles if hpet is the clocksource.

Switch to get_jiffies_64(), as this is more than enough, and
go back to 60 seconds periods.

Fixes: 8c27bd75f0 ("tcp: syncookies: reduce cookie lifetime to 128 seconds")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-20 16:22:42 -04:00
John W. Linville
370c5acef0 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2014-03-20 11:54:22 -04:00
John W. Linville
7eb2450a51 Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next 2014-03-20 11:53:20 -04:00
Johan Hedberg
39adbffe4b Bluetooth: Fix passkey endianess in user_confirm and notify_passkey
The passkey_notify and user_confirm functions in mgmt.c were expecting
different endianess for the passkey, leading to a big endian bug and
sparse warning in recently added SMP code. This patch converts both
functions to expect host endianess and do the conversion to little
endian only when assigning to the mgmt event struct.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-03-19 23:22:07 -07:00
Emmanuel Grumbach
fb378c231d mac80211: set beamforming bit in radiotap
Add a bit in rx_status.vht_flags to let the low level driver
notify mac80211 about a beamformed packet. Propagate this
to the radiotap header.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-03-19 21:29:57 +01:00
Emmanuel Grumbach
3afc2167f6 cfg80211/mac80211: ignore signal if the frame was heard on wrong channel
On 2.4Ghz band, the channels overlap since the delta
between different channels is 5Mhz while the width of the
receiver is 20Mhz (at least).

This means that we can hear beacons or probe responses from
adjacent channels. These frames will have a significant
lower RSSI which will feed all kinds of logic with inaccurate
data. An obvious example is the roaming algorithm that will
think our AP is getting weak and will try to move to another
AP.

In order to avoid this, update the signal only if the frame
has been heard on the same channel as the one advertised by
the AP in its DS / HT IEs.
We refrain from updating the values only if the AP is
already in the BSS list so that we will still have a valid
(but inaccurate) value if the AP was heard on an adjacent
channel only.

To achieve this, stop taking the channel from DS / HT IEs
in mac80211. The DS / HT IEs is taken into account to
discard the frame if it was received on a disabled channel.
This can happen due to the same phenomenon: the frame is
sent on channel 12, but heard on channel 11 while channel
12 can be disabled on certain devices. Since this check
is done in cfg80211, stop even checking this in mac80211.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
[remove unused rx_freq variable]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-03-19 21:29:56 +01:00
Eliad Peller
a0f995a561 mac80211: add status_driver_data array to ieee80211_tx_info
Drivers might want to have private data in addition
to all other ieee80211_tx_info.status fields.

The current ieee80211_tx_info.rate_driver_data overlaps
with some of the non-rate data (e.g. ampdu_ack_len), so
it might not be good enough.

Since we already know how much free bytes remained,
simply use this size to define (void *) array.

While on it, change ack_signal type from int to the more
explicit s32 type.

Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2014-03-19 21:29:55 +01:00
David S. Miller
995dca4ce9 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
One patch to rename a newly introduced struct. The rest is
the rework of the IPsec virtual tunnel interface for ipv6 to
support inter address family tunneling and namespace crossing.

1) Rename the newly introduced struct xfrm_filter to avoid a
   conflict with iproute2. From Nicolas Dichtel.

2) Introduce xfrm_input_afinfo to access the address family
   dependent tunnel callback functions properly.

3) Add and use a IPsec protocol multiplexer for ipv6.

4) Remove dst_entry caching. vti can lookup multiple different
   dst entries, dependent of the configured xfrm states. Therefore
   it does not make to cache a dst_entry.

5) Remove caching of flow informations. vti6 does not use the the
   tunnel endpoint addresses to do route and xfrm lookups.

6) Update the vti6 to use its own receive hook.

7) Remove the now unused xfrm_tunnel_notifier. This was used from vti
   and is replaced by the IPsec protocol multiplexer hooks.

8) Support inter address family tunneling for vti6.

9) Check if the tunnel endpoints of the xfrm state and the vti interface
   are matching and return an error otherwise.

10) Enable namespace crossing for vti devices.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-18 14:09:07 -04:00
David S. Miller
e86e180b82 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for net-next,
most relevantly they are:

* cleanup to remove double semicolon from stephen hemminger.

* calm down sparse warning in xt_ipcomp, from Fan Du.

* nf_ct_labels support for nf_tables, from Florian Westphal.

* new macros to simplify rcu dereferences in the scope of nfnetlink
  and nf_tables, from Patrick McHardy.

* Accept queue and drop (including reason for drop) to verdict
  parsing in nf_tables, also from Patrick.

* Remove unused random seed initialization in nfnetlink_log, from
  Florian Westphal.

* Allow to attach user-specific information to nf_tables rules, useful
  to attach user comments to rule, from me.

* Return errors in ipset according to the manpage documentation, from
  Jozsef Kadlecsik.

* Fix coccinelle warnings related to incorrect bool type usage for ipset,
  from Fengguang Wu.

* Add hash:ip,mark set type to ipset, from Vytas Dauksa.

* Fix message for each spotted by ipset for each netns that is created,
  from Ilia Mirkin.

* Add forceadd option to ipset, which evicts a random entry from the set
  if it becomes full, from Josh Hunt.

* Minor IPVS cleanups and fixes from Andi Kleen and Tingwei Liu.

* Improve conntrack scalability by removing a central spinlock, original
  work from Eric Dumazet. Jesper Dangaard Brouer took them over to address
  remaining issues. Several patches to prepare this change come in first
  place.

* Rework nft_hash to resolve bugs (leaking chain, missing rcu synchronization
  on element removal, etc. from Patrick McHardy.

* Restore context in the rule deletion path, as we now release rule objects
  synchronously, from Patrick McHardy. This gets back event notification for
  anonymous sets.

* Fix NAT family validation in nft_nat, also from Patrick.

* Improve scalability of xt_connlimit by using an array of spinlocks and
  by introducing a rb-tree of hashtables for faster lookup of accounted
  objects per network. This patch was preceded by several patches and
  refactorizations to accomodate this change including the use of kmem_cache,
  from Florian Westphal.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-17 15:06:24 -04:00
John W. Linville
20d83f2464 NFC: 3.15: First pull request
This is the NFC pull request for 3.15. With this one we have:
 
 - Support for ISO 15693 a.k.a. NFC vicinity a.k.a. Type 5 tags. ISO
   15693 are long range (1 - 2 meters) vicinity tags/cards. The kernel
   now supports those through the NFC netlink and digital APIs.
 
 - Support for TI's trf7970a chipset. This chipset relies on the NFC
   digital layer and the driver currently supports type 2, 4A and 5 tags.
 
 - Support for NXP's pn544 secure firmare download. The pn544 C3 chipsets
   relies on a different firmware download protocal than the C2 one. We
   now support both and use the right one depending on the version we
   detect at runtime.
 
 - Support for 4A tags from the NFC digital layer.
 
 - A bunch of cleanups and minor fixes from Axel Lin and Thierry Escande.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJTI1oQAAoJEIqAPN1PVmxKmxgP/iiwhQwsehiUZ6oXYjAc06WT
 e+vmJwHuDgbEnr5QDSIdpeaKuxYtCErbFcG4SD5+thTQNEeiRsmTLYdkwZ0FCDF+
 WKU3F59EdUZu4F0GyUXvt8lUC3D96bJSIHMCJdg1YDJtyaJcuITeK7lW325CK4hF
 4ik0+HXcv9mt/Dq4k6xgqdgCOSz7o9dJSbzw5NTCN1I+G6CNgYrvYqSmvsKut2C7
 DQvFu8KhbUkSlXgY61YIA2qpf1hfATkfgLdSsjGJfYdKvWLUAyKbEwoG8dkFA38Y
 yqc49SJ6DJ4FGmSt4Qm2VtRa8HLqwjA30jBEeb7Zog80j7zEBS+th0NJxqDWp/82
 eaeNdfrA5d17CHiR9ODuGOFt3VJZeN83u+JCli3FKGF727A/2Z0UI8UgnB5ZfcbZ
 +5IxV6mQp/h8H4AaQzGQ88x2xibhnrW/XYmTpoxj7CcT3AdbGw/IerZXf+afXBce
 iyJZ+Ktr6d5zFucCbm0QyZDSwAhfYDIYGiM6AIM3oum9n3K31dA0Qxy6saMiiK8c
 ArYY6raQbRCaOBVCI8JI6PoQFvw5MNug40v7Q/G88dElte9mSLsGslVVhCNBpAVB
 Atdwa+I3gwgsea1oqGJo0rSEaWv6PsOeQcFG8boDoQDUvgapzk17bTCz4k2DlXO5
 U0Q/GcJnFEdcDhqVKqP5
 =xikL
 -----END PGP SIGNATURE-----

Merge tag 'nfc-next-3.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next

Samuel Ortiz <sameo@linux.intel.com> says:

"NFC: 3.15: First pull request

This is the NFC pull request for 3.15. With this one we have:

- Support for ISO 15693 a.k.a. NFC vicinity a.k.a. Type 5 tags. ISO
  15693 are long range (1 - 2 meters) vicinity tags/cards. The kernel
  now supports those through the NFC netlink and digital APIs.

- Support for TI's trf7970a chipset. This chipset relies on the NFC
  digital layer and the driver currently supports type 2, 4A and 5 tags.

- Support for NXP's pn544 secure firmare download. The pn544 C3 chipsets
  relies on a different firmware download protocal than the C2 one. We
  now support both and use the right one depending on the version we
  detect at runtime.

- Support for 4A tags from the NFC digital layer.

- A bunch of cleanups and minor fixes from Axel Lin and Thierry Escande."

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-03-17 13:16:50 -04:00
David S. Miller
85dcce7a73 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/r8152.c
	drivers/net/xen-netback/netback.c

Both the r8152 and netback conflicts were simple overlapping
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14 22:31:55 -04:00
Phoebe Buckheister
a13061ec04 6lowpan: move lowpan frag_info out of 802.15.4 headers
Fragmentation and reassembly information for 6lowpan is independent from
the 802.15.4 stack and used only by the 6lowpan reassembly process. Move
the ieee802154_frag_info struct to a private are, it needn't be in the
802.15.4 skb control block.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14 22:15:26 -04:00
Phoebe Buckheister
ae531b9475 ieee802154: use ieee802154_addr instead of *_sa variants
Change all internal uses of ieee802154_addr_sa to ieee802154_addr,
except for those instances that communicate directly with userspace.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14 22:15:26 -04:00
Phoebe Buckheister
e6278d9200 mac802154: use header operations to create/parse headers
Use the operations on 802.15.4 header structs introduced in a previous
patch to create and parse all headers in the mac802154 stack. This patch
reduces code duplication between different parts of the mac802154 stack
that needed information from headers, and also fixes a few bugs that
seem to have gone unnoticed until now:

 * 802.15.4 dgram sockets would return a slightly incorrect value for
   the SIOCINQ ioctl
 * mac802154 would not drop frames with the "security enabled" bit set,
   even though it does not support security, in violation of the
   standard

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14 22:15:26 -04:00
Phoebe Buckheister
94b4f6c21c ieee802154: add header structs with endiannes and operations
This patch provides a set of structures to represent 802.15.4 MAC
headers, and a set of operations to push/pull/peek these structs from
skbs. We cannot simply pointer-cast the skb MAC header pointer to these
structs, because 802.15.4 headers are wildly variable - depending on the
first three bytes, virtually all other fields of the header may be
present or not, and be present with different lengths.

The new header creation/parsing routines also support 802.15.4 security
headers, which are currently not supported by the mac802154
implementation of the protocol.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14 22:15:26 -04:00
Phoebe Buckheister
b70ab2e87f ieee802154: enforce consistent endianness in the 802.15.4 stack
Enable sparse warnings about endianness, replace the remaining fields
regarding network operations without explicit endianness annotations
with such that are annotated, and propagate this through the entire
stack.

Uses of ieee802154_addr_sa are not changed yet, this patch is only
concerned with all other fields (such as address filters, operation
parameters and the likes).

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14 22:15:26 -04:00
Phoebe Buckheister
46ef0eb3ea ieee802154: add address struct with proper endiannes and some operations
Add a replacement ieee802154_addr struct with proper endianness on
fields. Short address fields are stored as __le16 as on the network,
extended (EUI64) addresses are __le64 as opposed to the u8[8] format
used previously. This disconnect with the netdev address, which is
stored as big-endian u8[8], is intentional.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14 22:15:26 -04:00
Phoebe Buckheister
376b7bd355 ieee802154: rename struct ieee802154_addr to *_sa
The struct as currently defined uses host byte order for some fields,
and most big endian/EUI display byte order for other fields. Inside the
stack, endianness should ideally match network byte order where possible
to minimize the number of byteswaps done in critical paths, but this
patch does not address this; it is only preparatory.

Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14 22:15:25 -04:00
Steffen Klassert
573ce1c11b xfrm6: Remove xfrm_tunnel_notifier
This was used from vti and is replaced by the IPsec protocol
multiplexer hooks. It is now unused, so remove it.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-03-14 07:28:08 +01:00
Steffen Klassert
7e14ea1521 xfrm6: Add IPsec protocol multiplexer
This patch adds an IPsec protocol multiplexer for ipv6. With
this it is possible to add alternative protocol handlers, as
needed for IPsec virtual tunnel interfaces.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-03-14 07:28:07 +01:00
Steffen Klassert
2f32b51b60 xfrm: Introduce xfrm_input_afinfo to access the the callbacks properly
IPv6 can be build as a module, so we need mechanism to access
the address family dependent callback functions properly.
Therefore we introduce xfrm_input_afinfo, similar to that
what we have for the address family dependent part of
policies and states.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-03-14 07:28:07 +01:00
John W. Linville
42775a34d2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
Conflicts:
	drivers/net/wireless/ath/ath9k/recv.c
2014-03-13 14:21:43 -04:00
Steffen Klassert
4a93f5095a flowcache: Fix resource leaks on namespace exit.
We leak an active timer, the hotcpu notifier and all allocated
resources when we exit a namespace. Fix this by introducing a
flow_cache_fini() function where we release the resources before
we exit.

Fixes: ca925cf153 ("flowcache: Make flow cache name space aware")
Reported-by: Jakub Kicinski <moorray3@wp.pl>
Tested-by: Jakub Kicinski <moorray3@wp.pl>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:31:18 -04:00
Eric Dumazet
c3f9b01849 tcp: tcp_release_cb() should release socket ownership
Lars Persson reported following deadlock :

-000 |M:0x0:0x802B6AF8(asm) <-- arch_spin_lock
-001 |tcp_v4_rcv(skb = 0x8BD527A0) <-- sk = 0x8BE6B2A0
-002 |ip_local_deliver_finish(skb = 0x8BD527A0)
-003 |__netif_receive_skb_core(skb = 0x8BD527A0, ?)
-004 |netif_receive_skb(skb = 0x8BD527A0)
-005 |elk_poll(napi = 0x8C770500, budget = 64)
-006 |net_rx_action(?)
-007 |__do_softirq()
-008 |do_softirq()
-009 |local_bh_enable()
-010 |tcp_rcv_established(sk = 0x8BE6B2A0, skb = 0x87D3A9E0, th = 0x814EBE14, ?)
-011 |tcp_v4_do_rcv(sk = 0x8BE6B2A0, skb = 0x87D3A9E0)
-012 |tcp_delack_timer_handler(sk = 0x8BE6B2A0)
-013 |tcp_release_cb(sk = 0x8BE6B2A0)
-014 |release_sock(sk = 0x8BE6B2A0)
-015 |tcp_sendmsg(?, sk = 0x8BE6B2A0, ?, ?)
-016 |sock_sendmsg(sock = 0x8518C4C0, msg = 0x87D8DAA8, size = 4096)
-017 |kernel_sendmsg(?, ?, ?, ?, size = 4096)
-018 |smb_send_kvec()
-019 |smb_send_rqst(server = 0x87C4D400, rqst = 0x87D8DBA0)
-020 |cifs_call_async()
-021 |cifs_async_writev(wdata = 0x87FD6580)
-022 |cifs_writepages(mapping = 0x852096E4, wbc = 0x87D8DC88)
-023 |__writeback_single_inode(inode = 0x852095D0, wbc = 0x87D8DC88)
-024 |writeback_sb_inodes(sb = 0x87D6D800, wb = 0x87E4A9C0, work = 0x87D8DD88)
-025 |__writeback_inodes_wb(wb = 0x87E4A9C0, work = 0x87D8DD88)
-026 |wb_writeback(wb = 0x87E4A9C0, work = 0x87D8DD88)
-027 |wb_do_writeback(wb = 0x87E4A9C0, force_wait = 0)
-028 |bdi_writeback_workfn(work = 0x87E4A9CC)
-029 |process_one_work(worker = 0x8B045880, work = 0x87E4A9CC)
-030 |worker_thread(__worker = 0x8B045880)
-031 |kthread(_create = 0x87CADD90)
-032 |ret_from_kernel_thread(asm)

Bug occurs because __tcp_checksum_complete_user() enables BH, assuming
it is running from softirq context.

Lars trace involved a NIC without RX checksum support but other points
are problematic as well, like the prequeue stuff.

Problem is triggered by a timer, that found socket being owned by user.

tcp_release_cb() should call tcp_write_timer_handler() or
tcp_delack_timer_handler() in the appropriate context :

BH disabled and socket lock held, but 'owned' field cleared,
as if they were running from timer handlers.

Fixes: 6f458dfb40 ("tcp: improve latencies of timer triggered events")
Reported-by: Lars Persson <lars.persson@axis.com>
Tested-by: Lars Persson <lars.persson@axis.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-11 16:45:59 -04:00
Eric Dumazet
d32d9bb85c flowcache: restore a single flow_cache kmem_cache
It is not legal to create multiple kmem_cache having the same name.

flowcache can use a single kmem_cache, no need for a per netns
one.

Fixes: ca925cf153 ("flowcache: Make flow cache name space aware")
Reported-by: Jakub Kicinski <moorray3@wp.pl>
Tested-by: Jakub Kicinski <moorray3@wp.pl>
Tested-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 21:45:11 -04:00
Mark A. Greer
ceeee42d85 NFC: digital: Rename Type V tags to Type 5 tags
According to the latest draft specification from
the NFC-V committee, ISO/IEC 15693 tags will be
referred to as "Type 5" tags and not "Type V"
tags anymore.  Make the code reflect the new
terminology.

Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2014-03-11 00:40:59 +01:00
Gu Zheng
5812521be0 net: add a pre-check of net_ns in sk_change_net()
We do not need to switch the net_ns if the target net_ns the same
as the current one, so here we add a pre-check of net_ns to avoid
this as David suggested.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 16:29:48 -04:00
Marcel Holtmann
53ac6ab612 Bluetooth: Make LTK and CSRK only persisent when bonding
In case the pairable option has been disabled, the pairing procedure
does not create keys for bonding. This means that these generated keys
should not be stored persistently.

For LTK and CSRK this is important to tell userspace to not store these
new keys. They will be available for the lifetime of the device, but
after the next power cycle they should not be used anymore.

Inform userspace to actually store the keys persistently only if both
sides request bonding.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-03-10 14:57:33 +02:00
Marcel Holtmann
7ee4ea3692 Bluetooth: Add support for handling signature resolving keys
The connection signature resolving key (CSRK) is used for attribute
protocol signed write procedures. This change generates a new local
key during pairing and requests the peer key as well.

Newly generated key and received key will be provided to userspace
using the New Signature Resolving Key management event.

The Master CSRK can be used for verification of remote signed write
PDUs and the Slave CSRK can be used for sending signed write PDUs
to the remote device.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-03-09 21:39:50 +02:00
Patrick McHardy
62472bcefb netfilter: nf_tables: restore context for expression destructors
In order to fix set destruction notifications and get rid of unnecessary
members in private data structures, pass the context to expressions'
destructor functions again.

In order to do so, replace various members in the nft_rule_trans structure
by the full context.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-08 12:35:17 +01:00
Jesper Dangaard Brouer
93bb0ceb75 netfilter: conntrack: remove central spinlock nf_conntrack_lock
nf_conntrack_lock is a monolithic lock and suffers from huge contention
on current generation servers (8 or more core/threads).

Perf locking congestion is clear on base kernel:

-  72.56%  ksoftirqd/6  [kernel.kallsyms]    [k] _raw_spin_lock_bh
   - _raw_spin_lock_bh
      + 25.33% init_conntrack
      + 24.86% nf_ct_delete_from_lists
      + 24.62% __nf_conntrack_confirm
      + 24.38% destroy_conntrack
      + 0.70% tcp_packet
+   2.21%  ksoftirqd/6  [kernel.kallsyms]    [k] fib_table_lookup
+   1.15%  ksoftirqd/6  [kernel.kallsyms]    [k] __slab_free
+   0.77%  ksoftirqd/6  [kernel.kallsyms]    [k] inet_getpeer
+   0.70%  ksoftirqd/6  [nf_conntrack]       [k] nf_ct_delete
+   0.55%  ksoftirqd/6  [ip_tables]          [k] ipt_do_table

This patch change conntrack locking and provides a huge performance
improvement.  SYN-flood attack tested on a 24-core E5-2695v2(ES) with
10Gbit/s ixgbe (with tool trafgen):

 Base kernel:   810.405 new conntrack/sec
 After patch: 2.233.876 new conntrack/sec

Notice other floods attack (SYN+ACK or ACK) can easily be deflected using:
 # iptables -A INPUT -m state --state INVALID -j DROP
 # sysctl -w net/netfilter/nf_conntrack_tcp_loose=0

Use an array of hashed spinlocks to protect insertions/deletions of
conntracks into the hash table. 1024 spinlocks seem to give good
results, at minimal cost (4KB memory). Due to lockdep max depth,
1024 becomes 8 if CONFIG_LOCKDEP=y

The hash resize is a bit tricky, because we need to take all locks in
the array. A seqcount_t is used to synchronize the hash table users
with the resizing process.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-07 11:41:13 +01:00
Jesper Dangaard Brouer
ca7433df3a netfilter: conntrack: seperate expect locking from nf_conntrack_lock
Netfilter expectations are protected with the same lock as conntrack
entries (nf_conntrack_lock).  This patch split out expectations locking
to use it's own lock (nf_conntrack_expect_lock).

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-07 11:41:01 +01:00
Jesper Dangaard Brouer
b7779d06f9 netfilter: conntrack: spinlock per cpu to protect special lists.
One spinlock per cpu to protect dying/unconfirmed/template special lists.
(These lists are now per cpu, a bit like the untracked ct)
Add a @cpu field to nf_conn, to make sure we hold the appropriate
spinlock at removal time.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-07 11:40:38 +01:00
Jesper Dangaard Brouer
b476b72a0f netfilter: trivial code cleanup and doc changes
Changes while reading through the netfilter code.

Added hint about how conntrack nf_conn refcnt is accessed.
And renamed repl_hash to reply_hash for readability

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-07 11:40:04 +01:00
Nicolas Dichtel
870a2df4ca xfrm: rename struct xfrm_filter
iproute2 already defines a structure with that name, let's use another one to
avoid any conflict.

CC: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2014-03-07 08:12:37 +01:00
Alexander Aring
cefc8c8a7c 6lowpan: move 6lowpan header to include/net
This header is used by bluetooth and ieee802154 branch. This patch
move this header to the include/net directory to avoid a use of a
relative path in include.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 17:21:38 -05:00
Andrew Lutomirski
adca476782 net: Improve SO_TIMESTAMPING documentation and fix a minor code bug
The original documentation was very unclear.

The code fix is presumably related to the formerly unclear
documentation: SOCK_TIMESTAMPING_RX_SOFTWARE has no effect on
__sock_recv_timestamp's behavior, so calling __sock_recv_ts_and_drops
from sock_recv_ts_and_drops if only SOCK_TIMESTAMPING_RX_SOFTWARE is
set is pointless.  This should have no user-observable effect.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 16:18:01 -05:00
David S. Miller
f7324acd98 tcp: Use NET_ADD_STATS instead of NET_ADD_STATS_BH in tcp_event_new_data_sent()
Can be invoked from non-BH context.

Based upon a patch by Eric Dumazet.

Fixes: f19c29e3e3 ("tcp: snmp stats for Fast Open, SYN rtx, and data pkts")
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 15:19:43 -05:00
Hannes Frederic Sowa
e90c14835b inet: remove now unused flag DST_NOPEER
Commit e688a60480 ("net: introduce DST_NOPEER dst flag") introduced
DST_NOPEER because because of crashes in ipv6_select_ident called from
udp6_ufo_fragment.

Since commit 916e4cf46d ("ipv6: reuse ip6_frag_id from
ip6_ufo_append_data") we don't call ipv6_select_ident any more from
ip6_ufo_append_data, thus this flag lost its purpose and can be removed.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 13:15:52 -05:00
David S. Miller
67ddc87f16 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/wireless/ath/ath9k/recv.c
	drivers/net/wireless/mwifiex/pcie.c
	net/ipv6/sit.c

The SIT driver conflict consists of a bug fix being done by hand
in 'net' (missing u64_stats_init()) whilst in 'net-next' a helper
was created (netdev_alloc_pcpu_stats()) which takes care of this.

The two wireless conflicts were overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05 20:32:02 -05:00
John W. Linville
9ffd2e9a17 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2014-03-03 14:59:12 -05:00
Alexander Aring
7240cdec60 6lowpan: handling 6lowpan fragmentation via inet_frag api
This patch drops the current way of 6lowpan fragmentation on receiving
side and replace it with a implementation which use the inet_frag api.
The old fragmentation handling has some race conditions and isn't
rfc4944 compatible. Also adding support to match fragments on
destination address, source address, tag value and datagram_size
which is missing in the current implementation.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-28 17:05:22 -05:00
Alexander Aring
633fc86ff6 net: ns: add ieee802154_6lowpan namespace
This patch adds necessary ieee802154 6lowpan namespace to provide the
inet_frag information. This is a initial support for handling 6lowpan
fragmentation with the inet_frag api.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-28 17:05:22 -05:00