Commit Graph

57946 Commits

Author SHA1 Message Date
David S. Miller
69fb78121b Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2016-04-12

Here's a set of Bluetooth & 802.15.4 patches intended for the 4.7 kernel:

 - Fix for race condition in vhci driver
 - Memory leak fix for ieee802154/adf7242 driver
 - Improvements to deal with single-mode (LE-only) Bluetooth controllers
 - Fix for allowing the BT_SECURITY_FIPS security level
 - New BCM2E71 ACPI ID
 - NULL pointer dereference fix fox hci_ldisc driver

Let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-12 11:57:53 -04:00
Johannes Berg
57fbcce37b cfg80211: remove enum ieee80211_band
This enum is already perfectly aliased to enum nl80211_band, and
the only reason for it is that we get IEEE80211_NUM_BANDS out of
it. There's no really good reason to not declare the number of
bands in nl80211 though, so do that and remove the cfg80211 one.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-04-12 15:56:15 +02:00
Dmitry Ivanov
8f815cdde3 nl80211: check netlink protocol in socket release notification
A non-privileged user can create a netlink socket with the same port_id as
used by an existing open nl80211 netlink socket (e.g. as used by a hostapd
process) with a different protocol number.

Closing this socket will then lead to the notification going to nl80211's
socket release notification handler, and possibly cause an action such as
removing a virtual interface.

Fix this issue by checking that the netlink protocol is NETLINK_GENERIC.
Since generic netlink has no notifier chain of its own, we can't fix the
problem more generically.

Fixes: 026331c4d9 ("cfg80211/mac80211: allow registering for and sending action frames")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Ivanov <dima@ubnt.com>
[rewrite commit message]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-04-12 15:39:06 +02:00
David Howells
5ac7eace2d KEYS: Add a facility to restrict new links into a keyring
Add a facility whereby proposed new links to be added to a keyring can be
vetted, permitting them to be rejected if necessary.  This can be used to
block public keys from which the signature cannot be verified or for which
the signature verification fails.  It could also be used to provide
blacklisting.

This affects operations like add_key(), KEYCTL_LINK and KEYCTL_INSTANTIATE.

To this end:

 (1) A function pointer is added to the key struct that, if set, points to
     the vetting function.  This is called as:

	int (*restrict_link)(struct key *keyring,
			     const struct key_type *key_type,
			     unsigned long key_flags,
			     const union key_payload *key_payload),

     where 'keyring' will be the keyring being added to, key_type and
     key_payload will describe the key being added and key_flags[*] can be
     AND'ed with KEY_FLAG_TRUSTED.

     [*] This parameter will be removed in a later patch when
     	 KEY_FLAG_TRUSTED is removed.

     The function should return 0 to allow the link to take place or an
     error (typically -ENOKEY, -ENOPKG or -EKEYREJECTED) to reject the
     link.

     The pointer should not be set directly, but rather should be set
     through keyring_alloc().

     Note that if called during add_key(), preparse is called before this
     method, but a key isn't actually allocated until after this function
     is called.

 (2) KEY_ALLOC_BYPASS_RESTRICTION is added.  This can be passed to
     key_create_or_update() or key_instantiate_and_link() to bypass the
     restriction check.

 (3) KEY_FLAG_TRUSTED_ONLY is removed.  The entire contents of a keyring
     with this restriction emplaced can be considered 'trustworthy' by
     virtue of being in the keyring when that keyring is consulted.

 (4) key_alloc() and keyring_alloc() take an extra argument that will be
     used to set restrict_link in the new key.  This ensures that the
     pointer is set before the key is published, thus preventing a window
     of unrestrictedness.  Normally this argument will be NULL.

 (5) As a temporary affair, keyring_restrict_trusted_only() is added.  It
     should be passed to keyring_alloc() as the extra argument instead of
     setting KEY_FLAG_TRUSTED_ONLY on a keyring.  This will be replaced in
     a later patch with functions that look in the appropriate places for
     authoritative keys.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2016-04-11 22:37:37 +01:00
David Ahern
4f7f34eaab net: vrf: Fix dev refcnt leak due to IPv6 prefix route
ifupdown2 found a kernel bug with IPv6 routes and movement from the main
table to the VRF table. Sequence of events:

Create the interface and add addresses:
    ip link add dev eth4.105 link eth4 type vlan id 105
    ip addr add dev eth4.105 8.105.105.10/24
    ip -6 addr add dev eth4.105 2008:105:105::10/64

At this point IPv6 has inserted a prefix route in the main table even
though the interface is 'down'. From there the VRF device is created:
    ip link add dev vrf105 type vrf table 105
    ip addr add dev vrf105 9.9.105.10/32
    ip -6 addr add dev vrf105 2000:9:105::10/128
    ip link set vrf105 up

Then the interface is enslaved, while still in the 'down' state:
    ip link set dev eth4.105 master vrf105

Since the device is down the VRF driver cycling the device does not
send the NETDEV_UP and NETDEV_DOWN but rather the NETDEV_CHANGE event
which does not flush the routes inserted prior.

When the link is brought up
    ip link set dev eth4.105 up

the prefix route is added in the VRF table, but does not remove
the route from the main table.

Fix by handling the NETDEV_CHANGEUPPER event similar what was implemented
for IPv4 in 7f49e7a38b ("net: Flush local routes when device changes vrf
association")

Fixes: 35402e3136 ("net: Add IPv6 support to VRF device")

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11 15:56:20 -04:00
David Ahern
9ab179d83b net: vrf: Fix dst reference counting
Vivek reported a kernel exception deleting a VRF with an active
connection through it. The root cause is that the socket has a cached
reference to a dst that is destroyed. Converting the dst_destroy to
dst_release and letting proper reference counting kick in does not
work as the dst has a reference to the device which needs to be released
as well.

I talked to Hannes about this at netdev and he pointed out the ipv4 and
ipv6 dst handling has dst_ifdown for just this scenario. Rather than
continuing with the reinvented dst wheel in VRF just remove it and
leverage the ipv4 and ipv6 versions.

Fixes: 193125dbd8 ("net: Introduce VRF device driver")
Fixes: 35402e3136 ("net: Add IPv6 support to VRF device")

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11 15:56:20 -04:00
David Howells
e0e4d82f3b rxrpc: Create a null security type and get rid of conditional calls
Create a null security type for security index 0 and get rid of all
conditional calls to the security operations.  We expect normally to be
using security, so this should be of little negative impact.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11 15:34:41 -04:00
David Howells
648af7fca1 rxrpc: Absorb the rxkad security module
Absorb the rxkad security module into the af_rxrpc module so that there's
only one module file.  This avoids a circular dependency whereby rxkad pins
af_rxrpc and cached connections pin rxkad but can't be manually evicted
(they will expire eventually and cease pinning).

With this change, af_rxrpc can just be unloaded, despite having cached
connections.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11 15:34:41 -04:00
David Howells
6dd050f88d rxrpc: Don't assume transport address family and size when using it
Don't assume transport address family and size when using the peer address
to send a packet.  Instead, use the start of the transport address rather
than any particular element of the union and use the transport address
length noted inside the sockaddr_rxrpc struct.

This will be necessary when IPv6 support is introduced.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11 15:34:41 -04:00
David Howells
843099cac0 rxrpc: Don't pass gfp around in incoming call handling functions
Don't pass gfp around in incoming call handling functions, but rather hard
code it at the points where we actually need it since the value comes from
within the rxrpc driver and is always the same.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11 15:34:41 -04:00
David Howells
dc44b3a09a rxrpc: Differentiate local and remote abort codes in structs
In the rxrpc_connection and rxrpc_call structs, there's one field to hold
the abort code, no matter whether that value was generated locally to be
sent or was received from the peer via an abort packet.

Split the abort code fields in two for cleanliness sake and add an error
field to hold the Linux error number to the rxrpc_call struct too
(sometimes this is generated in a context where we can't return it to
userspace directly).

Furthermore, add a skb mark to indicate a packet that caused a local abort
to be generated so that recvmsg() can pick up the correct abort code.  A
future addition will need to be to indicate to userspace the difference
between aborts via a control message.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11 15:34:40 -04:00
David Howells
5b3e87f19e rxrpc: Static arrays of strings should be const char *const[]
Static arrays of strings should be const char *const[].

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11 15:34:40 -04:00
David Howells
8e688d9c16 rxrpc: Move some miscellaneous bits out into their own file
Move some miscellaneous bits out into their own file to make it easier to
split the call handling.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11 15:34:40 -04:00
David Howells
8f7e6e75d3 rxrpc: Disable a debugging statement that has been left enabled.
Disable a debugging statement that has been left enabled

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11 15:34:40 -04:00
Willem de Bruijn
4d0fc73ebe rxrpc: do not pull udp headers on receive
Commit e6afc8ace6 modified the udp receive path by pulling the udp
header before queuing an skbuff onto the receive queue.

Rxrpc also calls skb_recv_datagram to dequeue an skb from a udp
socket. Modify this receive path to also no longer expect udp
headers.

Fixes: e6afc8ace6 ("udp: remove headers from UDP packets before queueing")

Signed-off-by: Willem de Bruijn <willemb@google.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11 15:31:33 -04:00
Willem de Bruijn
1da8c681d5 sunrpc: do not pull udp headers on receive
Commit e6afc8ace6 modified the udp receive path by pulling the udp
header before queuing an skbuff onto the receive queue.

Sunrpc also calls skb_recv_datagram to dequeue an skb from a udp
socket. Modify this receive path to also no longer expect udp
headers.

Fixes: e6afc8ace6 ("udp: remove headers from UDP packets before queueing")

Reported-by: Franklin S Cooper Jr. <fcooper@ti.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11 15:31:33 -04:00
Erik Hugne
ddb1d33969 tipc: purge deferred updates from dead nodes
If a peer node becomes unavailable, in addition to removing the
nametable entries from this node we also need to purge all deferred
updates associated with this node.

Signed-off-by: Erik Hugne <erik.hugne@gmail.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11 15:22:20 -04:00
Erik Hugne
541726abe7 tipc: make dist queue pernet
Nametable updates received from the network that cannot be applied
immediately are placed on a defer queue. This queue is global to the
TIPC module, which might cause problems when using TIPC in containers.
To prevent nametable updates from escaping into the wrong namespace,
we make the queue pernet instead.

Signed-off-by: Erik Hugne <erik.hugne@gmail.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11 15:22:20 -04:00
David Ahern
a6db4494d2 net: ipv4: Consider failed nexthops in multipath routes
Multipath route lookups should consider knowledge about next hops and not
select a hop that is known to be failed.

Example:

                     [h2]                   [h3]   15.0.0.5
                      |                      |
                     3|                     3|
                    [SP1]                  [SP2]--+
                     1  2                   1     2
                     |  |     /-------------+     |
                     |   \   /                    |
                     |     X                      |
                     |    / \                     |
                     |   /   \---------------\    |
                     1  2                     1   2
         12.0.0.2  [TOR1] 3-----------------3 [TOR2] 12.0.0.3
                     4                         4
                      \                       /
                        \                    /
                         \                  /
                          -------|   |-----/
                                 1   2
                                [TOR3]
                                  3|
                                   |
                                  [h1]  12.0.0.1

host h1 with IP 12.0.0.1 has 2 paths to host h3 at 15.0.0.5:

    root@h1:~# ip ro ls
    ...
    12.0.0.0/24 dev swp1  proto kernel  scope link  src 12.0.0.1
    15.0.0.0/16
            nexthop via 12.0.0.2  dev swp1 weight 1
            nexthop via 12.0.0.3  dev swp1 weight 1
    ...

If the link between tor3 and tor1 is down and the link between tor1
and tor2 then tor1 is effectively cut-off from h1. Yet the route lookups
in h1 are alternating between the 2 routes: ping 15.0.0.5 gets one and
ssh 15.0.0.5 gets the other. Connections that attempt to use the
12.0.0.2 nexthop fail since that neighbor is not reachable:

    root@h1:~# ip neigh show
    ...
    12.0.0.3 dev swp1 lladdr 00:02:00:00:00:1b REACHABLE
    12.0.0.2 dev swp1  FAILED
    ...

The failed path can be avoided by considering known neighbor information
when selecting next hops. If the neighbor lookup fails we have no
knowledge about the nexthop, so give it a shot. If there is an entry
then only select the nexthop if the state is sane. This is similar to
what fib_detect_death does.

To maintain backward compatibility use of the neighbor information is
based on a new sysctl, fib_multipath_use_neigh.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11 15:16:13 -04:00
Al Viro
ce23e64013 ->getxattr(): pass dentry and inode as separate arguments
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-04-11 00:48:00 -04:00
Dmitry Ivanov
e272602039 netlink: don't send NETLINK_URELEASE for unbound sockets
All existing users of NETLINK_URELEASE use it to clean up resources that
were previously allocated to a socket via some command. As a result, no
users require getting this notification for unbound sockets.

Sending it for unbound sockets, however, is a problem because any user
(including unprivileged users) can create a socket that uses the same ID
as an existing socket. Binding this new socket will fail, but if the
NETLINK_URELEASE notification is generated for such sockets, the users
thereof will be tricked into thinking the socket that they allocated the
resources for is closed.

In the nl80211 case, this will cause destruction of virtual interfaces
that still belong to an existing hostapd process; this is the case that
Dmitry noticed. In the NFC case, it will cause a poll abort. In the case
of netlink log/queue it will cause them to stop reporting events, as if
NFULNL_CFG_CMD_UNBIND/NFQNL_CFG_CMD_UNBIND had been called.

Fix this problem by checking that the socket is bound before generating
the NETLINK_URELEASE notification.

Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Ivanov <dima@ubnt.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-10 23:32:23 -04:00
David S. Miller
a36a0d4008 decnet: Do not build routes to devices without decnet private data.
In particular, make sure we check for decnet private presence
for loopback devices.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-10 23:01:30 -04:00
Marcelo Ricardo Leitner
ba6f5e33bd sctp: avoid refreshing heartbeat timer too often
Currently on high rate SCTP streams the heartbeat timer refresh can
consume quite a lot of resources as timer updates are costly and it
contains a random factor, which a) is also costly and b) invalidates
mod_timer() optimization for not editing a timer to the same value.
It may even cause the timer to be slightly advanced, for no good reason.

As suggested by David Laight this patch now removes this timer update
from hot path by leaving the timer on and re-evaluating upon its
expiration if the heartbeat is still needed or not, similarly to what is
done for TCP. If it's not needed anymore the timer is re-scheduled to
the new timeout, considering the time already elapsed.

For this, we now record the last tx timestamp per transport, updated in
the same spots as hb timer was restarted on tx. Also split up
sctp_transport_reset_timers into sctp_transport_reset_t3_rtx and
sctp_transport_reset_hb_timer, so we can re-arm T3 without re-arming the
heartbeat one.

On loopback with MTU of 65535 and data chunks with 1636, so that we
have a considerable amount of chunks without stressing system calls,
netperf -t SCTP_STREAM -l 30, perf looked like this before:

Samples: 103K of event 'cpu-clock', Event count (approx.): 25833000000
  Overhead  Command  Shared Object      Symbol
+    6,15%  netperf  [kernel.vmlinux]   [k] copy_user_enhanced_fast_string
-    5,43%  netperf  [kernel.vmlinux]   [k] _raw_write_unlock_irqrestore
   - _raw_write_unlock_irqrestore
      - 96,54% _raw_spin_unlock_irqrestore
         - 36,14% mod_timer
            + 97,24% sctp_transport_reset_timers
            + 2,76% sctp_do_sm
         + 33,65% __wake_up_sync_key
         + 28,77% sctp_ulpq_tail_event
         + 1,40% del_timer
      - 1,84% mod_timer
         + 99,03% sctp_transport_reset_timers
         + 0,97% sctp_do_sm
      + 1,50% sctp_ulpq_tail_event

And after this patch, now with netperf -l 60:

Samples: 230K of event 'cpu-clock', Event count (approx.): 57707250000
  Overhead  Command  Shared Object      Symbol
+    5,65%  netperf  [kernel.vmlinux]   [k] memcpy_erms
+    5,59%  netperf  [kernel.vmlinux]   [k] copy_user_enhanced_fast_string
-    5,05%  netperf  [kernel.vmlinux]   [k] _raw_spin_unlock_irqrestore
   - _raw_spin_unlock_irqrestore
      + 49,89% __wake_up_sync_key
      + 45,68% sctp_ulpq_tail_event
      - 2,85% mod_timer
         + 76,51% sctp_transport_reset_t3_rtx
         + 23,49% sctp_do_sm
      + 1,55% del_timer
+    2,50%  netperf  [sctp]             [k] sctp_datamsg_from_user
+    2,26%  netperf  [sctp]             [k] sctp_sendmsg

Throughput-wise, from 6800mbps without the patch to 7050mbps with it,
~3.7%.

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-10 22:22:34 -04:00
David S. Miller
ae95d71261 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-04-09 17:41:41 -04:00
Eric Dumazet
03c5b53418 ipv6: fix inet6_lookup_listener()
A stupid refactoring bug in inet6_lookup_listener() needs to be fixed
in order to get proper SO_REUSEPORT behavior.

Fixes: 3b24d854cb ("tcp/dccp: do not touch listener sk_refcnt under synflood")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-09 16:53:52 -04:00
Linus Torvalds
9ef11ceb0d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Stale SKB data pointer access across pskb_may_pull() calls in L2TP,
    from Haishuang Yan.

 2) Fix multicast frame handling in mac80211 AP code, from Felix
    Fietkau.

 3) mac80211 station hashtable insert errors not handled properly, fix
    from Johannes Berg.

 4) Fix TX descriptor count limit handling in e1000, from Alexander
    Duyck.

 5) Revert a buggy netdev refcount fix in netpoll, from Bjorn Helgaas.

 6) Must assign rtnl_link_ops of the device before registering it, fix
    in ip6_tunnel from Thadeu Lima de Souza Cascardo.

 7) Memory leak fix in tc action net exit, from WANG Cong.

 8) Add missing AF_KCM entries to name tables, from Dexuan Cui.

 9) Fix regression in GRE handling of csums wrt.  FOU, from Alexander
    Duyck.

10) Fix memory allocation alignment and congestion map corruption in
    RDS, from Shamir Rabinovitch.

11) Fix default qdisc regression in tuntap driver, from Jason Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
  bridge, netem: mark mailing lists as moderated
  tuntap: restore default qdisc
  mpls: find_outdev: check for err ptr in addition to NULL check
  ipv6: Count in extension headers in skb->network_header
  RDS: fix congestion map corruption for PAGE_SIZE > 4k
  RDS: memory allocated must be align to 8
  GRE: Disable segmentation offloads w/ CSUM and we are encapsulated via FOU
  net: add the AF_KCM entries to family name tables
  MAINTAINERS: intel-wired-lan list is moderated
  lib/test_bpf: Add additional BPF_ADD tests
  lib/test_bpf: Add test to check for result of 32-bit add that overflows
  lib/test_bpf: Add tests for unsigned BPF_JGT
  lib/test_bpf: Fix JMP_JSET tests
  VSOCK: Detach QP check should filter out non matching QPs.
  stmmac: fix adjust link call in case of a switch is attached
  af_packet: tone down the Tx-ring unsupported spew.
  net_sched: fix a memory leak in tc action
  samples/bpf: Enable powerpc support
  samples/bpf: Use llc in PATH, rather than a hardcoded value
  samples/bpf: Fix build breakage with map_perf_test_user.c
  ...
2016-04-09 10:50:44 -07:00
Vivien Didelot
4d5770b397 net: dsa: make the VLAN add function return void
The switchdev design implies that a software error should not happen in
the commit phase since it must have been previously reported in the
prepare phase. If an hardware error occurs during the commit phase,
there is nothing switchdev can do about it.

The DSA layer separates port_vlan_prepare and port_vlan_add for
simplicity and convenience. If an hardware error occurs during the
commit phase, there is no need to report it outside the driver itself.

Make the DSA port_vlan_add routine return void for explicitness.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08 16:50:41 -04:00
Vivien Didelot
8497aa618d net: dsa: make the FDB add function return void
The switchdev design implies that a software error should not happen in
the commit phase since it must have been previously reported in the
prepare phase. If an hardware error occurs during the commit phase,
there is nothing switchdev can do about it.

The DSA layer separates port_fdb_prepare and port_fdb_add for simplicity
and convenience. If an hardware error occurs during the commit phase,
there is no need to report it outside the DSA driver itself.

Make the DSA port_fdb_add routine return void for explicitness.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08 16:50:40 -04:00
Vivien Didelot
43c44a9f65 net: dsa: make the STP state function return void
The DSA layer doesn't care about the return code of the port_stp_update
routine, so make it void in the layer and the DSA drivers.

Replace the useless dsa_slave_stp_update function with a
dsa_slave_stp_state function used to reply to the switchdev
SWITCHDEV_ATTR_ID_PORT_STP_STATE attribute.

In the meantime, rename port_stp_update to port_stp_state_set to
explicit the state change.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08 16:50:40 -04:00
David S. Miller
1089ac6977 Merge tag 'mac80211-next-for-davem-2016-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:

====================
For the 4.7 cycle, we have a number of changes:
 * Bob's mesh mode rhashtable conversion, this includes
   the rhashtable API change for allocation flags
 * BSSID scan, connect() command reassoc support (Jouni)
 * fast (optimised data only) and support for RSS in mac80211 (myself)
 * various smaller changes
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08 16:42:31 -04:00
David S. Miller
30d237a6c2 Merge tag 'mac80211-for-davem-2016-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:

====================
For the current RC series, we have the following fixes:
 * TDLS fixes from Arik and Ilan
 * rhashtable fixes from Ben and myself
 * documentation fixes from Luis
 * U-APSD fixes from Emmanuel
 * a TXQ fix from Felix
 * and a compiler warning suppression from Jeff
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08 16:41:28 -04:00
Jiri Pirko
1fc2257e83 devlink: share user_ptr pointer for both devlink and devlink_port
Ptr to devlink structure can be easily obtained from
devlink_port->devlink. So share user_ptr[0] pointer for both and leave
user_ptr[1] free for other users.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08 15:40:08 -04:00
Jiri Pirko
a9844881ba devlink: remove implicit type set in port register
As we rely on caller zeroing or correctly set the struct before the call,
this implicit type set is either no-op (DEVLINK_PORT_TYPE_NOTSET is 0)
or it rewrites wanted value. So remove this.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08 15:38:42 -04:00
Alexander Aring
feb2add323 6lowpan: iphc: fix handling of link-local compression
This patch fixes handling in case of link-local address compression. A
IPv6 link-local address is defined as fe80::/10 prefix which is also
what ipv6_addr_type checks for link-local addresses.

But IPHC compression for link-local addresses are for fe80::/64 types
only. This patch adds additional checks for zero padded bits in case of
link-local address compression to match on a fe80::/64 address only.

Signed-off-by: Alexander Aring <aar@pengutronix.de>
Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-04-08 19:28:13 +02:00
Patrik Flykt
a164cee111 Bluetooth: Allow setting BT_SECURITY_FIPS with setsockopt
Update the security level check to allow setting BT_SECURITY_FIPS for
an L2CAP socket.

Signed-off-by: Patrik Flykt <patrik.flykt@linux.intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-04-08 19:10:57 +02:00
Johan Hedberg
56b40fbf61 Bluetooth: Ignore unknown advertising packet types
In case of buggy controllers send advertising packet types that we
don't know of we should simply ignore them instead of trying to react
to them in some (potentially wrong) way.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-04-08 18:51:44 +02:00
Johan Hedberg
f18ba58f53 Bluetooth: Fix setting NO_BREDR advertising flag
If we're dealing with a single-mode controller or BR/EDR is disable
for a dual-mode one, the NO_BREDR flag needs to be unconditionally
present in the advertising data. This patch moves it out from behind
an extra condition to be always set in the create_instance_adv_data()
function if BR/EDR is disabled.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-04-08 18:50:40 +02:00
Roopa Prabhu
94a57f1f8a mpls: find_outdev: check for err ptr in addition to NULL check
find_outdev calls inet{,6}_fib_lookup_dev() or dev_get_by_index() to
find the output device. In case of an error, inet{,6}_fib_lookup_dev()
returns error pointer and dev_get_by_index() returns NULL. But the function
only checks for NULL and thus can end up calling dev_put on an ERR_PTR.
This patch adds an additional check for err ptr after the NULL check.

Before: Trying to add an mpls route with no oif from user, no available
path to 10.1.1.8 and no default route:
$ip -f mpls route add 100 as 200 via inet 10.1.1.8
[  822.337195] BUG: unable to handle kernel NULL pointer dereference at
00000000000003a3
[  822.340033] IP: [<ffffffff8148781e>] mpls_nh_assign_dev+0x10b/0x182
[  822.340033] PGD 1db38067 PUD 1de9e067 PMD 0
[  822.340033] Oops: 0000 [#1] SMP
[  822.340033] Modules linked in:
[  822.340033] CPU: 0 PID: 11148 Comm: ip Not tainted 4.5.0-rc7+ #54
[  822.340033] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org
04/01/2014
[  822.340033] task: ffff88001db82580 ti: ffff88001dad4000 task.ti:
ffff88001dad4000
[  822.340033] RIP: 0010:[<ffffffff8148781e>]  [<ffffffff8148781e>]
mpls_nh_assign_dev+0x10b/0x182
[  822.340033] RSP: 0018:ffff88001dad7a88  EFLAGS: 00010282
[  822.340033] RAX: ffffffffffffff9b RBX: ffffffffffffff9b RCX:
0000000000000002
[  822.340033] RDX: 00000000ffffff9b RSI: 0000000000000008 RDI:
0000000000000000
[  822.340033] RBP: ffff88001ddc9ea0 R08: ffff88001e9f1768 R09:
0000000000000000
[  822.340033] R10: ffff88001d9c1100 R11: ffff88001e3c89f0 R12:
ffffffff8187e0c0
[  822.340033] R13: ffffffff8187e0c0 R14: ffff88001ddc9e80 R15:
0000000000000004
[  822.340033] FS:  00007ff9ed798700(0000) GS:ffff88001fc00000(0000)
knlGS:0000000000000000
[  822.340033] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  822.340033] CR2: 00000000000003a3 CR3: 000000001de89000 CR4:
00000000000006f0
[  822.340033] Stack:
[  822.340033]  0000000000000000 0000000100000000 0000000000000000
0000000000000000
[  822.340033]  0000000000000000 0801010a00000000 0000000000000000
0000000000000000
[  822.340033]  0000000000000004 ffffffff8148749b ffffffff8187e0c0
000000000000001c
[  822.340033] Call Trace:
[  822.340033]  [<ffffffff8148749b>] ? mpls_rt_alloc+0x2b/0x3e
[  822.340033]  [<ffffffff81488e66>] ? mpls_rtm_newroute+0x358/0x3e2
[  822.340033]  [<ffffffff810e7bbc>] ? get_page+0x5/0xa
[  822.340033]  [<ffffffff813b7d94>] ? rtnetlink_rcv_msg+0x17e/0x191
[  822.340033]  [<ffffffff8111794e>] ? __kmalloc_track_caller+0x8c/0x9e
[  822.340033]  [<ffffffff813c9393>] ?
rht_key_hashfn.isra.20.constprop.57+0x14/0x1f
[  822.340033]  [<ffffffff813b7c16>] ? __rtnl_unlock+0xc/0xc
[  822.340033]  [<ffffffff813cb794>] ? netlink_rcv_skb+0x36/0x82
[  822.340033]  [<ffffffff813b4507>] ? rtnetlink_rcv+0x1f/0x28
[  822.340033]  [<ffffffff813cb2b1>] ? netlink_unicast+0x106/0x189
[  822.340033]  [<ffffffff813cb5b3>] ? netlink_sendmsg+0x27f/0x2c8
[  822.340033]  [<ffffffff81392ede>] ? sock_sendmsg_nosec+0x10/0x1b
[  822.340033]  [<ffffffff81393df1>] ? ___sys_sendmsg+0x182/0x1e3
[  822.340033]  [<ffffffff810e4f35>] ?
__alloc_pages_nodemask+0x11c/0x1e4
[  822.340033]  [<ffffffff8110619c>] ? PageAnon+0x5/0xd
[  822.340033]  [<ffffffff811062fe>] ? __page_set_anon_rmap+0x45/0x52
[  822.340033]  [<ffffffff810e7bbc>] ? get_page+0x5/0xa
[  822.340033]  [<ffffffff810e85ab>] ? __lru_cache_add+0x1a/0x3a
[  822.340033]  [<ffffffff81087ea9>] ? current_kernel_time64+0x9/0x30
[  822.340033]  [<ffffffff813940c4>] ? __sys_sendmsg+0x3c/0x5a
[  822.340033]  [<ffffffff8148f597>] ?
entry_SYSCALL_64_fastpath+0x12/0x6a
[  822.340033] Code: 83 08 04 00 00 65 ff 00 48 8b 3c 24 e8 40 7c f2 ff
eb 13 48 c7 c3 9f ff ff ff eb 0f 89 ce e8 f1 ae f1 ff 48 89 c3 48 85 db
74 15 <48> 8b 83 08 04 00 00 65 ff 08 48 81 fb 00 f0 ff ff 76 0d eb 07
[  822.340033] RIP  [<ffffffff8148781e>] mpls_nh_assign_dev+0x10b/0x182
[  822.340033]  RSP <ffff88001dad7a88>
[  822.340033] CR2: 00000000000003a3
[  822.435363] ---[ end trace 98cc65e6f6b8bf11 ]---

After patch:
$ip -f mpls route add 100 as 200 via inet 10.1.1.8
RTNETLINK answers: Network is unreachable

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08 12:43:20 -04:00
Jakub Sitnicki
3ba3458fb9 ipv6: Count in extension headers in skb->network_header
When sending a UDPv6 message longer than MTU, account for the length
of fragmentable IPv6 extension headers in skb->network_header offset.
Same as we do in alloc_new_skb path in __ip6_append_data().

This ensures that later on __ip6_make_skb() will make space in
headroom for fragmentable extension headers:

	/* move skb->data to ip header from ext header */
	if (skb->data < skb_network_header(skb))
		__skb_pull(skb, skb_network_offset(skb));

Prevents a splat due to skb_under_panic:

skbuff: skb_under_panic: text:ffffffff8143397b len:2126 put:14 \
head:ffff880005bacf50 data:ffff880005bacf4a tail:0x48 end:0xc0 dev:lo
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:104!
invalid opcode: 0000 [#1] KASAN
CPU: 0 PID: 160 Comm: reproducer Not tainted 4.6.0-rc2 #65
[...]
Call Trace:
 [<ffffffff813eb7b9>] skb_push+0x79/0x80
 [<ffffffff8143397b>] eth_header+0x2b/0x100
 [<ffffffff8141e0d0>] neigh_resolve_output+0x210/0x310
 [<ffffffff814eab77>] ip6_finish_output2+0x4a7/0x7c0
 [<ffffffff814efe3a>] ip6_output+0x16a/0x280
 [<ffffffff815440c1>] ip6_local_out+0xb1/0xf0
 [<ffffffff814f1115>] ip6_send_skb+0x45/0xd0
 [<ffffffff81518836>] udp_v6_send_skb+0x246/0x5d0
 [<ffffffff8151985e>] udpv6_sendmsg+0xa6e/0x1090
[...]

Reported-by: Ji Jianwen <jiji@redhat.com>
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07 22:41:37 -04:00
Jon Paul Maloy
5b7066c3dd tipc: stricter filtering of packets in bearer layer
Resetting a bearer/interface, with the consequence of resetting all its
pertaining links, is not an atomic action. This becomes particularly
evident in very large clusters, where a lot of traffic may happen on the
remaining links while we are busy shutting them down. In extreme cases,
we may even see links being re-created and re-established before we are
finished with the job.

To solve this, we now introduce a solution where we temporarily detach
the bearer from the interface when the bearer is reset. This inhibits
all packet reception, while sending still is possible. For the latter,
we use the fact that the device's user pointer now is zero to filter out
which packets can be sent during this situation; i.e., outgoing RESET
messages only.  This filtering serves to speed up the neighbors'
detection of the loss event, and saves us from unnecessary probing.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07 17:00:13 -04:00
Jon Paul Maloy
4e801fa14f tipc: eliminate buffer leak in bearer layer
When enabling a bearer we create a 'neigbor discoverer' instance by
calling the function tipc_disc_create() before the bearer is actually
registered in the list of enabled bearers. Because of this, the very
first discovery broadcast message, created by the mentioned function,
is lost, since it cannot find any valid bearer to use. Furthermore,
the used send function, tipc_bearer_xmit_skb() does not free the given
buffer when it cannot find a  bearer, resulting in the leak of exactly
one send buffer each time a bearer is enabled.

This commit fixes this problem by introducing two changes:

1) Instead of attemting to send the discovery message directly, we let
   tipc_disc_create() return the discovery buffer to the calling
   function, tipc_enable_bearer(), so that the latter can send it
   when the enabling sequence is finished.

2) In tipc_bearer_xmit_skb(), as well as in the two other transmit
   functions at the bearer layer, we now free the indicated buffer or
   buffer chain when a valid bearer cannot be found.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07 17:00:13 -04:00
shamir rabinovitch
579ba85552 RDS: fix congestion map corruption for PAGE_SIZE > 4k
When PAGE_SIZE > 4k single page can contain 2 RDS fragments. If
'rds_ib_cong_recv' ignore the RDS fragment offset in to the page it
then read the data fragment as far congestion map update and lead to
corruption of the RDS connection far congestion map.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07 16:58:28 -04:00
shamir rabinovitch
e98499ac63 RDS: memory allocated must be align to 8
Fix issue in 'rds_ib_cong_recv' when accessing unaligned memory
allocated by 'rds_page_remainder_alloc' using uint64_t pointer.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07 16:58:27 -04:00
Alexander Duyck
a0ca153f98 GRE: Disable segmentation offloads w/ CSUM and we are encapsulated via FOU
This patch fixes an issue I found in which we were dropping frames if we
had enabled checksums on GRE headers that were encapsulated by either FOU
or GUE.  Without this patch I was barely able to get 1 Gb/s of throughput.
With this patch applied I am now at least getting around 6 Gb/s.

The issue is due to the fact that with FOU or GUE applied we do not provide
a transport offset pointing to the GRE header, nor do we offload it in
software as the GRE header is completely skipped by GSO and treated like a
VXLAN or GENEVE type header.  As such we need to prevent the stack from
generating it and also prevent GRE from generating it via any interface we
create.

Fixes: c3483384ee ("gro: Allow tunnel stacking in the case of FOU/GUE")
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07 16:56:33 -04:00
Tom Herbert
46aa2f30aa udp: Remove udp_offloads
Now that the UDP encapsulation GRO functions have been moved to the UDP
socket we not longer need the udp_offload insfrastructure so removing it.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07 16:53:30 -04:00
Tom Herbert
d92283e338 fou: change to use UDP socket GRO
Adapt gue_gro_receive, gue_gro_complete to take a socket argument.
Don't set udp_offloads any more.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07 16:53:29 -04:00
Tom Herbert
38fd2af24f udp: Add socket based GRO and config
Add gro_receive and  gro_complete to struct udp_tunnel_sock_cfg.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07 16:53:29 -04:00
Tom Herbert
a6024562ff udp: Add GRO functions to UDP socket
This patch adds GRO functions (gro_receive and gro_complete) to UDP
sockets. udp_gro_receive is changed to perform socket lookup on a
packet. If a socket is found the related GRO functions are called.

This features obsoletes using UDP offload infrastructure for GRO
(udp_offload). This has the advantage of not being limited to provide
offload on a per port basis, GRO is now applied to whatever individual
UDP sockets are bound to.  This also allows the possbility of
"application defined GRO"-- that is we can attach something like
a BPF program to a UDP socket to perfrom GRO on an application
layer protocol.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07 16:53:29 -04:00
Tom Herbert
63058308cd udp: Add udp6_lib_lookup_skb and udp4_lib_lookup_skb
Add externally visible functions to lookup a UDP socket by skb. This
will be used for GRO in UDP sockets. These functions also check
if skb->dst is set, and if it is not skb->dev is used to get dev_net.
This allows calling lookup functions before dst has been set on the
skbuff.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07 16:53:14 -04:00
Hannes Frederic Sowa
8ced425ee6 tun: use socket locks for sk_{attach,detatch}_filter
This reverts commit 5a5abb1fa3 ("tun, bpf: fix suspicious RCU usage
in tun_{attach, detach}_filter") and replaces it to use lock_sock around
sk_{attach,detach}_filter. The checks inside filter.c are updated with
lockdep_sock_is_held to check for proper socket locks.

It keeps the code cleaner by ensuring that only one lock governs the
socket filter instead of two independent locks.

Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-07 16:44:14 -04:00