Commit Graph

57946 Commits

Author SHA1 Message Date
Chuck Lever
38f1932e60 xprtrdma: Remove FMRs from the unmap list after unmapping
ib_unmap_fmr() takes a list of FMRs to unmap. However, it does not
remove the FMRs from this list as it processes them. Other
ib_unmap_fmr() call sites are careful to remove FMRs from the list
after ib_unmap_fmr() returns.

Since commit 7c7a5390dc ("xprtrdma: Add ro_unmap_sync method for FMR")
fmr_op_unmap_sync passes more than one FMR to ib_unmap_fmr(), but
it didn't bother to remove the FMRs from that list once the call was
complete.

I've noticed some instability that could be related to list
tangling by the new fmr_op_unmap_sync() logic. In an abundance
of caution, add some defensive logic to clean up properly after
ib_unmap_fmr().

Fixes: 7c7a5390dc ("xprtrdma: Add ro_unmap_sync method for FMR")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-07-11 15:50:43 -04:00
Michal Kubeček
a612769774 udp: prevent bugcheck if filter truncates packet too much
If socket filter truncates an udp packet below the length of UDP header
in udpv6_queue_rcv_skb() or udp_queue_rcv_skb(), it will trigger a
BUG_ON in skb_pull_rcsum(). This BUG_ON (and therefore a system crash if
kernel is configured that way) can be easily enforced by an unprivileged
user which was reported as CVE-2016-6162. For a reproducer, see
http://seclists.org/oss-sec/2016/q3/8

Fixes: e6afc8ace6 ("udp: remove headers from UDP packets before queueing")
Reported-by: Marco Grassi <marco.gra@gmail.com>
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-11 12:43:15 -07:00
David S. Miller
7d32eb8781 Merge tag 'batadv-net-for-davem-20160708' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:

====================
Here are a couple batman-adv bugfix patches, all by Sven Eckelmann:

 - Fix possible NULL pointer dereference for vlan_insert_tag (two patches)

 - Fix reference handling in some features, which may lead to reference
   leaks or invalid memory access (four patches)

 - Fix speedy join: DHCP packets handled by the gateway feature should
   be sent with 4-address unicast instead of 3-address unicast to make
   speedy join work. This fixes/speeds up DHCP assignment for clients
   which join a mesh for the first time. (one patch)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-11 12:28:44 -07:00
Toby DiPasquale
c2b9b4fee8 netfilter: nf_conntrack_h323: fix off-by-one in DecodeQ931
This patch corrects an off-by-one error in the DecodeQ931 function in
the nf_conntrack_h323 module. This error could result in reading off
the end of a Q.931 frame.

Signed-off-by: Toby DiPasquale <toby@cbcg.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-11 12:32:45 +02:00
Pablo Neira Ayuso
c080b460df Merge tag 'ipvs-for-v4.8' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next
Simon Horman says:

====================
IPVS Updates for v4.8

please consider these enhancements to the IPVS. This alters the behaviour
of the "least connection" schedulers such that pre-established connections
are included in the active connection count. This avoids overloading
servers when a large number of new connections arrive in a short space of
time - e.g. when clients reconnect after a node or network failure.
====================

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-11 12:16:34 +02:00
Pablo Neira Ayuso
42a5576913 netfilter: nf_tables: get rid of possible_net_t from set and basechain
We can pass the netns pointer as parameter to the functions that need to
gain access to it. From basechains, I didn't find any client for this
field anymore so let's remove this too.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-11 12:16:04 +02:00
Liping Zhang
3f8b61b7f9 netfilter: nft_ct: make byte/packet expr more friendly
If we want to use ct packets expr, and add a rule like follows:
  # nft add rule filter input ct packets gt 1 counter

We will find that no packets will hit it, because
nf_conntrack_acct is disabled by default. So It will
not work until we enable it manually via
"echo 1 > /proc/sys/net/netfilter/nf_conntrack_acct".

This is not friendly, so like xt_connbytes do, if the user
want to use ct byte/packet expr, enable nf_conntrack_acct
automatically.

Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-11 12:16:02 +02:00
Hangbin Liu
47c7445625 netfilter: physdev: physdev-is-out should not work with OUTPUT chain
physdev_mt() will check skb->nf_bridge first, which was alloced in
br_nf_pre_routing. So if we want to use --physdev-out and physdev-is-out,
we need to match it in FORWARD or POSTROUTING chain. physdev_mt_check()
only checked physdev-out and missed physdev-is-out. Fix it and update the
debug message to make it clearer.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Marcelo R Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-11 12:16:01 +02:00
Florian Westphal
870190a9ec netfilter: nat: convert nat bysrc hash to rhashtable
It did use a fixed-size bucket list plus single lock to protect add/del.

Unlike the main conntrack table we only need to add and remove keys.
Convert it to rhashtable to get table autosizing and per-bucket locking.

The maximum number of entries is -- as before -- tied to the number of
conntracks so we do not need another upperlimit.

The change does not handle rhashtable_remove_fast error, only possible
"error" is -ENOENT, and that is something that can happen legitimetely,
e.g. because nat module was inserted at a later time and no src manip
took place yet.

Tested with http-client-benchmark + httpterm with DNAT and SNAT rules
in place.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-11 12:07:57 +02:00
Pablo Neira Ayuso
4edfa9d0bf Merge tag 'ipvs-fixes2-for-v4.7' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs
Simon Horman says:

====================
Second Round of IPVS Fixes for v4.7

The fix from Quentin Armitage allows the backup sync daemon to
be bound to a link-local mcast IPv6 address as is already the case
for IPv4.
====================

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-11 11:58:33 +02:00
Florian Westphal
7c96643519 netfilter: move nat hlist_head to nf_conn
The nat extension structure is 32bytes in size on x86_64:

struct nf_conn_nat {
        struct hlist_node          bysource;             /*     0    16 */
        struct nf_conn *           ct;                   /*    16     8 */
        union nf_conntrack_nat_help help;                /*    24     4 */
        int                        masq_index;           /*    28     4 */
        /* size: 32, cachelines: 1, members: 4 */
        /* last cacheline: 32 bytes */
};

The hlist is needed to quickly check for possible tuple collisions
when installing a new nat binding. Storing this in the extension
area has two drawbacks:

1. We need ct backpointer to get the conntrack struct from the extension.
2. When reallocation of extension area occurs we need to fixup the bysource
   hash head via hlist_replace_rcu.

We can avoid both by placing the hlist_head in nf_conn and place nf_conn in
the bysource hash rather than the extenstion.

We can also remove the ->move support; no other extension needs it.

Moving the entire nat extension into nf_conn would be possible as well but
then we have to add yet another callback for deletion from the bysource
hash table rather than just using nat extension ->destroy hook for this.

nf_conn size doesn't increase due to aligment, followup patch replaces
hlist_node with single pointer.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-11 11:47:50 +02:00
Florian Westphal
242922a027 netfilter: conntrack: simplify early_drop
We don't need to acquire the bucket lock during early drop, we can
use lockless traveral just like ____nf_conntrack_find.

The timer deletion serves as synchronization point, if another cpu
attempts to evict same entry, only one will succeed with timer deletion.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-11 11:46:22 +02:00
Liping Zhang
8786a9716d netfilter: nf_ct_helper: unlink helper again when hash resize happen
From: Liping Zhang <liping.zhang@spreadtrum.com>

Similar to ctnl_untimeout, when hash resize happened, we should try
to do unhelp from the 0# bucket again.

Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-11 11:44:34 +02:00
Liping Zhang
474803d37e netfilter: cttimeout: unlink timeout obj again when hash resize happen
Imagine such situation, nf_conntrack_htable_size now is 4096, we are doing
ctnl_untimeout, and iterate on 3000# bucket.

Meanwhile, another user try to reduce hash size to 2048, then all nf_conn
are removed to the new hashtable. When this hash resize operation finished,
we still try to itreate ct begin from 3000# bucket, find nothing to do and
just return.

We may miss unlinking some timeout objects. And later we will end up with
invalid references to timeout object that are already gone.

So when we find that hash resize happened, try to unlink timeout objects
from the 0# bucket again.

Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-11 11:39:08 +02:00
Liping Zhang
64b87639c9 netfilter: conntrack: fix race between nf_conntrack proc read and hash resize
When we do "cat /proc/net/nf_conntrack", and meanwhile resize the conntrack
hash table via /sys/module/nf_conntrack/parameters/hashsize, race will
happen, because reader can observe a newly allocated hash but the old size
(or vice versa). So oops will happen like follows:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000017
  IP: [<ffffffffa0418e21>] seq_print_acct+0x11/0x50 [nf_conntrack]
  Call Trace:
  [<ffffffffa0412f4e>] ? ct_seq_show+0x14e/0x340 [nf_conntrack]
  [<ffffffff81261a1c>] seq_read+0x2cc/0x390
  [<ffffffff812a8d62>] proc_reg_read+0x42/0x70
  [<ffffffff8123bee7>] __vfs_read+0x37/0x130
  [<ffffffff81347980>] ? security_file_permission+0xa0/0xc0
  [<ffffffff8123cf75>] vfs_read+0x95/0x140
  [<ffffffff8123e475>] SyS_read+0x55/0xc0
  [<ffffffff817c2572>] entry_SYSCALL_64_fastpath+0x1a/0xa4

It is very easy to reproduce this kernel crash.
1. open one shell and input the following cmds:
  while : ; do
    echo $RANDOM > /sys/module/nf_conntrack/parameters/hashsize
  done
2. open more shells and input the following cmds:
  while : ; do
    cat /proc/net/nf_conntrack
  done
3. just wait a monent, oops will happen soon.

The solution in this patch is based on Florian's Commit 5e3c61f981
("netfilter: conntrack: fix lookup race during hash resize"). And
add a wrapper function nf_conntrack_get_ht to get hash and hsize
suggested by Florian Westphal.

Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-11 11:38:57 +02:00
Thierry Escande
d85a301c26 NFC: digital: Fix RTOX supervisor PDU handling
When the target needs more time to process the received PDU, it sends
Response Timeout Extension (RTOX) PDU.

When the initiator receives a RTOX PDU, it must reply with a RTOX PDU
and extends the current rwt value with the formula:
 rwt_int = rwt * rtox

This patch takes care of the rtox value passed by the target in the RTOX
PDU and extends the timeout for the next response accordingly.

Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-07-11 02:02:03 +02:00
Thierry Escande
1a09c56f54 NFC: digital: Add support for NFC DEP Response Waiting Time
When sending an ATR_REQ, the initiator must wait for the ATR_RES at
least 'RWT(nfcdep,activation) + dRWT(nfcdep)' and no more than
'RWT(nfcdep,activation) + dRWT(nfcdep) + dT(nfcdep,initiator)'. This
gives a timeout value between 1237 ms and 1337 ms. This patch defines
DIGITAL_ATR_RES_RWT to 1337 used for the timeout value of ATR_REQ
command.

For other DEP PDUs, the initiator must wait between 'RWT + dRWT(nfcdep)'
and 'RWT + dRWT(nfcdep) + dT(nfcdep,initiator)' where RWT is given by
the following formula: '(256 * 16 / f(c)) * 2^wt' where wt is the value
of the TO field in the ATR_RES response and is in the range between 0
and 14. This patch declares a mapping table for wt values and gives RWT
max values between 100 ms and 5049 ms.

This patch also defines DIGITAL_ATR_RES_TO_WT, the maximum wt value in
target mode, to 8.

Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-07-11 02:01:14 +02:00
Thierry Escande
e200f008ac NFC: digital: Free supervisor PDUs
This patch frees the RTOX resp sk_buff in initiator mode. It also makes
use of the free_resp exit point for ATN supervisor PDUs in both
initiator and target mode.

Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-07-11 02:00:26 +02:00
Thierry Escande
e073eb6797 NFC: digital: Rework ACK PDU handling in initiator mode
With this patch, ACK PDU sk_buffs are now freed and code has been
refactored for better errors handling.

Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-07-11 01:59:37 +02:00
Thierry Escande
482333b277 NFC: digital: Fix ACK & NACK PDUs handling in target mode
When the target receives a NACK PDU, it re-sends the last sent PDU.

ACK PDUs are received by the target as a reply from the initiator to
chained I-PDUs. There are 3 cases to handle:
- If the target has previously received 1 or more ATN PDUs and the PNI
  in the ACK PDU is equal to the target PNI - 1, then it means that the
  initiator did not received the last issued PDU from the target. In
  this case it re-sends this PDU.
- If the target has received 1 or more ATN PDUs but the ACK PNI is not
  the target PNI - 1, then this means that this ACK is the reply of the
  previous chained I-PDU sent by the target. The target did not received
  it on the first attempt and it is being re-sent by the initiator. The
  process continues as usual.
- No ATN PDU received before this ACK PDU. This is the reply of a
  chained I-PDU. The target keeps on processing its chained I-PDU.

The code has been refactored to avoid too many indentation levels.

Also, ACK and NACK PDUs were not freed. This is now fixed.

Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-07-11 01:58:46 +02:00
Thierry Escande
f23a9868b1 NFC: digital: Fix target DEP_REQ I-PDU handling after ATN PDU
When the initiator sends a DEP_REQ I-PDU, the target device may not
reply in a timely manner. In this case the initiator device must send an
attention PDU (ATN) and if the recipient replies with an ATN PDU in
return, then the last I-PDU must be sent again by the initiator.

This patch fixes how the target handles I-PDU received after an ATN PDU
has been received.

There are 2 possible cases:
- The target has received the initial DEP_REQ and sends back the DEP_RES
  but the initiator did not receive it. In this case, after the
  initiator has sent an ATN PDU and the target replied it (with an ATN
  as well), the initiator sends the saved skb of the initial DEP_REQ
  again and the target replies with the saved skb of the initial
  DEP_RES.
- Or the target did not even received the initial DEP_REQ. In this case,
  after the ATN PDUs exchange, the initiator sends the saved skb and the
  target simply passes it up, just as usual.

This behavior is controlled using the atn_count and the PNI field of the
digital device structure.

Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-07-11 01:57:50 +02:00
Thierry Escande
e8e7f42175 NFC: digital: Remove useless call to skb_reserve()
When allocating chained I-PDUs, there is no need to call skb_reserve()
since it's already done by digital_alloc_skb() and contains enough room
for the driver head and tail data.

Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-07-11 01:56:45 +02:00
Thierry Escande
1d984c2e03 NFC: digital: Fix handling of saved PDU sk_buff pointers
This patch fixes the way an I-PDU is saved in case it needs to be sent
again. It is now copied using pskb_copy() and not simply referenced
using skb_get() since it could be modified by the driver.

digital_in_send_saved_skb() and digital_tg_send_saved_skb() still get a
reference on the saved skb which is re-sent but release it if the send
operation fails. That way the caller doesn't have to take care about skb
ref in case of error.

RTOX supervisor PDU must not be saved as this can override a previously
saved I-PDU that should be re-sent later on.

Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2016-07-11 01:55:42 +02:00
Eric Dumazet
95556a8838 dccp: avoid deadlock in dccp_v4_ctl_send_reset
In the prep work I did before enabling BH while handling socket backlog,
I missed two points in DCCP :

1) dccp_v4_ctl_send_reset() uses bh_lock_sock(), assuming BH were
blocked. It is not anymore always true.

2) dccp_v4_route_skb() was using __IP_INC_STATS() instead of
  IP_INC_STATS()

A similar fix was done for TCP, in commit 47dcc20a39
("ipv4: tcp: ip_send_unicast_reply() is not BH safe")

Fixes: 7309f8821f ("dccp: do not assume DCCP code is non preemptible")
Fixes: 5413d1babe ("net: do not block BH while processing socket backlog")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-09 18:14:17 -04:00
Eric Dumazet
927265bc6c ipv6: do not abuse GFP_ATOMIC in inet6_netconf_notify_devconf()
All inet6_netconf_notify_devconf() callers are in process context,
so we can use GFP_KERNEL allocations if we take care of not holding
a rwlock while not needed in ip6mr (we hold RTNL there)

Fixes: d67b8c616b ("netconf: advertise mc_forwarding status")
Fixes: f3a1bfb11c ("rtnl/ipv6: use netconf msg to advertise forwarding status")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-09 18:13:20 -04:00
Eric Dumazet
fa17806cde ipv4: do not abuse GFP_ATOMIC in inet_netconf_notify_devconf()
inet_forward_change() runs with RTNL held.
We are allowed to sleep if required.

If we use __in_dev_get_rtnl() instead of __in_dev_get_rcu(),
we no longer have to use GFP_ATOMIC allocations in
inet_netconf_notify_devconf(), meaning we are less likely to miss
notifications under memory pressure, and wont touch precious memory
reserves either and risk dropping incoming packets.

inet_netconf_get_devconf() can also use GFP_KERNEL allocation.

Fixes: edc9e74893 ("rtnl/ipv4: use netconf msg to advertise forwarding status")
Fixes: 9e5511106f ("rtnl/ipv4: add support of RTM_GETNETCONF")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-09 18:12:25 -04:00
Jesper Dangaard Brouer
1db19db7f5 net: tracepoint napi:napi_poll add work and budget
An important information for the napi_poll tracepoint is knowing
the work done (packets processed) by the napi_poll() call. Add
both the work done and budget, as they are related.

Handle trace_napi_poll() param change in dropwatch/drop_monitor
and in python perf script netdev-times.py in backward compat way,
as python fortunately supports optional parameter handling.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-09 18:05:02 -04:00
Simon Horman
407f31be9d mpls: allow routes on ipip and sit devices
Allow MPLS routes on IPIP and SIT devices now that they
support forwarding MPLS packets.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-09 17:45:56 -04:00
Simon Horman
1b69e7e6c4 ipip: support MPLS over IPv4
Extend the IPIP driver to support MPLS over IPv4. The implementation is an
extension of existing support for IPv4 over IPv4 and is based of multiple
inner-protocol support for the SIT driver.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-09 17:45:56 -04:00
Simon Horman
49dbe7ae21 sit: support MPLS over IPv4
Extend the SIT driver to support MPLS over IPv4. This implementation
extends existing support for IPv6 over IPv4 and IPv4 over IPv4.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-09 17:45:56 -04:00
Simon Horman
8afe97e5d4 tunnels: support MPLS over IPv4 tunnels
Extend tunnel support to MPLS over IPv4.  The implementation extends the
existing differentiation between IPIP and IPv6 over IPv4 to also cover MPLS
over IPv4.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-09 17:45:56 -04:00
Nikolay Aleksandrov
a65056ecf4 net: bridge: extend MLD/IGMP query stats
As was suggested this patch adds support for the different versions of MLD
and IGMP query types. Since the user visible structure is still in net-next
we can augment it instead of adding netlink attributes.
The distinction between the different IGMP/MLD query types is done as
suggested in Section 7.1, RFC 3376 [1] and Section 8.1, RFC 3810 [2] based
on query payload size and code for IGMP. Since all IGMP packets go through
multicast_rcv() and it uses ip_mc_check_igmp/ipv6_mc_check_mld we can be
sure that at least the ip/ipv6 header can be directly used.

[1] https://tools.ietf.org/html/rfc3376#section-7
[2] https://tools.ietf.org/html/rfc3810#section-8.1

Suggested-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-09 17:40:09 -04:00
Marcel Holtmann
ca8bee5dde Bluetooth: Rename HCI_BREDR into HCI_PRIMARY
The HCI_BREDR naming is confusing since it actually stands for Primary
Bluetooth Controller. Which is a term that has been used in the latest
standard. However from a legacy point of view there only really have
been Basic Rate (BR) and Enhanced Data Rate (EDR). Recent versions of
Bluetooth introduced Low Energy (LE) and made this terminology a little
bit confused since Dual Mode Controllers include BR/EDR and LE. To
simplify this the name HCI_PRIMARY stands for the Primary Controller
which can be a single mode or dual mode controller.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2016-07-09 21:37:13 +03:00
Marcel Holtmann
e14dbe7203 Bluetooth: Remove controller device attributes
The controller device attributes are not used and expose no valuable
information.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2016-07-09 21:37:11 +03:00
Marcel Holtmann
2a0be13986 Bluetooth: Remove connection link attributes
The connection link attributes are not used and expose no valuable
information.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2016-07-09 21:37:08 +03:00
Marcelo Ricardo Leitner
f1533cce60 sctp: fix panic when sending auth chunks
When we introduced GSO support, if using auth the auth chunk was being
left queued on the packet even after the final segment was generated.
Later on sctp_transmit_packet it calls sctp_packet_reset, which zeroed
the packet len while not accounting for this left-over. This caused more
space to be used the next packet due to the chunk still being queued,
but space which wasn't allocated as its size wasn't accounted.

The fix is to only queue it back when we know that we are going to
generate another segment.

Fixes: 90017accff ("sctp: Add GSO support")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-09 00:08:21 -04:00
Vivien Didelot
d390238c4f net: dsa: initialize the routing table
The routing table of every switch in a tree is currently initialized to
all zeros. This is an issue since 0 is a valid port number.

Add a DSA_RTABLE_NONE=-1 constant to initialize the signed values of the
routing table pointing to other switches.

This fixes the device mapping of the mv88e6xxx driver where the port
pointing to the switch itself and to non-existent switches was wrongly
configured to be 0. It is now set to the expected 0xf value.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-08 23:59:49 -04:00
David S. Miller
5b58d83617 Merge tag 'mac80211-for-davem-2016-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:

====================
Two more fixes:
 * handle allocation failures in new(ish) A-MSDU decapsulation
 * don't leak memory on nl80211 ACL parse errors
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-08 23:53:41 -04:00
David S. Miller
cc3baecb21 Merge tag 'rxrpc-rewrite-20160706' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:

====================
rxrpc: Improve conn/call lookup and fix call number generation [ver #3]

I've fixed a couple of patch descriptions and excised the patch that
duplicated the connections list for reconsideration at a later date.

For reference, the excised patch is sitting on the rxrpc-experimental
branch of my git tree, based on top of the rxrpc-rewrite branch.  Diffing
it against yesterday's tag shows no differences.

Would you prefer the patch set to be emailed afresh instead of a git-pull
request?

David
---
Here's the next part of the AF_RXRPC rewrite.  The two main purposes of
this set are to fix the call number handling and to make use of RCU when
looking up the connection or call to pass a received packet to.

Important changes in this set include:

 (1) Avoidance of placing stack data into SG lists in rxkad so that kernel
     stacks can become vmalloc'd (Herbert Xu).

 (2) Calls cease pinning the connection they used as soon as possible,
     which allows the connection to be discarded sooner and allows the call
     channel on that connection to be reused earlier.

 (3) Make each call channel on a connection have a separate and independent
     call number space rather than having a shared number space for the
     connection.  Call numbers should increment monotonically per channel
     on the client, and the server should ignore a call with a lower call
     number for that channel than the latest it has seen.  The RESPONSE
     packet sets the minimum values of each call ID counter on a
     connection.

 (4) Look up calls by indexing the channel array on a connection rather
     than by keeping calls in an rbtree on that connection.  Also look up
     calls using the channel array rather than using a hashtable.

     The call hashtable can then be removed.

 (5) Call terminal statuses are cached in the channel array for the last
     call.  It is assumed that if we the server have seen call N, then the
     client no longer cares about call N-1 on the same channel.

     This will allow retransmission of the terminal status in future
     without the need to keep the rxrpc_call struct around.

 (6) Peer lookups are moved out of common connection handling code and into
     service connection handling code as client connections (a) must point
     to a peer before they can be used and (b) are looked up by a
     machine-unique connection ID directly, so we only need to look up the
     peer first if we're going to deal with a service call.

 (7) The reference count on a connection is held elevated by 1 whilst it is
     alive (ie. idle unused connections have a refcount of 1).  The reaper
     will attempt to change the refcount from 1->0 and skip if this cannot
     be done, whilst look ups only increment the refcount if it's non-zero.

     This makes the implementation of RCU lookups easier as we don't have
     to get a ref on the connection or a lock on the connection list to
     prevent a connection being reaped whilst we're contemplating queueing
     a packet that initiates a new service call upon it.

     If we need to get a connection, but there's a dead connection in the
     tree, we use rb_replace_node() to replace the dead one with a new one.

 (8) Use a seqlock to validate the walk over the service connection rbtree
     attached to a peer when it's being walked in RCU mode.

 (9) Make the incoming call/connection packet handling code use RCU mode
     and locks and make it only take a reference if the call/connection
     gets queued on a workqueue.

The intention is that the next set will introduce the connection lifetime
management and capacity limits to prevent clients from overloading the
server.

There are some fixes too:

 (1) Verifying that a packet coming in to a client connection came from the
     expected source.

 (2) Fix handling of connection failure in client call creation where we
     don't reinitialise the list linkage block and a second attempt to
     unlink the failed connection oopses and also we don't set the state
     correctly, which causes an assertion failure.

 (3) New service calls were being added to the socket's accept queue under
     the wrong lock.

Changes:

 (V2) In rxrpc_find_service_conn_rcu() initialised the sequence number to 0.

      Fixed the RCU handling in conn_service.c by introducing and using
      rb_replace_node_rcu() as an RCU-safe alternative in
      rxrpc_publish_service_conn().

      Modified and used rcu_dereference_raw() to avoid RCU sparse warnings
      in rxrpc_find_service_conn_rcu().

      Added in some missing RCU dereference wrappers.  It seems to be
      necessary to turn on CONFIG_PROVE_RCU_REPEATEDLY as well as
      CONFIG_SPARSE_RCU_POINTER to get the static __rcu annotation checking
      to happen.

      Fixed some other sparse warnings, including a missing ntohs() in
      jumbo packet processing.

 (V3) Fixed some commit descriptions.

      Excised the patch that duplicated the connection list to separate out
      the procfs list for reconsideration at a later date.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-08 23:52:12 -04:00
Florian Westphal
bba7eb5d9b hfsc: reduce hfsc_sched to 14 cachelines
hfsc_sched is huge (size: 920, cachelines: 15), but we can get it to 14
cachelines by placing level after filter_cnt (covering 4 byte hole) and
reducing period/nactive/flags to u32 (period is just a counter,
incremented when class becomes active -- 2**32 is plenty for this
purpose, also, long is only 32bit wide on 32bit platforms anyway).

cl_vtperiod is exported to userspace via tc_hfsc_stats, but its period
member is already u32, so no precision is lost there either.

Cc: Michal Soltys <soltys@ziu.info>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-08 23:08:39 -04:00
Florian Westphal
c8607e0200 netfilter: nft_ct: fix expiration getter
We need to compute timeout.expires - jiffies, not the other way around.
Add a helper, another patch can then later change more places in
conntrack code where we currently open-code this.

Will allow us to only change one place later when we remove per-ct timer.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-07-08 14:55:14 +02:00
Alexander Aring
9e262f5037 6lowpan: ndisc: set invalid unicast short addr to unspec
When receiving neighbour information with short address option field we
should check the complete range of invalid short addresses and set it to
one invalid address setting which is the unspecified address. This
address is also used when by creating at first a new neighbour entry to
indicate no short address is set.

Signed-off-by: Alexander Aring <aar@pengutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-07-08 13:23:12 +02:00
Alexander Aring
0ea0b9af9b ieee802154: 6lowpan: fix intra pan id check
The RIOT-OS stack does send intra-pan frames but don't set the intra pan
flag inside the mac header. It seems this is valid frame addressing but
inefficient. Anyway this patch adds a new function for intra pan
addressing, doesn't matter if intra pan flag or source and destination
are the same. The newly introduction function will be used to check on
intra pan addressing for 6lowpan.

Signed-off-by: Alexander Aring <aar@pengutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-07-08 13:23:12 +02:00
Denis Kenzior
83871f8ccd Bluetooth: Fix hci_sock_recvmsg return value
If recvmsg is called with a destination buffer that is too small to
receive the contents of skb in its entirety, the return value from
recvmsg was inconsistent with common SOCK_SEQPACKET or SOCK_DGRAM
semantics.

If destination buffer provided by userspace is too small (e.g. len <
copied), then MSG_TRUNC flag is set and copied is returned.  Instead, it
should return the length of the message, which is consistent with how
other datagram based sockets act.  Quoting 'man recv':

"All  three calls return the length of the message on successful comple‐
tion.  If a message is too long to fit in the supplied  buffer,  excess
bytes  may  be discarded depending on the type of socket the message is
received from."

and

"MSG_TRUNC (since Linux 2.2)

    For   raw   (AF_PACKET),   Internet   datagram   (since    Linux
    2.4.27/2.6.8),  netlink  (since Linux 2.6.22), and UNIX datagram
    (since Linux 3.4) sockets: return the real length of the packet
    or datagram, even when it was longer than the passed buffer."

Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-07-08 12:20:57 +02:00
Denis Kenzior
b5f34f9420 Bluetooth: Fix bt_sock_recvmsg return value
If recvmsg is called with a destination buffer that is too small to
receive the contents of skb in its entirety, the return value from
recvmsg was inconsistent with common SOCK_SEQPACKET or SOCK_DGRAM
semantics.

If destination buffer provided by userspace is too small (e.g. len <
copied), then MSG_TRUNC flag is set and copied is returned.  Instead, it
should return the length of the message, which is consistent with how
other datagram based sockets act.  Quoting 'man recv':

"All  three calls return the length of the message on successful comple‐
tion.  If a message is too long to fit in the supplied  buffer,  excess
bytes  may  be discarded depending on the type of socket the message is
received from."

and

"MSG_TRUNC (since Linux 2.2)

    For   raw   (AF_PACKET),   Internet   datagram   (since    Linux
    2.4.27/2.6.8),  netlink  (since Linux 2.6.22), and UNIX datagram
    (since Linux 3.4) sockets: return the real length of the packet
    or datagram, even when it was longer than the passed buffer."

Signed-off-by: Denis Kenzior <denkenz@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-07-08 12:20:57 +02:00
Alexander Aring
1c5bf998b3 ieee802154: allow netns create of lowpan interface
This patch reverts commit f9d1ce8f81 ("ieee802154: fix netns settings").
The lowpan interface need to be created inside the net namespace where
the wpan interface is available. The wpan namespace can be changed only
by nl802154 before. Without this patch it's not possible to create a
lowpan interface for a wpan interface which isn't inside init_net
namespace.

Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <aar@pengutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-07-08 12:20:57 +02:00
Alexander Aring
66e5c2672c ieee802154: add netns support
This patch adds netns support for 802.15.4 subsystem. Most parts are
copy&pasted from wireless subsystem, it has the identically userspace
API.

Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Alexander Aring <aar@pengutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-07-08 12:20:57 +02:00
Alexander Aring
966be9e790 6lowpan: ndisc: add missing 802.15.4 only check
This patch adds a missing check to handle short address parsing for
802.15.4 6LoWPAN only.

Signed-off-by: Alexander Aring <aar@pengutronix.de>
Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-07-08 12:20:57 +02:00
Alexander Aring
929946a471 6lowpan: ndisc: fix double read unlock
This patch removes a double unlock case to accessing neighbour private
data.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alexander Aring <aar@pengutronix.de>
Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-07-08 12:20:57 +02:00
Andy Lutomirski
a4770e1117 Bluetooth: Switch SMP to crypto_cipher_encrypt_one()
SMP does ECB crypto on stack buffers.  This is complicated and
fragile, and it will not work if the stack is virtually allocated.

Switch to the crypto_cipher interface, which is simpler and safer.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Tested-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2016-07-08 12:20:57 +02:00