Commit Graph

26756 Commits

Author SHA1 Message Date
Hannes Frederic Sowa
20314092c1 ipv6: don't accept multicast traffic with scope 0
v2:
a) moved before multicast source address check
b) changed comment to netdev style

Cc: Erik Hugne <erik.hugne@ericsson.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-11 14:00:54 -05:00
Hannes Frederic Sowa
dd40851521 ipv6: don't let node/interface scoped multicast traffic escape on the wire
Reported-by: Erik Hugne <erik.hugne@ericsson.com>
Cc: Erik Hugne <erik.hugne@ericsson.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-11 14:00:54 -05:00
Steffen Klassert
7cb8a93968 xfrm: Allow inserting policies with matching mark and different priorities
We currently can not insert policies with mark and mask
such that some flows would be matched from both policies.
We make this possible when the priority of these policies
are different. If both policies match a flow, the one with
the higher priority is used.

Reported-by: Emmanuel Thierry <emmanuel.thierry@telecom-bretagne.eu>
Reported-by: Romain Kuntz <r.kuntz@ipflavors.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-02-11 14:07:01 +01:00
Johannes Berg
3d9646d0ab mac80211: fix channel selection bug
When trying to connect to an AP that advertises HT but not
VHT, the mac80211 code erroneously uses the configuration
from the AP as is instead of checking it against regulatory
and local capabilities. This can lead to using an invalid
or even inexistent channel (like 11/HT40+).

Additionally, the return flags from downgrading must be
ORed together, to collect them from all of the downgrades.
Also clarify the message.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-02-11 11:12:26 +01:00
YOSHIFUJI Hideaki / 吉藤英明
daaba4fa17 net neighbour, decnet: Ensure to align device private data on preferred alignment.
To allow both of protocol-specific data and device-specific data
attached with neighbour entry, and to eliminate size calculation
cost when allocating entry, sizeof protocol-speicic data must be
multiple of NEIGH_PRIV_ALIGN.  On 64bit archs,
sizeof(struct dn_neigh) is multiple of NEIGH_PRIV_ALIGN, but on
32bit archs, it was not.

Introduce NEIGH_ENTRY_SPACE() macro to ensure that protocol-specific
entry-size meets our requirement.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-11 00:21:44 -05:00
YOSHIFUJI Hideaki / 吉藤英明
ec16ef2228 ipv6 mcast: Do not join device multicast for interface-local multicasts.
RFC4291 (IPv6 addressing architecture) says that interface-Local scope
spans only a single interface on a node.  We should not join L2 device
multicast list for addresses in interface-local (or smaller) scope.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-11 00:21:44 -05:00
David S. Miller
cfa82e020b Merge branch 'master' of git://1984.lsi.us.es/nf
Pablo Neira Ayuso says:

====================
The following patchset contains Netfilter/IPVS fixes for 3.8-rc7, they are:

* Fix oops in IPVS state-sync due to releasing a random memory area due
  to unitialized pointer, from Dan Carpenter.

* Fix SCTP flow establishment due to bad checksumming mangling in IPVS,
  from Daniel Borkmann.

* Three fixes for the recently added IPv6 NPT, all from YOSHIFUJI Hideaki,
  with an amendment collapsed into those patches from Ulrich Weber. They
  fiix adjustment calculation, fix prefix mangling and ensure LSB of
  prefixes are zeroes (as required by RFC).

Specifically, it took me a while to validate the 1's complement arithmetics/
checksumming approach in the IPv6 NPT code.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-10 20:44:08 -05:00
Eric Dumazet
044453b3ef arp: fix possible crash in arp_rcv()
We should call skb_share_check() before pskb_may_pull(), or we
can crash in pskb_expand_head()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-10 20:39:39 -05:00
David Ward
86fbe9bb59 net/8021q: Implement Multiple VLAN Registration Protocol (MVRP)
Initial implementation of the Multiple VLAN Registration Protocol
(MVRP) from IEEE 802.1Q-2011, based on the existing implementation
of the GARP VLAN Registration Protocol (GVRP).

Signed-off-by: David Ward <david.ward@ll.mit.edu>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-10 20:37:22 -05:00
David Ward
febf018d22 net/802: Implement Multiple Registration Protocol (MRP)
Initial implementation of the Multiple Registration Protocol (MRP)
from IEEE 802.1Q-2011, based on the existing implementation of the
Generic Attribute Registration Protocol (GARP).

Signed-off-by: David Ward <david.ward@ll.mit.edu>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-10 20:37:22 -05:00
Andy King
d021c34405 VSOCK: Introduce VM Sockets
VM Sockets allows communication between virtual machines and the hypervisor.
User level applications both in a virtual machine and on the host can use the
VM Sockets API, which facilitates fast and efficient communication between
guest virtual machines and their host.  A socket address family, designed to be
compatible with UDP and TCP at the interface level, is provided.

Today, VM Sockets is used by various VMware Tools components inside the guest
for zero-config, network-less access to VMware host services.  In addition to
this, VMware's users are using VM Sockets for various applications, where
network access of the virtual machine is restricted or non-existent.  Examples
of this are VMs communicating with device proxies for proprietary hardware
running as host applications and automated testing of applications running
within virtual machines.

The VMware VM Sockets are similar to other socket types, like Berkeley UNIX
socket interface.  The VM Sockets module supports both connection-oriented
stream sockets like TCP, and connectionless datagram sockets like UDP. The VM
Sockets protocol family is defined as "AF_VSOCK" and the socket operations
split for SOCK_DGRAM and SOCK_STREAM.

For additional information about the use of VM Sockets, please refer to the
VM Sockets Programming Guide available at:

https://www.vmware.com/support/developer/vmci-sdk/

Signed-off-by: George Zhang <georgezhang@vmware.com>
Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Andy king <acking@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-10 19:41:08 -05:00
David S. Miller
fd5023111c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Synchronize with 'net' in order to sort out some l2tp, wireless, and
ipv6 GRE fixes that will be built on top of in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08 18:02:14 -05:00
Daniel Borkmann
03536e23ac net: sctp: sctp_auth_make_key_vector: use sctp_auth_create_key
In sctp_auth_make_key_vector, we allocate a temporary sctp_auth_bytes
structure with kmalloc instead of the sctp_auth_create_key allocator.
Change this to sctp_auth_create_key as it is the case everywhere else,
so that we also can properly free it via sctp_auth_key_put. This makes
it easier for future code changes in the structure and allocator itself,
since a single API is consistently used for this purpose. Also, by
using sctp_auth_create_key we're doing sanity checks over the arguments.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08 17:55:48 -05:00
Amerigo Wang
6a98dcf032 ipv6: fix a RCU warning in net/ipv6/ip6_flowlabel.c
This patch fixes the following RCU warning:

[   51.680236] ===============================
[   51.681914] [ INFO: suspicious RCU usage. ]
[   51.683610] 3.8.0-rc6-next-20130206-sasha-00028-g83214f7-dirty #276 Tainted: G        W
[   51.686703] -------------------------------
[   51.688281] net/ipv6/ip6_flowlabel.c:671 suspicious rcu_dereference_check() usage!

we should use rcu_dereference_bh() when we hold rcu_read_lock_bh().

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08 17:55:48 -05:00
Alexander Duyck
e5e6730588 skbuff: Move definition of NETDEV_FRAG_PAGE_MAX_SIZE
In order to address the fact that some devices cannot support the full 32K
frag size we need to have the value accessible somewhere so that we can use it
to do comparisons against what the device can support.  As such I am moving
the values out of skbuff.c and into skbuff.h.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08 17:33:35 -05:00
Linus Torvalds
e06b84052a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Revert iwlwifi reclaimed packet tracking, it causes problems for a
    bunch of folks.  From Emmanuel Grumbach.

 2) Work limiting code in brcmsmac wifi driver can clear tx status
    without processing the event.  From Arend van Spriel.

 3) rtlwifi USB driver processes wrong SKB, fix from Larry Finger.

 4) l2tp tunnel delete can race with close, fix from Tom Parkin.

 5) pktgen_add_device() failures are not checked at all, fix from Cong
    Wang.

 6) Fix unintentional removal of carrier off from tun_detach(),
    otherwise we confuse userspace, from Michael S.  Tsirkin.

 7) Don't leak socket reference counts and ubufs in vhost-net driver,
    from Jason Wang.

 8) vmxnet3 driver gets it's initial carrier state wrong, fix from Neil
    Horman.

 9) Protect against USB networking devices which spam the host with 0
    length frames, from Bjørn Mork.

10) Prevent neighbour overflows in ipv6 for locally destined routes,
    from Marcelo Ricardo.  This is the best short-term fix for this, a
    longer term fix has been implemented in net-next.

11) L2TP uses ipv4 datagram routines in it's ipv6 code, whoops.  This
    mistake is largely because the ipv6 functions don't even have some
    kind of prefix in their names to suggest they are ipv6 specific.
    From Tom Parkin.

12) Check SYN packet drops properly in tcp_rcv_fastopen_synack(), from
    Yuchung Cheng.

13) Fix races and TX skb freeing bugs in via-rhine's NAPI support, from
    Francois Romieu and your's truly.

14) Fix infinite loops and divides by zero in TCP congestion window
    handling, from Eric Dumazet, Neal Cardwell, and Ilpo Järvinen.

15) AF_PACKET tx ring handling can leak kernel memory to userspace, fix
    from Phil Sutter.

16) Fix error handling in ipv6 GRE tunnel transmit, from Tommi Rantala.

17) Protect XEN netback driver against hostile frontend putting garbage
    into the rings, don't leak pages in TX GOP checking, and add proper
    resource releasing in error path of xen_netbk_get_requests().  From
    Ian Campbell.

18) SCTP authentication keys should be cleared out and released with
    kzfree(), from Daniel Borkmann.

19) L2TP is a bit too clever trying to maintain skb->truesize, and ends
    up corrupting socket memory accounting to the point where packet
    sending is halted indefinitely.  Just remove the adjustments
    entirely, they aren't really needed.  From Eric Dumazet.

20) ATM Iphase driver uses a data type with the same name as the S390
    headers, rename to fix the build.  From Heiko Carstens.

21) Fix a typo in copying the inner network header offset from one SKB
    to another, from Pravin B Shelar.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (56 commits)
  net: sctp: sctp_endpoint_free: zero out secret key data
  net: sctp: sctp_setsockopt_auth_key: use kzfree instead of kfree
  atm/iphase: rename fregt_t -> ffreg_t
  net: usb: fix regression from FLAG_NOARP code
  l2tp: dont play with skb->truesize
  net: sctp: sctp_auth_key_put: use kzfree instead of kfree
  netback: correct netbk_tx_err to handle wrap around.
  xen/netback: free already allocated memory on failure in xen_netbk_get_requests
  xen/netback: don't leak pages on failure in xen_netbk_tx_check_gop.
  xen/netback: shutdown the ring if it contains garbage.
  net: qmi_wwan: add more Huawei devices, including E320
  net: cdc_ncm: add another Huawei vendor specific device
  ipv6/ip6_gre: fix error case handling in ip6gre_tunnel_xmit()
  tcp: fix for zero packets_in_flight was too broad
  brcmsmac: rework of mac80211 .flush() callback operation
  ssb: unregister gpios before unloading ssb
  bcma: unregister gpios before unloading bcma
  rtlwifi: Fix scheduling while atomic bug
  net: usbnet: fix tx_dropped statistics
  tcp: ipv6: Update MIB counters for drops
  ...
2013-02-09 07:55:24 +11:00
Daniel Borkmann
b5c37fe6e2 net: sctp: sctp_endpoint_free: zero out secret key data
On sctp_endpoint_destroy, previously used sensitive keying material
should be zeroed out before the memory is returned, as we already do
with e.g. auth keys when released.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08 14:54:24 -05:00
Daniel Borkmann
6ba542a291 net: sctp: sctp_setsockopt_auth_key: use kzfree instead of kfree
In sctp_setsockopt_auth_key, we create a temporary copy of the user
passed shared auth key for the endpoint or association and after
internal setup, we free it right away. Since it's sensitive data, we
should zero out the key before returning the memory back to the
allocator. Thus, use kzfree instead of kfree, just as we do in
sctp_auth_key_put().

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08 14:54:24 -05:00
John W. Linville
f5237f278f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2013-02-08 13:16:17 -05:00
Eric Dumazet
87c084a980 l2tp: dont play with skb->truesize
Andrew Savchenko reported a DNS failure and we diagnosed that
some UDP sockets were unable to send more packets because their
sk_wmem_alloc was corrupted after a while (tx_queue column in
following trace)

$ cat /proc/net/udp
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode ref pointer drops
...
  459: 00000000:0270 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4507 2 ffff88003d612380 0
  466: 00000000:0277 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4802 2 ffff88003d613180 0
  470: 076A070A:007B 00000000:0000 07 FFFF4600:00000000 00:00000000 00000000   123        0 5552 2 ffff880039974380 0
  470: 010213AC:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4986 2 ffff88003dbd3180 0
  470: 010013AC:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000     0        0 4985 2 ffff88003dbd2e00 0
  470: 00FCA8C0:007B 00000000:0000 07 FFFFFB00:00000000 00:00000000 00000000     0        0 4984 2 ffff88003dbd2a80 0
...

Playing with skb->truesize is tricky, especially when
skb is attached to a socket, as we can fool memory charging.

Just remove this code, its not worth trying to be ultra
precise in xmit path.

Reported-by: Andrew Savchenko <bircoph@gmail.com>
Tested-by: Andrew Savchenko <bircoph@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08 01:49:49 -05:00
Daniel Borkmann
241448c2b8 net: sctp: sctp_auth_make_key_vector: remove duplicate ntohs calls
Instead of calling 3 times ntohs(random->param_hdr.length), 2 times
ntohs(hmacs->param_hdr.length), and 3 times ntohs(chunks->param_hdr.length)
within the same function, we only call each once and store it in a
variable.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07 23:44:37 -05:00
Daniel Borkmann
586c31f3bf net: sctp: sctp_auth_key_put: use kzfree instead of kfree
For sensitive data like keying material, it is common practice to zero
out keys before returning the memory back to the allocator. Thus, use
kzfree instead of kfree.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07 23:43:42 -05:00
David S. Miller
6cddded4af Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch into openvswitch
Jesse Gross says:

====================
One bug fix for net/3.8 for a long standing problem that was reported a few
times recently.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07 23:37:36 -05:00
Johannes Berg
d601cd8d95 mac80211: fix managed mode channel context use
My commit f2d9d270c1
("mac80211: support VHT association") introduced a
very stupid bug: the loop to downgrade the channel
width never attempted to actually use it again so
it would downgrade all the way to 20_NOHT. Fix it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-02-07 20:56:01 +01:00
YOSHIFUJI Hideaki / 吉藤英明
edb27228db netfilter: ip6t_NPT: Ensure to check lower part of prefixes are zero
RFC 6296 points that address bits that are not part of the prefix
has to be zeroed.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-02-07 18:40:27 +01:00
YOSHIFUJI Hideaki / 吉藤英明
d4c38fa87d netfilter: ip6t_NPT: Fix prefix mangling
Make sure only the bits that are part of the prefix are mangled.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-02-07 18:40:26 +01:00
YOSHIFUJI Hideaki / 吉藤英明
f5271fff56 netfilter: ip6t_NPT: Fix adjustment calculation
Cast __wsum from/to __sum16 is wrong.  Instead, apply appropriate
conversion function: csum_unfold() or csum_fold().

[ The original patch has been modified to undo the final ~ that
  csum_fold returns. We only need to fold the 32-bit word that
  results from the checksum calculation into a 16-bit to ensure
  that the original subnet is restored appropriately. Spotted by
  Ulrich Weber. ]

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-02-07 18:37:41 +01:00
Tommi Rantala
41ab3e31bd ipv6/ip6_gre: fix error case handling in ip6gre_tunnel_xmit()
ip6gre_tunnel_xmit() is leaking the skb when we hit this error branch,
and the -1 return value from this function is bogus. Use the error
handling we already have in place in ip6gre_tunnel_xmit() for this error
case to fix this.

Signed-off-by: Tommi Rantala <tt.rantala@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06 16:02:05 -05:00
Eric Dumazet
6d1ccff627 net: reset mac header in dev_start_xmit()
On 64 bit arches :

There is a off-by-one error in qdisc_pkt_len_init() because
mac_header is not set in xmit path.

skb_mac_header() returns an out of bound value that was
harmless because hdr_len is an 'unsigned int'

On 32bit arches, the error is abysmal.

This patch is also a prereq for "macvlan: add multicast filter"

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06 15:59:47 -05:00
Cong Wang
12b0004d1d net: adjust skb_gso_segment() for calling in rx path
skb_gso_segment() is almost always called in tx path,
except for openvswitch. It calls this function when
it receives the packet and tries to queue it to user-space.
In this special case, the ->ip_summed check inside
skb_gso_segment() is no longer true, as ->ip_summed value
has different meanings on rx path.

This patch adjusts skb_gso_segment() so that we can at least
avoid such warnings on checksum.

Cc: Jesse Gross <jesse@nicira.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06 15:58:00 -05:00
Alexander Aring
25060d8f3f wpan: use stack buffer instead of heap
head buffer is only temporary available in mac802154_header_create.
So it's not necessary to put it on the heap.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06 15:56:17 -05:00
Alexander Aring
fc4e98dbba 6lowpan: use stack buffer instead of heap
head buffer is only temporary available in lowpan_header_create.
So it's not necessary to put it on the heap.

Also fixed a comment codestyle issue.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06 15:56:17 -05:00
David S. Miller
a07fdceccf 6lowpan: Remove __init tag from lowpan_netlink_fini().
It's called from both __init and __exit code, so neither
tag is appropriate.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06 15:54:38 -05:00
Ilpo Järvinen
6731d2095b tcp: fix for zero packets_in_flight was too broad
There are transients during normal FRTO procedure during which
the packets_in_flight can go to zero between write_queue state
updates and firing the resulting segments out. As FRTO processing
occurs during that window the check must be more precise to
not match "spuriously" :-). More specificly, e.g., when
packets_in_flight is zero but FLAG_DATA_ACKED is true the problematic
branch that set cwnd into zero would not be taken and new segments
might be sent out later.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Tested-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06 15:53:03 -05:00
Neil Horman
ca99ca14c9 netpoll: protect napi_poll and poll_controller during dev_[open|close]
Ivan Vercera was recently backporting commit
9c13cb8bb4 to a RHEL kernel, and I noticed that,
while this patch protects the tg3 driver from having its ndo_poll_controller
routine called during device initalization, it does nothing for the driver
during shutdown. I.e. it would be entirely possible to have the
ndo_poll_controller method (or subsequently the ndo_poll) routine called for a
driver in the netpoll path on CPU A while in parallel on CPU B, the ndo_close or
ndo_open routine could be called.  Given that the two latter routines tend to
initizlize and free many data structures that the former two rely on, the result
can easily be data corruption or various other crashes.  Furthermore, it seems
that this is potentially a problem with all net drivers that support netpoll,
and so this should ideally be fixed in a common path.

As Ben H Pointed out to me, we can't preform dev_open/dev_close in atomic
context, so I've come up with this solution.  We can use a mutex to sleep in
open/close paths and just do a mutex_trylock in the napi poll path and abandon
the poll attempt if we're locked, as we'll just retry the poll on the next send
anyway.

I've tested this here by flooding netconsole with messages on a system whos nic
driver I modfied to periodically return NETDEV_TX_BUSY, so that the netpoll tx
workqueue would be forced to send frames and poll the device.  While this was
going on I rapidly ifdown/up'ed the interface and watched for any problems.
I've not found any.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Ivan Vecera <ivecera@redhat.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Francois Romieu <romieu@fr.zoreil.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06 15:45:03 -05:00
Alexander Aring
f458c647ea wpan: whitespace fix
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06 15:44:02 -05:00
Steffen Klassert
f4e53e292a ipv6: Don't send packet to big messages to self
Calling icmpv6_send() on a local message size error leads to an
incorrect update of the path mtu in the case when IPsec is used.
So use ipv6_local_error() instead to notify the socket about the
error.

Reported-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06 15:12:39 -05:00
Joe Perches
62b5942aa5 net: core: Remove unnecessary alloc/OOM messages
alloc failures already get standardized OOM
messages and a dump_stack.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06 14:58:52 -05:00
John W. Linville
b3b66ae4c8 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2013-02-06 13:55:44 -05:00
Cong Ding
9887dbf5b2 mac80211: fix error in sizeof() usage
Using 'sizeof' on array given as function argument returns
size of a pointer rather than the size of array.

Cc: stable@vger.kernel.org
Signed-off-by: Cong Ding <dinggnu@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-02-06 17:31:55 +01:00
Michal Kubecek
8d068875ca xfrm: make gc_thresh configurable in all namespaces
The xfrm gc threshold can be configured via xfrm{4,6}_gc_thresh
sysctl but currently only in init_net, other namespaces always
use the default value. This can substantially limit the number
of IPsec tunnels that can be effectively used.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-02-06 11:36:29 +01:00
Michal Kubecek
1f53c80850 xfrm: remove unused xfrm4_policy_fini()
Function xfrm4_policy_fini() is unused since xfrm4_fini() was
removed in 2.6.11.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-02-06 11:34:31 +01:00
Steffen Klassert
a0073fe18e xfrm: Add a state resolution packet queue
As the default, we blackhole packets until the key manager resolves
the states. This patch implements a packet queue where IPsec packets
are queued until the states are resolved. We generate a dummy xfrm
bundle, the output routine of the returned route enqueues the packet
to a per policy queue and arms a timer that checks for state resolution
when dst_output() is called. Once the states are resolved, the packets
are sent out of the queue. If the states are not resolved after some
time, the queue is flushed.

This patch keeps the defaut behaviour to blackhole packets as long
as we have no states. To enable the packet queue the sysctl
xfrm_larval_drop must be switched off.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-02-06 08:31:10 +01:00
Daniel Borkmann
4b47bc9a9e ipvs: sctp: fix checksumming on snat and dnat handlers
In our test lab, we have a simple SCTP client connecting to a SCTP
server via an IPVS load balancer. On some machines, load balancing
works, but on others the initial handshake just fails, thus no
SCTP connection whatsoever can be established!

We observed that the SCTP INIT-ACK handshake reply from the IPVS
machine to the client had a correct IP checksum, but corrupt SCTP
checksum when forwarded, thus on the client-side the packet was
dropped and an intial handshake retriggered until all attempts
run into the void.

To fix this issue, this patch i) adds a missing CHECKSUM_UNNECESSARY
after the full checksum (re-)calculation (as done in IPVS TCP and UDP
code as well), ii) calculates the checksum in little-endian format
(as fixed with the SCTP code in commit 4458f04c: sctp: Clean up sctp
checksumming code) and iii) refactors duplicate checksum code into a
common function. Tested by myself.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2013-02-06 09:56:50 +09:00
Stephen Hemminger
ca2eb5679f tcp: remove Appropriate Byte Count support
TCP Appropriate Byte Count was added by me, but later disabled.
There is no point in maintaining it since it is a potential source
of bugs and Linux already implements other better window protection
heuristics.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05 14:51:16 -05:00
David S. Miller
547472b8e1 ipv4: Disallow non-namespace aware protocols to register.
All in-tree ipv4 protocol implementations are now namespace
aware.  Therefore all the run-time checks are superfluous.

Reject registry of any non-namespace aware ipv4 protocol.
Eventually we'll remove prot->netns_ok and this registry
time check as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05 14:42:23 -05:00
David S. Miller
9d6ddb1990 l2tp: Make ipv4 protocol handler namespace aware.
The infrastructure is already pretty much entirely there
to allow this conversion.

The tunnel and session lookups have per-namespace tables,
and the ipv4 bind lookup includes the namespace in the
lookup key.

Set netns_ok in l2tp_ip_protocol.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05 14:37:01 -05:00
Tom Parkin
167eb17e0b l2tp: create tunnel sockets in the right namespace
When creating unmanaged tunnel sockets we should honour the network namespace
passed to l2tp_tunnel_create.  Furthermore, unmanaged tunnel sockets should
not hold a reference to the network namespace lest they accidentally keep
alive a namespace which should otherwise have been released.

Unmanaged tunnel sockets now drop their namespace reference via sk_change_net,
and are released in a new pernet exit callback, l2tp_exit_net.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05 14:20:30 -05:00
Tom Parkin
cbb95e0ca9 l2tp: prevent tunnel creation on netns mismatch
l2tp_tunnel_create is passed a pointer to the network namespace for the
tunnel, along with an optional file descriptor for the tunnel which may
be passed in from userspace via. netlink.

In the case where the file descriptor is defined, ensure that the namespace
associated with that socket matches the namespace explicitly passed to
l2tp_tunnel_create.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05 14:20:30 -05:00
Tom Parkin
b6fdfdfab0 l2tp: set netnsok flag for netlink messages
The L2TP netlink code can run in namespaces.  Set the netnsok flag in
genl_family to true to reflect that fact.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05 14:20:30 -05:00
Tom Parkin
f8ccac0e44 l2tp: put tunnel socket release on a workqueue
To allow l2tp_tunnel_delete to be called from an atomic context, place the
tunnel socket release calls on a workqueue for asynchronous execution.

Tunnel memory is eventually freed in the tunnel socket destructor.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05 14:20:30 -05:00
David S. Miller
188d1f76d0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/intel/e1000e/ethtool.c
	drivers/net/vmxnet3/vmxnet3_drv.c
	drivers/net/wireless/iwlwifi/dvm/tx.c
	net/ipv6/route.c

The ipv6 route.c conflict is simple, just ignore the 'net' side change
as we fixed the same problem in 'net-next' by eliminating cached
neighbours from ipv6 routes.

The e1000e conflict is an addition of a new statistic in the ethtool
code, trivial.

The vmxnet3 conflict is about one change in 'net' removing a guarding
conditional, whilst in 'net-next' we had a netdev_info() conversion.

The iwlwifi conflict is dealing with a WARN_ON() conversion in
'net-next' vs. a revert happening in 'net'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05 14:12:20 -05:00
John W. Linville
4c52d3d3fd Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth 2013-02-04 16:40:07 -05:00
David S. Miller
27000929ef ipcomp: Mark as netns_ok.
This module is namespace aware, netns_ok was just disabled by default
for sanity.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-04 15:46:15 -05:00
Jean Sacren
56db1c5f41 mcast: do not check 'rv' twice in a row
With the loop, don't check 'rv' twice in a row. Without the loop, 'rv'
doesn't even need to be checked.

Make the comment more grammar-friendly.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-04 13:26:49 -05:00
Ying Xue
25cc4ae913 net: remove redundant check for timer pending state before del_timer
As in del_timer() there has already placed a timer_pending() function
to check whether the timer to be deleted is pending or not, it's
unnecessary to check timer pending state again before del_timer() is
called.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-04 13:26:49 -05:00
Daniel Borkmann
8c98653f05 sctp: sctp_close: fix release of bindings for deferred call_rcu's
It seems due to RCU usage, i.e. within SCTP's address binding list,
a, say, ``behavioral change'' was introduced which does actually
not conform to the RFC anymore. In particular consider the following
(fictional) scenario to demonstrate this:

  do:
    Two SOCK_SEQPACKET-style sockets are opened (S1, S2)
    S1 is bound to 127.0.0.1, port 1024 [server]
    S2 is bound to 127.0.0.1, port 1025 [client]
    listen(2) is invoked on S1
    From S2 we call one sendmsg(2) with msg.msg_name and
       msg.msg_namelen parameters set to the server's
       address
    S1, S2 are closed
    goto do

The first pass of this loop passes successful, while the second round
fails during binding of S1 (address still in use). What is happening?
In the first round, the initial handshake is being done, and, at the
time close(2) is called on S1, a non-graceful shutdown is performed via
ABORT since in S1's receive queue an unprocessed packet is present,
thus stating an error condition. This can be considered as a correct
behavior.

During close also all bound addresses are freed, thus nothing *must*
be active anymore. In reference to RFC2960:

  After checking the Verification Tag, the receiving endpoint shall
  remove the association from its record, and shall report the
  termination to its upper layer. (9.1 Abort of an Association)

Also, no half-open states are supported, thus after an ungraceful
shutdown, we leave nothing behind. However, this seems not to be
happening though. In a real-world scenario, this is exactly where
it breaks the lksctp-tools functional test suite, *for instance*:

  ./test_sockopt
  test_sockopt.c  1 PASS : getsockopt(SCTP_STATUS) on a socket with no assoc
  test_sockopt.c  2 PASS : getsockopt(SCTP_STATUS)
  test_sockopt.c  3 PASS : getsockopt(SCTP_STATUS) with invalid associd
  test_sockopt.c  4 PASS : getsockopt(SCTP_STATUS) with NULL associd
  test_sockopt.c  5 BROK : bind: Address already in use

The underlying problem is that sctp_endpoint_destroy() hasn't been
triggered yet while the next bind attempt is being done. It will be
triggered eventually (but too late) by sctp_transport_destroy_rcu()
after one RCU grace period:

  sctp_transport_destroy()
    sctp_transport_destroy_rcu() ----.
      sctp_association_put() [*]  <--+--> sctp_packet_free()
        sctp_association_destroy()          [...]
          sctp_endpoint_put()                 skb->destructor
            sctp_endpoint_destroy()             sctp_wfree()
              sctp_bind_addr_free()               sctp_association_put() [*]

Thus, we move out the condition with sctp_association_put() as well as
the sctp_packet_free() invocation and the issue can be solved. We also
better free the SCTP chunks first before putting the ref of the association.

With this patch, the example above (which simulates a similar scenario
as in the implementation of this test case) and therefore also the test
suite run successfully through. Tested by myself.

Cc: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-04 13:22:33 -05:00
Gao feng
e4d343ea92 netns: bridge: allow unprivileged users add/delete mdb entry
since the mdb table is belong to bridge device,and the
bridge device can only be seen in one netns.
So it's safe to allow unprivileged user which is the
creator of userns and netns to modify the mdb table.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-04 13:12:16 -05:00
Gao feng
bb12b8b26e netns: ebtable: allow unprivileged users to operate ebtables
ebt_table is a private resource of netns, operating ebtables
in one netns will not affect other netns, we can allow the
creator user of userns and netns to change the ebtables.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-04 13:12:16 -05:00
Gao feng
c5c351088a netns: fdb: allow unprivileged users to add/del fdb entries
Right now,only ixgdb,macvlan,vxlan and bridge implement
fdb_add/fdb_del operations.

these operations only operate the private data of net
device. So allowing the unprivileged users who creates
the userns and netns to add/del fdb entries will do no
harm to other netns.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-04 13:12:16 -05:00
Vijay Subramanian
5f1e942cb4 tcp: ipv6: Update MIB counters for drops
This patch updates LINUX_MIB_LISTENDROPS and LINUX_MIB_LISTENOVERFLOWS in
tcp_v6_conn_request() and tcp_v6_err(). tcp_v6_conn_request() in particular can
drop SYNs for various reasons which are not currently tracked.

Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-04 13:06:27 -05:00
Vijay Subramanian
848bf15f36 tcp: Update MIB counters for drops
This patch updates LINUX_MIB_LISTENDROPS in tcp_v4_conn_request() and
tcp_v4_err(). tcp_v4_conn_request() in particular can drop SYNs for various
reasons which are not currently tracked.

Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-04 13:06:27 -05:00
Phil Sutter
9665d5d624 packet: fix leakage of tx_ring memory
When releasing a packet socket, the routine packet_set_ring() is reused
to free rings instead of allocating them. But when calling it for the
first time, it fills req->tp_block_nr with the value of rb->pg_vec_len
which in the second invocation makes it bail out since req->tp_block_nr
is greater zero but req->tp_block_size is zero.

This patch solves the problem by passing a zeroed auto-variable to
packet_set_ring() upon each invocation from packet_release().

As far as I can tell, this issue exists even since 69e3c75 (net: TX_RING
and packet mmap), i.e. the original inclusion of TX ring support into
af_packet, but applies only to sockets with both RX and TX ring
allocated, which is probably why this was unnoticed all the time.

Signed-off-by: Phil Sutter <phil.sutter@viprinet.com>
Cc: Johann Baudy <johann.baudy@gnu-log.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-03 16:15:23 -05:00
Pravin B Shelar
92df9b217e net: Fix inner_network_header assignment in skb-copy.
Use correct inner offset to set inner_network_offset.
Found by inspection.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-03 16:10:36 -05:00
Eric Dumazet
2e5f421211 tcp: frto should not set snd_cwnd to 0
Commit 9dc274151a (tcp: fix ABC in tcp_slow_start())
uncovered a bug in FRTO code :
tcp_process_frto() is setting snd_cwnd to 0 if the number
of in flight packets is 0.

As Neal pointed out, if no packet is in flight we lost our
chance to disambiguate whether a loss timeout was spurious.

We should assume it was a proper loss.

Reported-by: Pasi Kärkkäinen <pasik@iki.fi>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-03 16:00:25 -05:00
Eric Dumazet
973ec449bb tcp: fix an infinite loop in tcp_slow_start()
Since commit 9dc274151a (tcp: fix ABC in tcp_slow_start()),
a nul snd_cwnd triggers an infinite loop in tcp_slow_start()

Avoid this infinite loop and log a one time error for further
analysis. FRTO code is suspected to cause this bug.

Reported-by: Pasi Kärkkäinen <pasik@iki.fi>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-03 16:00:25 -05:00
John W. Linville
ed6882ac40 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2013-02-01 13:43:25 -05:00
Andre Guedes
a3d0935649 Bluetooth: Refactor mgmt_pending_foreach
This patch does a trivial refactor in mgmt_pending_foreach function.
It replaces list_for_each_safe by list_for_each_entry_safe, simplifying
the function.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-02-01 15:50:18 -02:00
Andre Guedes
2b8a9a2e6a Bluetooth: Remove unneeded locking
This patch removes unneeded locking in hci_le_adv_report_evt. There
is no need to lock hdev before calling mgmt_device_found.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-02-01 15:50:18 -02:00
Andre Guedes
405280887f Bluetooth: Reduce critical section in sco_conn_ready
This patch reduces the critical section protected by sco_conn_lock in
sco_conn_ready function. The lock is acquired only when it is really
needed.

This patch fixes the following lockdep warning which is generated
when the host terminates a SCO connection.

Today, this warning is a false positive. There is no way those
two threads reported by lockdep are running at the same time since
hdev->workqueue (where rx_work is queued) is single-thread. However,
if somehow this behavior is changed in future, we will have a
potential deadlock.

======================================================
[ INFO: possible circular locking dependency detected ]
3.8.0-rc1+ #7 Not tainted
-------------------------------------------------------
kworker/u:1H/1018 is trying to acquire lock:
 (&(&conn->lock)->rlock){+.+...}, at: [<ffffffffa0033ba6>] sco_chan_del+0x66/0x190 [bluetooth]

but task is already holding lock:
 (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}, at: [<ffffffffa0033d5a>] sco_conn_del+0x8a/0xe0 [bluetooth]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}:
       [<ffffffff81083011>] lock_acquire+0xb1/0xe0
       [<ffffffff813efd01>] _raw_spin_lock+0x41/0x80
       [<ffffffffa003436e>] sco_connect_cfm+0xbe/0x350 [bluetooth]
       [<ffffffffa0015d6c>] hci_event_packet+0xd3c/0x29b0 [bluetooth]
       [<ffffffffa0004583>] hci_rx_work+0x133/0x870 [bluetooth]
       [<ffffffff8104d65f>] process_one_work+0x2bf/0x4f0
       [<ffffffff81050022>] worker_thread+0x2b2/0x3e0
       [<ffffffff81056021>] kthread+0xd1/0xe0
       [<ffffffff813f14bc>] ret_from_fork+0x7c/0xb0

-> #0 (&(&conn->lock)->rlock){+.+...}:
       [<ffffffff81082215>] __lock_acquire+0x1465/0x1c70
       [<ffffffff81083011>] lock_acquire+0xb1/0xe0
       [<ffffffff813efd01>] _raw_spin_lock+0x41/0x80
       [<ffffffffa0033ba6>] sco_chan_del+0x66/0x190 [bluetooth]
       [<ffffffffa0033d6d>] sco_conn_del+0x9d/0xe0 [bluetooth]
       [<ffffffffa0034653>] sco_disconn_cfm+0x53/0x60 [bluetooth]
       [<ffffffffa000fef3>] hci_disconn_complete_evt.isra.54+0x363/0x3c0 [bluetooth]
       [<ffffffffa00150f7>] hci_event_packet+0xc7/0x29b0 [bluetooth]
       [<ffffffffa0004583>] hci_rx_work+0x133/0x870 [bluetooth]
       [<ffffffff8104d65f>] process_one_work+0x2bf/0x4f0
       [<ffffffff81050022>] worker_thread+0x2b2/0x3e0
       [<ffffffff81056021>] kthread+0xd1/0xe0
       [<ffffffff813f14bc>] ret_from_fork+0x7c/0xb0

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(slock-AF_BLUETOOTH-BTPROTO_SCO);
                               lock(&(&conn->lock)->rlock);
                               lock(slock-AF_BLUETOOTH-BTPROTO_SCO);
  lock(&(&conn->lock)->rlock);

 *** DEADLOCK ***

4 locks held by kworker/u:1H/1018:
 #0:  (hdev->name#2){.+.+.+}, at: [<ffffffff8104d5f8>] process_one_work+0x258/0x4f0
 #1:  ((&hdev->rx_work)){+.+.+.}, at: [<ffffffff8104d5f8>] process_one_work+0x258/0x4f0
 #2:  (&hdev->lock){+.+.+.}, at: [<ffffffffa000fbe9>] hci_disconn_complete_evt.isra.54+0x59/0x3c0 [bluetooth]
 #3:  (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}, at: [<ffffffffa0033d5a>] sco_conn_del+0x8a/0xe0 [bluetooth]

stack backtrace:
Pid: 1018, comm: kworker/u:1H Not tainted 3.8.0-rc1+ #7
Call Trace:
 [<ffffffff813e92f9>] print_circular_bug+0x1fb/0x20c
 [<ffffffff81082215>] __lock_acquire+0x1465/0x1c70
 [<ffffffff81083011>] lock_acquire+0xb1/0xe0
 [<ffffffffa0033ba6>] ? sco_chan_del+0x66/0x190 [bluetooth]
 [<ffffffff813efd01>] _raw_spin_lock+0x41/0x80
 [<ffffffffa0033ba6>] ? sco_chan_del+0x66/0x190 [bluetooth]
 [<ffffffffa0033ba6>] sco_chan_del+0x66/0x190 [bluetooth]
 [<ffffffffa0033d6d>] sco_conn_del+0x9d/0xe0 [bluetooth]
 [<ffffffffa0034653>] sco_disconn_cfm+0x53/0x60 [bluetooth]
 [<ffffffffa000fef3>] hci_disconn_complete_evt.isra.54+0x363/0x3c0 [bluetooth]
 [<ffffffffa000fbd0>] ? hci_disconn_complete_evt.isra.54+0x40/0x3c0 [bluetooth]
 [<ffffffffa00150f7>] hci_event_packet+0xc7/0x29b0 [bluetooth]
 [<ffffffff81202e90>] ? __dynamic_pr_debug+0x80/0x90
 [<ffffffff8133ff7d>] ? kfree_skb+0x2d/0x40
 [<ffffffffa0021644>] ? hci_send_to_monitor+0x1a4/0x1c0 [bluetooth]
 [<ffffffffa0004583>] hci_rx_work+0x133/0x870 [bluetooth]
 [<ffffffff8104d5f8>] ? process_one_work+0x258/0x4f0
 [<ffffffff8104d65f>] process_one_work+0x2bf/0x4f0
 [<ffffffff8104d5f8>] ? process_one_work+0x258/0x4f0
 [<ffffffff8104fdc1>] ? worker_thread+0x51/0x3e0
 [<ffffffffa0004450>] ? hci_tx_work+0x800/0x800 [bluetooth]
 [<ffffffff81050022>] worker_thread+0x2b2/0x3e0
 [<ffffffff8104fd70>] ? busy_worker_rebind_fn+0x100/0x100
 [<ffffffff81056021>] kthread+0xd1/0xe0
 [<ffffffff81055f50>] ? flush_kthread_worker+0xc0/0xc0
 [<ffffffff813f14bc>] ret_from_fork+0x7c/0xb0
 [<ffffffff81055f50>] ? flush_kthread_worker+0xc0/0xc0

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-02-01 15:50:18 -02:00
Johan Hedberg
3810285cf8 Bluetooth: Increment Management interface revision
This patch increments the management interface revision due to the
various fixes, improvements and other changes that have gone in lately.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-02-01 15:50:18 -02:00
Johan Hedberg
f0ff92fbfa Bluetooth: Fix link security setting when powering on
If a controller is powered on while the HCI_AUTO_OFF flag is set the
link security setting (HCI_LINK_SECURITY) might not be in sync with the
actual state of the controller (HCI_AUTH). This patch fixes the issue by
checking for inequality between the intended and actual settings and
sends a HCI_Write_Auth_Enable command if necessary.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-02-01 15:50:18 -02:00
Johan Hedberg
c00d575bd5 Bluetooth: Add support for 128-bit UUIDs in EIR data
This patch adds the necessary code for encoding a list of 128-bit UUIDs
into the EIR data.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-02-01 15:50:17 -02:00
Johan Hedberg
cdf1963f7b Bluetooth: Add support for 32-bit UUIDs in EIR data
This patch adds the necessary code for inserting a list of 32-bit UUIDs
into the EIR data.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-02-01 15:50:17 -02:00
Johan Hedberg
213202edc9 Bluetooth: Refactor UUID-16 list generation into its own function
We will need to create three separate UUID lists in the EIR data (for
16, 32 and 128 bit UUIDs) so the code is easier to follow if each list
is generated in their own function.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-02-01 15:50:17 -02:00
Johan Hedberg
892bbc5794 Bluetooth: Remove useless eir_len variable from EIR creation
The amount of data encoded so far in the create_eir() function can be
calculated simply through the difference between the data and ptr
pointer variables. The eir_len variable then becomes essentially
useless.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-02-01 15:50:17 -02:00
Johan Hedberg
a10f27cf42 Bluetooth: Simplify UUID16 list generation for EIR
There's no need to use two separate loops to generate a UUID list for
the EIR data. This patch merges the two loops previously used for the
16-bit UUID list generation into a single loop, thus simplifying the
code a great deal.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-02-01 15:50:17 -02:00
Johan Hedberg
056341c8cb Bluetooth: Simplify UUID removal code
The UUID removal code can be simplified by using
list_for_each_entry_safe instead of list_for_each_safe.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-02-01 15:50:17 -02:00
Johan Hedberg
83be8eca2e Bluetooth: Keep track of UUID type upon addition
The primary purpose of the UUIDs is to enable generation of EIR and AD
data. In these data formats the UUIDs are split into separate fields
based on whether they're 16, 32 or 128 bit UUIDs. To make the generation
of these data fields simpler this patch adds a type member to the
bt_uuid struct and assigns a value to it as soon as the UUID is added to
the kernel. This way the type doesn't need to be calculated each time
the UUID list is later iterated.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-02-01 15:50:17 -02:00
Johan Hedberg
4821002ce2 Bluetooth: Simplify UUIDs clearing code
The code for clearing the UUIDs list can be simplified by using
list_for_each_entry_safe instead of list_for_each_safe.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-02-01 15:50:16 -02:00
Johan Hedberg
de66aa6305 Bluetooth: Store UUIDs in the same order that they were added
We should be encoding UUIDs to the EIR data in the same order that they
were added to the kernel, i.e. each UUID should be added to the end of
the UUIDs list. This patch fixes the issue by using list_add_tail
instead of list_add for storing the UUIDs.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-02-01 15:50:16 -02:00
Li RongQing
fa8599db8f xfrm: fix a unbalanced lock
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-02-01 10:33:40 +01:00
Jussi Kivilinna
7e50f84c94 pf_key/xfrm_algo: prepare pf_key and xfrm_algo for new algorithms without pfkey support
Mark existing algorithms as pfkey supported and make pfkey only use algorithms
that have pfkey_supported set.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-02-01 10:13:43 +01:00
Paul Gortmaker
6fcdf4facb wanrouter: delete now orphaned header content, files/drivers
The wanrouter support was identified earlier as unused for years,
and so the previous commit totally decoupled it from the kernel,
leaving the related wanrouter files present, but totally inert.

Here we take the final step in that cleanup, by doing a wholesale
removal of these files.  The two step process is used so that the
large deletion is decoupled from the git history of files that we
still care about.

The drivers deleted here all were dependent on the Kconfig setting
CONFIG_WAN_ROUTER_DRIVERS.

A stub wanrouter.h header (kernel & uapi) are left behind so that
drivers/isdn/i4l/isdn_x25iface.c continues to compile, and so that
we don't accidentally break userspace that expected these defines.

Cc: Joe Perches <joe@perches.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2013-01-31 19:56:35 -05:00
Paul Gortmaker
a786a7c0ad wanrouter: completely decouple obsolete code from kernel.
The original suggestion to delete wanrouter started earlier
with the mainline commit f0d1b3c2bc
("net/wanrouter: Deprecate and schedule for removal") in May 2012.

More importantly, Dan Carpenter found[1] that the driver had a
fundamental breakage introduced back in 2008, with commit
7be6065b39 ("netdevice wanrouter: Convert directly reference of
netdev->priv").  So we know with certainty that the code hasn't been
used by anyone willing to at least take the effort to send an e-mail
report of breakage for at least 4 years.

This commit does a decouple of the wanrouter subsystem, by going
after the Makefile/Kconfig and similar files, so that these mainline
files that we are keeping do not have the big wanrouter file/driver
deletion commit tied into their history.

Once this commit is in place, we then can remove the obsolete cyclomx
drivers and similar that have a dependency on CONFIG_WAN_ROUTER_DRIVERS.

[1] http://www.spinics.net/lists/netdev/msg218670.html

Originally-by: Joe Perches <joe@perches.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2013-01-31 19:20:33 -05:00
Linus Torvalds
bf6c8a8148 NFS client bugfixe for Linux 3.8
- Error reporting in nfs_xdev_mount incorrectly maps all errors to ENOMEM
 - Fix an NFSv4 refcounting issue
 - Fix a mount failure when the server reboots during NFSv4 trunking discovery
 - NFSv4.1 mounts may need to run the lease recovery thread.
 - Don't silently fail setattr() requests on mountpoints
 - Fix a SUNRPC socket/transport livelock and priority queue issue
 - We must handle NFS4ERR_DELAY when resetting the NFSv4.1 session.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJRCpS4AAoJEGcL54qWCgDyqucP/2CTv5leu+X5/0PXBOykAIHg
 8oEsTEz7/4IvIxTXzuHDYirMnm/mulfGF6NrdPqfvxpAHqRfVBfLFocfNLMVhQci
 97RmBfEEGM22AToYUubML5bIxr0QllV4s9Vmyh/zGDan52y7zNNlZX+v6aLjZbJB
 Fbolihpcch6lhQEUNAzK0B0ddimDl9lazx/WTmMOD/JrwOqzA4FJC+YxBe88nfzQ
 c6sYyEptBaSirbCOlueqGpv8skB1CLpFJXguXToPXFxpWed6uoGrIwLO7MLUdFpJ
 Xw+j8cuv/wjyYJGVKjhW7kXtwK8T7+u4bT2L883R01XYXr8XfkkLON0dgG1X/unk
 80mLzCO1+qRdoDSQ4b/V4B0nScPRCJuoZpftjCi2uhKewNcxQPMZ2V5/D7pO3uyE
 NdhxByB8D86JfNrIcBcRaxfuiQsurQBDvsDNmWPBZSOmH/dmHqTQGLcIe6N94A0B
 c7KFtXrN2MzOl8S68dUpbhftObq9X0oK2oxFFLWRQoqjrFtiDLU5JilV0bEH2CMo
 gJX7CRrPoJ2wmNToKdPJRWBYDmtMMIThq3vIpCj1FFzX17r5grJpqCmp/TViu/ER
 r8rmJzni+nmHaO1NLJMTCHzLJ8soiKyBq8PZKAbipTJxy+TXI/69jnWzpIQ/pbM/
 JN+0tiCcqmUmXsO+hrCP
 =lWFV
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.8-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:

 - Error reporting in nfs_xdev_mount incorrectly maps all errors to
   ENOMEM

 - Fix an NFSv4 refcounting issue

 - Fix a mount failure when the server reboots during NFSv4 trunking
   discovery

 - NFSv4.1 mounts may need to run the lease recovery thread.

 - Don't silently fail setattr() requests on mountpoints

 - Fix a SUNRPC socket/transport livelock and priority queue issue

 - We must handle NFS4ERR_DELAY when resetting the NFSv4.1 session.

* tag 'nfs-for-3.8-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv4.1: Handle NFS4ERR_DELAY when resetting the NFSv4.1 session
  SUNRPC: When changing the queue priority, ensure that we change the owner
  NFS: Don't silently fail setattr() requests on mountpoints
  NFSv4.1: Ensure that nfs41_walk_client_list() does start lease recovery
  NFSv4: Fix NFSv4 trunking discovery
  NFSv4: Fix NFSv4 reference counting for trunked sessions
  NFS: Fix error reporting in nfs_xdev_mount
2013-02-01 08:43:52 +11:00
Yuchung Cheng
66555e92fb tcp: detect SYN/data drop when F-RTO is disabled
On receiving the SYN-ACK, Fast Open checks icsk_retransmit for SYN
retransmission to detect SYN/data drops. But if F-RTO is disabled,
icsk_retransmit is reset at step D of tcp_fastretrans_alert() (
under tcp_ack()) before tcp_rcv_fastopen_synack(). The fix is to use
total_retrans instead which accounts for SYN retransmission regardless
the use of F-RTO.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-31 14:20:07 -05:00
Tom Parkin
700163db3d l2tp: correctly handle ancillary data in the ip6 recv path
l2tp_ip6 is incorrectly using the IPv4-specific ip_cmsg_recv to handle
ancillary data.  This means that socket options such as IPV6_RECVPKTINFO are
not honoured in userspace.

Convert l2tp_ip6 to use the IPv6-specific handler.

Ref: net/ipv6/udp.c

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: Chris Elston <celston@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-31 13:53:09 -05:00
Tom Parkin
8e72d37eb3 ipv6: export ip6_datagram_recv_ctl
ip6_datagram_recv_ctl and ip6_datagram_send_ctl are used for handling IPv6
ancillary data.  Since ip6_datagram_send_ctl is already publicly exported for
use in modules, ip6_datagram_recv_ctl should also be available to support
ancillary data in the receive path.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-31 13:53:08 -05:00
Tom Parkin
73df66f8b1 ipv6: rename datagram_send_ctl and datagram_recv_ctl
The datagram_*_ctl functions in net/ipv6/datagram.c are IPv6-specific.  Since
datagram_send_ctl is publicly exported it should be appropriately named to
reflect the fact that it's for IPv6 only.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-31 13:53:08 -05:00
Andre Guedes
4c02e2d444 Bluetooth: Fix hci_conn timeout routine
If occurs a LE or SCO hci_conn timeout and the connection is already
established (BT_CONNECTED state), the connection is not terminated as
expected. This bug can be reproduced using l2test or scotest tool.
Once the connection is established, kill l2test/scotest and the
connection won't be terminated.

This patch fixes hci_conn_disconnect helper so it is able to
terminate LE and SCO connections, as well as ACL.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-01-31 15:38:02 -02:00
Johan Hedberg
8cf9fa1240 Bluetooth: Fix handling of unexpected SMP PDUs
The conn->smp_chan pointer can be NULL if SMP PDUs arrive at unexpected
moments. To avoid NULL pointer dereferences the code should be checking
for this and disconnect if an unexpected SMP PDU arrives. This patch
fixes the issue by adding a check for conn->smp_chan for all other PDUs
except pairing request and security request (which are are the first
PDUs to come to initialize the SMP context).

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
CC: stable@vger.kernel.org
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-01-31 15:35:42 -02:00
YOSHIFUJI Hideaki / 吉藤英明
c33e7b05f1 ipv6 anycast: Convert ipv6_sk_ac_lock to spinlock.
Since all users are write-lock, it does not make sense to use
rwlock here.  Use simple spinlock.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-30 22:41:13 -05:00
YOSHIFUJI Hideaki / 吉藤英明
18367681a1 ipv6 flowlabel: Convert np->ipv6_fl_list to RCU.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-30 22:41:13 -05:00
YOSHIFUJI Hideaki / 吉藤英明
d3aedd5ebd ipv6 flowlabel: Convert hash list to RCU.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-30 22:41:13 -05:00
YOSHIFUJI Hideaki / 吉藤英明
f256dc59d0 ipv6 flowlabel: Ensure to take lock when modifying np->ip6_sk_fl_list.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-30 22:41:12 -05:00
Marcelo Ricardo Leitner
bd30e94720 ipv6: do not create neighbor entries for local delivery
They will be created at output, if ever needed. This avoids creating
empty neighbor entries when TPROXYing/Forwarding packets for addresses
that are not even directly reachable.

Note that IPv4 already handles it this way. No neighbor entries are
created for local input.

Tested by myself and customer.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-30 20:26:07 -05:00
Trond Myklebust
edd2e36fe8 SUNRPC: When changing the queue priority, ensure that we change the owner
This fixes a livelock in the xprt->sending queue where we end up never
making progress on lower priority tasks because sleep_on_priority()
keeps adding new tasks with the same owner to the head of the queue,
and priority bumps mean that we keep resetting the queue->owner to
whatever task is at the head of the queue.

Regression introduced by commit c05eecf636
(SUNRPC: Don't allow low priority tasks to pre-empt higher priority ones).

Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-01-30 17:45:14 -05:00
John W. Linville
20fb9e5033 Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next 2013-01-30 14:22:19 -05:00
John W. Linville
0f496df2d9 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2013-01-30 14:21:04 -05:00