Commit Graph

28935 Commits

Author SHA1 Message Date
Daniel Borkmann
b527fe6933 net: sctp: minor: sctp_seq_dump_local_addrs add missing newline
A trailing newline has been forgotten to add into the WARN().

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-25 16:33:04 -07:00
Daniel Borkmann
52db882f3f net: sctp: migrate cookie life from timeval to ktime
Currently, SCTP code defines its own timeval functions (since timeval
is rarely used inside the kernel by others), namely tv_lt() and
TIMEVAL_ADD() macros, that operate on SCTP cookie expiration.

We might as well remove all those, and operate directly on ktime
structures for a couple of reasons: ktime is available on all archs;
complexity of ktime calculations depending on the arch is less than
(reduces to a simple arithmetic operations on archs with
BITS_PER_LONG == 64 or CONFIG_KTIME_SCALAR) or equal to timeval
functions (other archs); code becomes more readable; macros can be
thrown out.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-25 16:33:04 -07:00
Hannes Frederic Sowa
2b9651d72d ipv6: remove old token ipv6 address as soon as possible
If the tokenized ip address is re-set on an interface we depend on the
arrival of a new router advertisment to call addrconf_verify to clean
up the old address (which valid_lft is now set to 0). Old addresses can
linger around for a longer time if e.g. the source of router advertisments
vanishes.

So, call addrconf_verify immediately after setting the new tokenized
address to get rid of the old tokenized addresses.

Cc: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-25 16:28:24 -07:00
Hannes Frederic Sowa
dc8482926e ipv6: check return value of ipv6_get_lladdr
We should check the return value of ipv6_get_lladdr in inet6_set_iftoken.

A possible situation, which could leave ll_addr unassigned is, when
the user removed her link-local address but a global scoped address was
already set. In this case the interface would still be IF_READY and not
dead. In that case the RS source address is some value from the stack.

v2: Daniel Borkmann noted a small indent inconstancy; no semantic
changes.

Cc: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Reviewed-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-25 16:27:28 -07:00
Hannes Frederic Sowa
876fd05ddb ipv6: don't disable interface if last ipv6 address is removed
The reason behind this change is that as soon as we delete
the last ipv6 address of an interface we also lose the
/proc/sys/net/ipv6/conf/<interface> directory. This seems to be a
usability problem for me.

I don't see any reason why we should shutdown ipv6 on that interface in
such cases.

Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-25 16:23:03 -07:00
Hannes Frederic Sowa
b7b1bfce0b ipv6: split duplicate address detection and router solicitation timer
This patch splits the timers for duplicate address detection and router
solicitations apart. The router solicitations timer goes into inet6_dev
and the dad timer stays in inet6_ifaddr.

The reason behind this patch is to reduce the number of unneeded router
solicitations send out by the host if additional link-local addresses
are created. Currently we send out RS for every link-local address on
an interface.

If the RS timer fires we pick a source address with ipv6_get_lladdr. This
change could hurt people adding additional link-local addresses and
specifying these addresses in the radvd clients section because we
no longer guarantee that we use every ll address as source address in
router solicitations.

Cc: Flavio Leitner <fleitner@redhat.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David Stevens <dlstevens@us.ibm.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Reviewed-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-25 16:23:03 -07:00
Eric Dumazet
bd8a7036c0 gre: fix a possible skb leak
commit 68c3316311 ("v4 GRE: Add TCP segmentation offload for GRE")
added a possible skb leak, because it frees only the head of segment
list, in case a skb_linearize() call fails.

This patch adds a kfree_skb_list() helper to fix the bug.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Pravin B Shelar <pshelar@nicira.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-25 16:07:44 -07:00
David S. Miller
2b7a5db060 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:

====================
A few more late-breaking fixes hoping for 3.10...

Regarding the Bluetooth fix, Gustavo says:

"A important fix to 3.10, this patch fixes an issues that was preventing
the l2cap info response command to be handled properly."

Also for that Bluetooth fix, Johan adds:

"Once the code gives up parsing this PDU it also gives up essential
parts of the L2CAP connection creation process, i.e. without this
patch the stack will fail to establish connections properly."

Moving onto ath9k, Felix Fietkau fixes an RCU locking issue in
the transmit path.  As for ath9k_htc, Sujith Manoharan fixes some
authentication timeouts by ensuring that a chip reset is done when
IDLE is turned off.

I think these are all micro-fixes that shouldn't cause any trouble.
Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-25 16:04:35 -07:00
YOSHIFUJI Hideaki / 吉藤英明
ab4eb3537e ipv6: Process unicast packet with Router Alert by checking flag in skb.
Router Alert option is marked in skb.
Previously, IP6CB(skb)->ra was set to positive value for such packets.
Since commit dd3332bf ("ipv6: Store Router Alert option in IP6CB
directly."), IP6SKB_ROUTERALERT is set in IP6CB(skb)->flags, and
the value of Router Alert option (in network byte order) is set
to IP6CB(skb)->ra for such packets.

Multicast forwarding path uses that flag and value, but unicast
forwarding path does not use the flag and misuses IP6CB(skb)->ra
value.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-25 14:47:22 -07:00
John W. Linville
9d5c34f568 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2013-06-25 13:24:12 -04:00
Mike Rapoport
f693dff710 rtnetlink: allow using zero MAC address in rtnl_fdb_{add,del}
This is required for multiple default destinations management in VXLAN

Signed-off-by: Mike Rapoport <mike.rapoport@ravellosystems.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2013-06-25 09:31:39 -07:00
Eric Dumazet
6da334ee0c ipv6: add include file to suppress sparse warnings
commit f88c91ddba ("ipv6: statically link
register_inet6addr_notifier()" added following sparse warnings :

net/ipv6/addrconf_core.c:83:5: warning: symbol
'register_inet6addr_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c:89:5: warning: symbol
'unregister_inet6addr_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c:95:5: warning: symbol
'inet6addr_notifier_call_chain' was not declared. Should it be static?

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-25 02:44:05 -07:00
Daniel Borkmann
bcbde0d449 net: netlink: virtual tap device management
Similarly to the networking receive path with ptype_all taps, we add
the possibility to register netdevices that are for ARPHRD_NETLINK to
the netlink subsystem, so that those can be used for netlink analyzers
resp. debuggers. We do not offer a direct callback function as out-of-tree
modules could do crap with it. Instead, a netdevice must be registered
properly and only receives a clone, managed by the netlink layer. Symbols
are exported as GPL-only.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-24 16:39:05 -07:00
David S. Miller
a3d9dd89b7 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
The following patchset contains five fixes for Netfilter/IPVS, they are:

* A skb leak fix in fragmentation handling in case that helpers are in place,
  it occurs since the IPV6 NAT infrastructure, from Phil Oester.

* Fix SCTP port mangling in ICMP packets for IPVS, from Julian Anastasov.

* Fix event delivery in ctnetlink regarding the new connlabel infrastructure,
  from Florian Westphal.

* Fix mangling in the SIP NAT helper, from Balazs Peter Odor.

* Fix crash in ipt_ULOG introduced while adding netnamespace support,
  from Gao Feng.

I'll take care of passing several of these patches to -stable once they hit
Linus' tree.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-24 12:45:24 -07:00
John W. Linville
9fbdc75116 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2013-06-24 14:45:50 -04:00
John W. Linville
66ba271ab9 Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next 2013-06-24 14:44:59 -04:00
John W. Linville
57bf74407b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth 2013-06-24 13:53:15 -04:00
Gao feng
c8fc51cfa7 netfilter: ipt_ULOG: fix incorrect setting of ulog timer
The parameter of setup_timer should be &ulog->nlgroup[i].
the incorrect parameter will cause kernel panic in
ulog_timer.

Bug introducted in commit 355430671a
"netfilter: ipt_ULOG: add net namespace support for ipt_ULOG"

ebt_ULOG doesn't have this problem.

[ I have mangled this patch to fix nlgroup != 0 case, we were
  also crashing there --pablo ]

Tested-by: George Spelvin <linux@horizon.com>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-24 17:10:44 +02:00
Thomas Pedersen
6c7c4cbfd5 mac80211: initialize power mode for mesh STAs
Previously the default mesh STA nonpeer power mode was
UNKNOWN (0) make the default mesh STA power mode ACTIVE,
to prevent unnecessary  frame buffering while peering is
not yet complete. Fixes a panic in ath9k_htc when adding
stations from userspace, and mcast buffered frames are
later released.

Thanks to Bob Copeland for his help debugging this.

Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-24 15:59:20 +02:00
Thomas Pedersen
ac49e1a896 mac80211: allow self-protected frame tx without sta
Useful for userspace mesh to authenticate and peer without
a station entry, since both steps may fail anyway.

Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-24 15:59:20 +02:00
Arend van Spriel
a33d402610 cfg80211: fix compilation warning for cfg80211_leave_all()
The following compilation issue popped up moving from v3.10-rc1 to
v3.10-rc6 after merging wireless-testing.

net/wireless/sysfs.c:86:13: error: 'cfg80211_leave_all' defined
but not used [-Werror=unused-function]

The function is only called when CONFIG_PM is enabled. Moving the
function under CONFIG_PM as well.

Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-24 15:57:32 +02:00
Ben Greear
f9bef3df52 wireless: check for dangling wdev->current_bss pointer
If it *is* still set when the netdev is being deleted,
then we are about to leak a pointer.  Warn and clean up
in that case.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-24 15:55:36 +02:00
Ben Greear
0e3a39b562 wireless: add comments about bss refcounting
Should help the next person that tries to understand
the bss refcounting logic.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-24 15:54:45 +02:00
Ben Greear
6f390908e5 wireless: Make sure __cfg80211_connect_result always puts bss
Otherwise, we can leak a bss reference.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-24 15:51:22 +02:00
Florian Westphal
797a7d66d2 netfilter: ctnetlink: send event when conntrack label was modified
commit 0ceabd8387
(netfilter: ctnetlink: deliver labels to userspace) sets the event bit
when we raced with another packet, instead of raising the event bit
when the label bit is set for the first time.

commit 9b21f6a909
(netfilter: ctnetlink: allow userspace to modify labels) forgot to update
the event mask in the "conntrack already exists" case.

Both issues result in CTA_LABELS attribute not getting included in the
conntrack event.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-24 11:32:56 +02:00
Balazs Peter Odor
5aed93875c netfilter: nf_nat_sip: fix mangling
In (b20ab9c netfilter: nf_ct_helper: better logging for dropped packets)
there were some missing brackets around the logging information, thus
always returning drop.

Closes https://bugzilla.kernel.org/show_bug.cgi?id=60061

Signed-off-by: Balazs Peter Odor <balazs@obiserver.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-24 11:32:40 +02:00
Wedson Almeida Filho
aeb193ea6c net: Unmap fragment page once iterator is done
Callers of skb_seq_read() are currently forced to call skb_abort_seq_read()
even when consuming all the data because the last call to skb_seq_read (the
one that returns 0 to indicate the end) fails to unmap the last fragment page.

With this patch callers will be allowed to traverse the SKB data by calling
skb_prepare_seq_read() once and repeatedly calling skb_seq_read() as originally
intended (and documented in the original commit 677e90eda), that is, only call
skb_abort_seq_read() if the sequential read is actually aborted.

Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-24 01:46:01 -07:00
David S. Miller
7e2f934dc5 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
I would guess that this is the last big wireless pull request before
the 3.11 merge window...

Regarding the mac80211 bits, Johannes says:

"I have a number of mesh fixes and improvements from Colleen, Jacob,
Ashok and Thomas, powersave fixes in mac80211 from Alex, improved
management-TX from Antonio, and a few various things, including locking
fixes, from others and myself. Overall though, nothing really stands
out."

As for the iwlwifi bits, Johannes says:

"Emmanuel contributed two AP mode fixes, removed an unused field, fixed a
comment and added a warning for something that shouldn't happen in
practice, and I removed the declaration of a function that doesn't even
exist and cleaned up a small include."

"This time I have a number of cleanups, a small fix from Emmanuel and two
performance improvements that combined reduce our driver's CPU
utilisation as much as 75% in high TX-throughput scenarios."

"These two patches fix two issues with using rfkill randomly during
traffic, which would then cause our driver to stop working and not be
able to recover at all."

Regarding the ath6kl bits, Kalle says:

"Here are few simple patches for ath6kl. We have a suspend crash fix for
USB from Shafi, use of mac_pton(), a compiler warning fix and a fix for
module initialisation error path."

Kalle also sends the biggest single item of note, the new ath10k
driver for Qualcomm Atheros 802.11ac CQA98xx devices.

Included is an NFC pull, of which Samuel says:

"These are the pending NFC patches for the 3.11 merge window.

It contains the pending fixes that were on nfc-fixes (nfc-fixes-3.10-2),
along with a few more for the pn544 and pn533 drivers, the LLCP
disconnection path and an LLCP memory leak.

Highlights for this one are:

- An initial secure element API. NFC chipsets can carry an embedded
  secure element or get access to the SIM one. In both cases they
  control the secure elements and this API provides a way to discover,
  enable and disable the available SEs. It also exports that to
  userspace in order for SE focused middleware to actually do something
  with them (e.g. payments).

- NCI over SPI support. SPI is the most complex NCI specified transport
  layer and we now have support for it in the kernel. The next step will
  be to implement drivers for NCI chipsets using this transport like
  e.g. bcm2079x.

- NFC p2p hardware simulation driver. We now have an nfcsim driver that
  is mostly a loopback device between 2 NFC interfaces. It also
  implements the rest of the NFC core API like polling and target
  detection. This driver, with neard running on top of it, allows us to
  completely test the LLCP, SNEP and Handover implementation without
  physical hardware.

- A Firmware update netlink API. Most (All ?) HCI chipsets have a
  special firmware update mode where applications can push a new
  firmware that will be flashed. We now have a netlink API for providing
  that mode to e.g. nfctool."

On top of all that, there are a variety of updates to brcmfmac,
iwlegacy, rtlwifi, wil6210, and the TI wl12xx drivers.  As usual,
the bcma and ssb busses get a little love as well, as do a handful
of others here and there.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-24 00:31:02 -07:00
Pravin B Shelar
479b1a5825 openvswitch: Use correct config guard.
This bug was introduced by commit aa310701e7
(openvswitch: Add gre tunnel support.)

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-24 00:16:46 -07:00
Cong Wang
7c77602f57 bridge: fix a typo in comments
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-24 00:15:54 -07:00
Eric Dumazet
60877a32bc net: allow large number of tx queues
netif_alloc_netdev_queues() uses kcalloc() to allocate memory
for the "struct netdev_queue *_tx" array.

For large number of tx queues, kcalloc() might fail, so this
patch does a fallback to vzalloc().

As vmalloc() adds overhead on a critical network path, add __GFP_REPEAT
to kzalloc() flags to do this fallback only when really needed.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-23 23:56:55 -07:00
Asias He
a49dd9dcb5 VSOCK: Fix VSOCK_HASH and VSOCK_CONN_HASH
If we mod with VSOCK_HASH_SIZE -1, we get 0, 1, .... 249.  Actually, we
have vsock_bind_table[0 ... 250] and vsock_connected_table[0 .. 250].
In this case the last entry will never be used.

We should mod with VSOCK_HASH_SIZE instead.

Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Andy King <acking@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-23 23:51:48 -07:00
Asias He
0fc9324676 VSOCK: Remove unnecessary label
Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Andy King <acking@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-23 23:51:48 -07:00
Asias He
dce1a28777 VSOCK: Return VMCI_ERROR_NO_MEM when fails to allocate skb
vmci_transport_recv_dgram_cb always return VMCI_SUCESS even if we fail
to allocate skb, return VMCI_ERROR_NO_MEM instead.

Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Andy King <acking@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-23 23:51:48 -07:00
Asias He
b3a6dfe817 VSOCK: Introduce vsock_auto_bind helper
This peace of code is called three times, let's have a helper for it.

Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Andy King <acking@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-23 23:51:48 -07:00
Cong Wang
b33698e267 ipv6: remove a useless pr_info() in addrconf_gre_config()
This is debug info, should at least be pr_debug(), but given
that this code is in upstream for two years, there is no
need to keep this debugging printk any more, so just remove it.

Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-23 18:46:36 -07:00
Gustavo Padovan
b8f4e06800 Bluetooth: Improve comments on the HCI_Delete_Store_Link_Key issue
Some Bluetooth controllers doesn't support this command so we first
need to check for its support before sending it. This patch adds a
lengthful commentary about this.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 03:05:47 +01:00
Jaganath Kanakkassery
3f6fa3d489 Bluetooth: Fix invalid length check in l2cap_information_rsp()
The length check is invalid since the length varies with type of
info response.

This was introduced by the commit cb3b3152b2

Because of this, l2cap info rsp is not handled and command reject is sent.

> ACL data: handle 11 flags 0x02 dlen 16
        L2CAP(s): Info rsp: type 2 result 0
          Extended feature mask 0x00b8
            Enhanced Retransmission mode
            Streaming mode
            FCS Option
            Fixed Channels
< ACL data: handle 11 flags 0x00 dlen 10
        L2CAP(s): Command rej: reason 0
          Command not understood

Cc: stable@vger.kernel.org
Signed-off-by: Jaganath Kanakkassery <jaganath.k@samsung.com>
Signed-off-by: Chan-Yeol Park <chanyeol.park@samsung.com>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:24:58 +01:00
Chen Gang
673e1dd7ed Bluetooth: hidp: using strlcpy instead of strncpy, also beautify code.
For NULL terminated string, need always let it ended by zero.

Since have already called memcpy() to initialize 'ci', so need not
redundant initialization.

Better use ''if(session->hid) {} else if(session->input) {}"" instead
of ''if(session->hid) {}; if(session->input) {};''

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:53 +01:00
Andrei Emeltchenko
0a804654af Bluetooth: Remove unneeded flag
Remove HCI_LINK_KEYS flag since using HCI_MGMT is enough for test that
user space expects the kernel managing link keys.

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:53 +01:00
Andrei Emeltchenko
034cbea093 Bluetooth: Use HCI_MGMT instead of HCI_LINK_KEYS flag
Use HCI_MGMT flag instead of HCI_LINK_KEYS flag. There is a problem with
HCI_LINK_KEYS flag since it is set only when link keys are loaded. Otherwise
kernel assumes that old interface is used.

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:52 +01:00
Andre Guedes
12602d0cc0 Bluetooth: Mgmt Device Found Event
We only want to send Mgmt Device Found Events if we are running the
Device Discovery procedure (started by the MGMT Start Discovery
Command). Inquiry or LE scanning triggered by HCI raw interface (e.g.
hcitool) or kernel internals should not send Mgmt Device Found Events.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:52 +01:00
Andre Guedes
8892d8beb3 Bluetooth: Remove empty event handler
This patch removes the hci_cc_le_set_scan_param event handler. This
handler became empty because failures of this event are now handled
by start_discovery_complete function in mgmt.c.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:52 +01:00
Andre Guedes
b0434345f2 Bluetooth: Remove inquiry helpers
This patch removes hci_do_inquiry and hci_cancel_inquiry helpers. We
now use the HCI request framework in device discovery functionality
and these helpers are no longer needed.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:52 +01:00
Andre Guedes
917eedc56c Bluetooth: Remove LE scan helpers
This patch removes the LE scan helpers hci_le_scan and hci_cancel_
le_scan and all code related to it. We now use the HCI request
framework in device discovery functionality and these helpers are
no longer needed.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:51 +01:00
Andre Guedes
3fd319b830 Bluetooth: Refactor hci_cc_le_set_scan_enable
This patch does a trivial refactoring in hci_cc_le_set_scan_enable.
Since start and stop discovery command failures are now handled in
mgmt layer, the status check became empty. So, we can move it to
outside the switch statement.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:51 +01:00
Andre Guedes
1183fdcad4 Bluetooth: Make mgmt_stop_discovery_failed static
mgmt_stop_discovery_failed is now only used in mgmt.c so we can
make it a local function. This patch also moves the mgmt_stop_
discovery_failed definition up in mgmt.c to avoid forward
declaration.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:51 +01:00
Andre Guedes
82f4785ca7 Bluetooth: Remove stop discovery handling from hci_event.c
Since all mgmt stop discovery command complete events are now handled
in stop_discovery_complete callback in mgmt.c, we can remove this
handling from hci_event.c.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:51 +01:00
Andre Guedes
0e05bba6f6 Bluetooth: Update stop_discovery to use HCI request
This patch modifies the stop_discovery function so it uses the HCI
request framework.

The HCI request is built according to the current discovery state
(inquiry, LE scanning or name resolving) and a complete callback is
register to handle the command complete event for the stop discovery
command. This way, we move all stop_discovery mgmt handling code
spread in hci_event.c to a single place in mgmt.c.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:51 +01:00
Andre Guedes
4c87eaab01 Bluetooth: Use HCI request in interleaved discovery
In order to have a better HCI error handling in interleaved discovery
functionality, we should use the HCI request framework.

This patch updates le_scan_disable_work function so it uses the
HCI request framework instead of the hci_send_cmd helper. A complete
callback is registered (le_scan_disable_work_complete function) so we
are able to trigger the inquiry procedure (if we are running the
interleaved discovery) or to stop the discovery procedure (if we are
running LE-only discovery).

This patch also removes the extra logic in hci_cc_le_set_scan_enable
to trigger the inquiry procedure and the mgmt_interleaved_discovery
function since they become useless.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:50 +01:00
Andre Guedes
0d8cc935e0 Bluetooth: Move discovery macros to hci_core.h
Some of discovery macros will be used in hci_core so we need to
define them in common place such as hci_core.h. Thus, this patch
moves discovery macros to hci_core.h and also adds the DISCOV_
prefix to them.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:50 +01:00
Andre Guedes
41dc2bd6d1 Bluetooth: Make mgmt_start_discovery_failed static
mgmt_start_discovery_failed is now only used in mgmt.c so we can
make it a local function. This patch also moves the mgmt_start_
discovery_failed definition up in mgmt.c to avoid forward
declaration.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:50 +01:00
Andre Guedes
fef5234a79 Bluetooth: Remove start discovery handling from hci_event.c
Since all mgmt start discovery command complete events are now handled
in start_discovery_complete callback in mgmt.c, we can remove this
handling from hci_event.c.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:50 +01:00
Andre Guedes
7c3077207c Bluetooth: Update start_discovery to use HCI request
This patch modifies the start_discovery function so it uses the HCI
request framework.

We build the HCI request according to the discovery type (add inquiry
or LE scan HCI commands) and run the HCI request. We also register
the start_discovery_complete callback which handles mgmt command
complete events for this command. This way, we move all start_
discovery mgmt handling code spread in hci_event.c to a single place
in mgmt.c.

This patch also merges the LE-only and interleaved discovery type
cases since these cases are pretty much the same now.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:50 +01:00
Andre Guedes
1f9b9a5dc5 Bluetooth: Make inquiry_cache_flush non-static
In order to use HCI request framework in start_discovery, we'll need
to call inquiry_cache_flush in mgmt.c. Therefore, this patch adds the
hci_ prefix to inquiry_cache_flush and makes it non-static.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:49 +01:00
Johan Hedberg
44f3b0fbaa Bluetooth: Fix multiple LE socket handling
The LE ATT server socket needs to be superseded by any ATT client
sockets. Previously this was done by looking at the hcon->out variable
(indicating whether the connection is outgoing or incoming) which is a
too crude way of determining whether the server socket needs to be
picked or not (an outgoing connection doesn't necessarily mean that an
ATT client socket has triggered it).

This patch extends the ATT server socket lookup function
(l2cap_le_conn_ready) to be used for all LE connections (regardless of
the hcon->out value) and adds an internal check into the function for
the existence of any ATT client sockets (in which case the server socket
should be skipped). For this to work reliably all lookups must be done
while the l2cap_conn->chan_lock is held, meaning also that the call to
l2cap_chan_add needs to be changed to its lockless __l2cap_chan_add
counterpart.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:49 +01:00
Johan Hedberg
0cc59a72c7 Bluetooth: Remove useless hci_conn disc_timeout setting
There's no need to reset disc_timeout in l2cap_le_conn_ready since
HCI_DISCONN_TIMEOUT is the default when the hci_conn is created and
there should be no way for it to get changed between creation and
l2cap_le_conn_ready being called.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:49 +01:00
Johan Hedberg
5ee9891dd8 Bluetooth: Simplify hci_conn_hold/drop logic for L2CAP
The L2CAP code has been incrementing the hci_conn reference for each
l2cap_chan instance in the l2cap_conn list. Likewise, the reference is
dropped each time an l2cap_chan is removed from the list. The reference
counting policy with respect to removal has been clear and explicit in
the l2cap_chan_del function, however for addition the function
calling 2cap_chan_add has always had to do a separate hci_conn_hold
call.

What made the counting even more hard to follow is that the
hci_connect() procedure increments the reference and the L2CAP layer
making this call took advantage of it to use it as its own reference.

This patch aims to clarify things by having the call to hci_conn_hold
inside __l2cap_chan_add, thereby removing the need to do it in the
functions calling __l2cap_chan_add. The reference count for hci_connect
is still kept as it's necessary for users such as mgmt_pair_device,
however for the L2CAP layer it means that an extra call to hci_conn_drop
must be performed once l2cap_chan_add has been done.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:49 +01:00
Johan Hedberg
af1c01349e Bluetooth: Remove unnecessary L2CAP channel state check
In l2cap_att_channel() we're only interested in the BT_CONNECTED state
so this state can directly be passed to l2cap_global_chan_by_scid().
This way there's no need to do any additional state check later.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:49 +01:00
Johan Hedberg
60bac184c9 Bluetooth: Remove useless sk variable in l2cap_le_conn_ready
The sk variable is of quite little use since it's only used to simplify
access in the two bt_sk() calls.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:48 +01:00
Johan Hedberg
97f57c0b14 Bluetooth: Fix duplicate call to l2cap_chan_ready()
In l2cap_le_conn_ready() after doing l2cap_chann_add() the LE channel is
part of the list which is subsequently iterated in l2cap_conn_ready() in
this loop each channel will get l2cap_chan_ready() called which would
result in trying to set the channel two times into BT_CONNECTED state.
Instead it makes sense to just add the channel but not call chan_ready
in l2cap_le_conn_ready, which is what this patch does.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:48 +01:00
Johan Hedberg
d8729922b4 Bluetooth: Add clarifying comment to l2cap_conn_ready()
There is an extra call to smp_conn_security() for outgoing LE
connections from l2cap_conn_ready() but the reason for this call is far
from clear. After a bit of commit history research and using git blame I
found out that this extra call is for socket-less pairing processes
added by commit 160dc6ac1. This patch adds a clarifying comment to the
code for this.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:48 +01:00
Johan Hedberg
9f22398ce4 Bluetooth: Fix hardcoding ATT CID in __l2cap_chan_add()
Since in the future more than the ATT CID may be permissible we should
not be hardcoding it for all LE connections in __l2cap_chan_add().
Instead, the source ATT CID should only be set if the destination is
also ATT, and in other cases we should just use the existing dynamic CID
allocation function.

Assigning scid based on dcid means that whenever __l2cap_chan_add() is
called that chan->dcid is properly initialized. l2cap_le_conn_ready()
wasn't initializing is properly so this is also taken care of in this
patch.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:48 +01:00
Johan Hedberg
141d57065a Bluetooth: Fix EBUSY condition test in l2cap_chan_connect
The current test in l2cap_chan_connect is intended to protect against
multiple conflicting connect attempts. However, it assumes that there
will ever only be a single CID that is connected to, which is not true.
We do need to check for conflicts with connect attempts to the same
destination CID but this check is not in anyway specific to LE but can
be applied to BR/EDR as well.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:47 +01:00
Johan Hedberg
f224ca5fc2 Bluetooth: Fix LE vs BR/EDR selection when connecting
The choice between LE and BR/EDR should be made on the destination
address type instead of the destination CID. This is particularly
important when in the future more than one CID will be allowed for LE.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:47 +01:00
Johan Hedberg
073d1cf35f Bluetooth: Rename L2CAP_CID_LE_DATA to L2CAP_CID_ATT
In future Core Specification versions the ATT CID will be just one of
many possible CIDs that can be used for data transfer. Therefore, it
makes sense to rename the define for the ATT CID to something less
ambigous.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:47 +01:00
Johan Hedberg
c5623556fc Bluetooth: Handle LE L2CAP signalling in its own function
The LE L2CAP signalling channel follows its own rules and will continue
to evolve independently from the BR/EDR signalling channel. Therefore,
it makes sense to have a clear split from BR/EDR by having a dedicated
function for handling LE signalling commands.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2013-06-23 00:23:47 +01:00
John W. Linville
7d2a47aab2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
Conflicts:
	net/wireless/nl80211.c
2013-06-21 15:42:30 -04:00
Eric Dumazet
681f130f39 netfilter: xt_socket: add XT_SOCKET_NOWILDCARD flag
xt_socket module can be a nice replacement to conntrack module
in some cases (SYN filtering for example)

But it lacks the ability to match the 3rd packet of TCP
handshake (ACK coming from the client).

Add a XT_SOCKET_NOWILDCARD flag to disable the wildcard mechanism.

The wildcard is the legacy socket match behavior, that ignores
LISTEN sockets bound to INADDR_ANY (or ipv6 equivalent)

iptables -I INPUT -p tcp --syn -j SYN_CHAIN
iptables -I INPUT -m socket --nowildcard -j ACCEPT

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-20 20:28:49 +02:00
Phil Oester
142dcdd3c2 netfilter: nf_conntrack_ipv6: Plug sk_buff leak in fragment handling
In commit 4cdd3408 ("netfilter: nf_conntrack_ipv6: improve fragmentation
handling"), an sk_buff leak was introduced when dealing with reassembled
packets by grabbing a reference to the original skb instead of the
reassembled skb.  At this point, the leak only impacted conntracks with an
associated helper.

In commit 58a317f1 ("netfilter: ipv6: add IPv6 NAT support"), the bug was
expanded to include all reassembled packets with unconfirmed conntracks.

Fix this by grabbing a reference to the proper reassembled skb.  This
closes netfilter bugzilla #823.

Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-20 12:01:24 +02:00
Florian Westphal
6547a22187 netfilter: nf_conntrack: avoid large timeout for mid-stream pickup
When loose tracking is enabled (default), non-syn packets cause
creation of new conntracks in established state with default timeout for
established state (5 days).  This causes the table to fill up with UNREPLIED
when the 'new ack' packet happened to be the last-ack of a previous,
already timed-out connection.

Consider:

A 192.168.x.52792 > 10.184.y.80: F, 426:426(0) ack 9237 win 255
B 10.184.y.80 > 192.168.x.52792: ., ack 427 win 123
<61 second pause>
C 10.184.y.80 > 192.168.x.52792: F, 9237:9237(0) ack 427 win 123
D 192.168.x.52792 > 10.184.y.80: ., ack 9238 win 255

B moves conntrack to CLOSE_WAIT and will kill it after 60 second timeout,
C is ignored (FIN set), but last packet (D) causes new ct with 5-days timeout.

Use UNACK timeout (5 minutes) instead to get rid of these entries sooner
when in ESTABLISHED state without having seen traffic in both directions.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-20 11:20:13 +02:00
Daniel Borkmann
130ffbc263 netfilter: check return code from nla_parse_tested
These are the only calls under net/ that do not check nla_parse_nested()
for its error code, but simply continue execution. If parsing of netlink
attributes fails, we should return with an error instead of continuing.
In nearly all of these calls we have a policy attached, that is being
type verified during nla_parse_nested(), which we would miss checking
for otherwise.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-20 11:20:13 +02:00
Rami Rosen
af92e5425e inet: frag , remove an empty ifdef.
This patch removes an empty ifdef from inet_frag_intern()
in net/ipv4/inet_fragment.c.

commit b67bfe0d42
(hlist: drop the node parameter from iterators) removed hlist from
net/ipv4/inet_fragment.c, but did not remove the enclosing ifdef command,
which is now empty.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 23:06:52 -07:00
Eric Dumazet
c9364636dc htb: refactor struct htb_sched fields for performance
htb_sched structures are big, and source of false sharing on SMP.

Every time a packet is queued or dequeue, many cache lines must be
touched because structures are not lay out properly.

By carefully splitting htb_sched in two parts, and define sub structures
to increase data locality, we can improve performance dramatically on
SMP.

New htb_prio structure can also be used in htb_class to increase data
locality.

I got 26 % performance increase on a 24 threads machine, with 200
concurrent netperf in TCP_RR mode, using a HTB hierarchy of 4 classes.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 23:06:52 -07:00
Cong Wang
bcefe17cff tcp: introduce a per-route knob for quick ack
In previous discussions, I tried to find some reasonable heuristics
for delayed ACK, however this seems not possible, according to Eric:

	"ACKS might also be delayed because of bidirectional
	traffic, and is more controlled by the application
	response time. TCP stack can not easily estimate it."

	"ACK can be incredibly useful to recover from losses in
	a short time.

	The vast majority of TCP sessions are small lived, and we
	send one ACK per received segment anyway at beginning or
	retransmits to let the sender smoothly increase its cwnd,
	so an auto-tuning facility wont help them that much."

and according to David:

	"ACKs are the only information we have to detect loss.

	And, for the same reasons that TCP VEGAS is fundamentally
	broken, we cannot measure the pipe or some other
	receiver-side-visible piece of information to determine
	when it's "safe" to stretch ACK.

	And even if it's "safe", we should not do it so that losses are
	accurately detected and we don't spuriously retransmit.

	The only way to know when the bandwidth increases is to
	"test" it, by sending more and more packets until drops happen.
	That's why all successful congestion control algorithms must
	operate on explicited tested pieces of information.

	Similarly, it's not really possible to universally know if
	it's safe to stretch ACK or not."

It still makes sense to enable or disable quick ack mode like
what TCP_QUICK_ACK does.

Similar to TCP_QUICK_ACK option, but for people who can't
modify the source code and still wants to control
TCP delayed ACK behavior. As David suggested, this should belong
to per-path scope, since different pathes may want different
behaviors.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Rick Jones <rick.jones2@hp.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Graf <tgraf@suug.ch>
CC: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 23:06:51 -07:00
Gao feng
a881ae1f62 ipv6: don't call addrconf_dst_alloc again when enable lo
If we disable all of the net interfaces, and enable
un-lo interface before lo interface, we already allocated
the addrconf dst in ipv6_add_addr. So we shouldn't allocate
it again when we enable lo interface.

Otherwise the message below will be triggered.
unregister_netdevice: waiting for sit1 to become free. Usage count = 1

This problem is introduced by commit 25fb6ca4ed
"net IPv6 : Fix broken IPv6 routing table after loopback down-up"

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 23:04:50 -07:00
Dave Jones
2c0740e4e1 sctp: Convert __list_for_each use to list_for_each
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 23:02:49 -07:00
Weiping Pan
9ef71e0c82 tcp:typo unset should be unsent
Signed-off-by: Weiping Pan <wpan@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 22:21:09 -07:00
Aydin Arik
c0353c7b5d ipv4: Fixed MD5 key lookups when adding/ removing MD5 to/ from TCP sockets.
MD5 key lookups on a given TCP socket were being performed
incorrectly. This fix alters parameter inputs to the MD5
lookup function tcp_md5_do_lookup, which is called by functions
tcp_md5_do_add and tcp_md5_do_del. Specifically, the change now
inputs the correct address and address family required to make
a proper lookup.

Signed-off-by: Aydin Arik <aydin.arik@alliedtelesis.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 21:21:53 -07:00
Nicolas Dichtel
c2ff682a6f sit: fix an oops when IFLA_IPTUN_PROTO is not set
The use of this attribute has been added in 32b8a8e59c (sit: add IPv4 over
IPv4 support). It is optional, by default proto is IPPROTO_IPV6.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 21:18:17 -07:00
Gao feng
dc25c676f5 neigh: disallow un-init_net to change thresh of neigh
thresh and interval are global resources,
only init net can change them.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 21:13:24 -07:00
Gao feng
170d6f9954 neigh: only allow init_net to change the default neigh_parms
Though we don't export the /proc/sys/net/ipv[4,6]/neigh/default/
directory to the un-init_net, but we can still use cmd such as
"ip ntable change name arp_cache locktime 129" to change the locktime
of default neigh_parms.

This patch disallows the un-init_net to find out the neigh_table.parms.
So the un-init_net will failed to influence the init_net.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 21:13:24 -07:00
Gao feng
cf89d6b280 neigh: no need to call lookup_neigh_parms in neigh_parms_alloc
neigh_table.parms always exist and is initialized,kmemdup
can use it to create new neigh_parms, actually lookup_neigh_parms
here will return neigh_table.parms too.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 21:13:24 -07:00
Pravin B Shelar
aa310701e7 openvswitch: Add gre tunnel support.
Add gre vport implementation.  Most of gre protocol processing
is pushed to gre module. It make use of gre demultiplexer
therefore it can co-exist with linux device based gre tunnels.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 18:07:42 -07:00
Pravin B Shelar
a3e82996a8 openvswitch: Optimize flow key match for non tunnel flows.
Following patch adds start offset for sw_flow-key, so that we can
skip tunneling information in key for non-tunnel flows.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 18:07:41 -07:00
Pravin B Shelar
ffe3f43217 openvswitch: Expand action buffer size.
MAX_ACTIONS_BUFSIZE limits action list size, set tunnel action
needs extra space on action list, for now increase max actions list limit.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 18:07:41 -07:00
Pravin B Shelar
7d5437c709 openvswitch: Add tunneling interface.
Add ovs tunnel interface for set tunnel action for userspace.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 18:07:41 -07:00
Pravin B Shelar
74f84a5726 openvswitch: Copy individual actions.
Rather than validating actions and then copying all actiaons
in one block, following patch does same operation in single pass.
This validate and copy action one by one. This is required for
ovs tunneling patch.

This patch does not change any functionality.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 18:07:41 -07:00
Pravin B Shelar
3d7b46cd20 ip_tunnel: push generic protocol handling to ip_tunnel module.
Process skb tunnel header before sending packet to protocol handler.
this allows code sharing between gre and ovs gre modules.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 18:07:41 -07:00
Pravin B Shelar
0e6fbc5b6c ip_tunnels: extend iptunnel_xmit()
Refactor various ip tunnels xmit functions and extend iptunnel_xmit()
so that there is more code sharing.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 18:07:41 -07:00
Pravin B Shelar
45f2e9976c gre: export gre_handle_offloads() function.
This is required for OVS GRE offloading.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 18:07:41 -07:00
Pravin B Shelar
752f36da68 gre: export gre_build_header() function.
This is required for ovs gre module.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 18:07:40 -07:00
Pravin B Shelar
bda7bb4634 gre: Allow multiple protocol listener for gre protocol.
Currently there is only one user is allowed to register for gre
protocol.  Following patch adds de-multiplexer.  So that multiple
modules can listen on gre protocol e.g. kernel gre devices and ovs.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 18:07:40 -07:00
Pravin B Shelar
20fd4d1f04 gre: Simplify gre protocol registration locking.
Use cmpxchg() for atomic protocol registration which saves
code and data space.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 18:07:40 -07:00
David S. Miller
d98cae64e4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/wireless/ath/ath9k/Kconfig
	drivers/net/xen-netback/netback.c
	net/batman-adv/bat_iv_ogm.c
	net/wireless/nl80211.c

The ath9k Kconfig conflict was a change of a Kconfig option name right
next to the deletion of another option.

The xen-netback conflict was overlapping changes involving the
handling of the notify list in xen_netbk_rx_action().

Batman conflict resolution provided by Antonio Quartulli, basically
keep everything in both conflict hunks.

The nl80211 conflict is a little more involved.  In 'net' we added a
dynamic memory allocation to nl80211_dump_wiphy() to fix a race that
Linus reported.  Meanwhile in 'net-next' the handlers were converted
to use pre and post doit handlers which use a flag to determine
whether to hold the RTNL mutex around the operation.

However, the dump handlers to not use this logic.  Instead they have
to explicitly do the locking.  There were apparent bugs in the
conversion of nl80211_dump_wiphy() in that we were not dropping the
RTNL mutex in all the return paths, and it seems we very much should
be doing so.  So I fixed that whilst handling the overlapping changes.

To simplify the initial returns, I take the RTNL mutex after we try
to allocate 'tb'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-19 16:49:39 -07:00
John W. Linville
4067c666f2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2013-06-19 15:24:36 -04:00
John W. Linville
2f2a8846d5 Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 2013-06-19 14:50:44 -04:00
Johannes Berg
f1940c5730 cfg80211: hold BSS over association process
This fixes the potential issue that the BSS struct that we use
and later assign to wdev->current_bss is removed from the scan
list while associating.

Also warn when we don't have a BSS struct in connect_result
unless it's from a driver that only has the connect() API.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-19 18:55:39 +02:00
Johannes Berg
959867fa55 cfg80211: require passing BSS struct back to cfg80211_assoc_timeout
Doing so will allow us to hold the BSS (not just ref it) over the
association process, thus ensuring that it doesn't time out and
gets invisible to the user (e.g. in 'iw wlan0 link'.)

This also fixes a leak in mac80211 where it doesn't always release
the BSS struct properly in all cases where calling this function.
This leak was reported by Ben Greear.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-19 18:55:39 +02:00
Johannes Berg
86e8cf98de nl80211: use small state buffer for wiphy_dump
Avoid parsing the original dump message again and again by
allocating a small state struct that is used by the functions
involved in the dump, storing this struct in cb->args[0].
This reduces the memory allocation size as well.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-19 18:55:38 +02:00
Johannes Berg
f93beba705 Merge remote-tracking branch 'mac80211/master' into HEAD
Merge mac80211 to avoid conflicts with the nl80211 attrbuf
changes.

Conflicts:
	net/mac80211/iface.c
	net/wireless/nl80211.c

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-19 18:55:12 +02:00
Johannes Berg
3a5a423bb9 nl80211: fix attrbuf access race by allocating a separate one
Since my commit 3713b4e364 ("nl80211: allow splitting wiphy
information in dumps"), nl80211_dump_wiphy() uses the global
nl80211_fam.attrbuf for parsing the incoming data. This wouldn't
be a problem if it only did so on the first dump iteration which
is locked against other commands in generic netlink, but due to
space constraints in cb->args (the needed state doesn't fit) I
decided to always parse the original message. That's racy though
since nl80211_fam.attrbuf could be used by some other parsing in
generic netlink concurrently.

For now, fix this by allocating a separate parse buffer (it's a
bit too big for the stack, currently 1448 bytes on 64-bit). For
-next, I'll change the code to parse into the global buffer in
the first round only and then allocate a smaller buffer to keep
the data in cb->args.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-19 18:31:20 +02:00
Julian Anastasov
06f3d7f973 ipvs: SCTP ports should be writable in ICMP packets
Make sure that SCTP ports are writable when embedded in ICMP
from client, so that ip_vs_nat_icmp can translate them safely.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
2013-06-19 09:53:52 +09:00
John W. Linville
9d1059c248 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless 2013-06-18 14:04:51 -04:00
Jeff Layton
e401452d92 rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set
We had a report of a reproducible WARNING:

[ 1360.039358] ------------[ cut here ]------------
[ 1360.043978] WARNING: at fs/dcache.c:1355 d_set_d_op+0x8d/0xc0()
[ 1360.049880] Hardware name: HP Z200 Workstation
[ 1360.054308] Modules linked in: nfsv4 nfs dns_resolver fscache nfsd
auth_rpcgss nfs_acl lockd sunrpc sg acpi_cpufreq mperf coretemp kvm_intel kvm
snd_hda_codec_realtek snd_hda_intel snd_hda_codec hp_wmi crc32c_intel
snd_hwdep e1000e snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd
sparse_keymap rfkill soundcore serio_raw ptp iTCO_wdt pps_core pcspkr
iTCO_vendor_support mei microcode lpc_ich mfd_core wmi xfs libcrc32c sr_mod
sd_mod cdrom crc_t10dif radeon i2c_algo_bit drm_kms_helper ttm ahci libahci
drm i2c_core libata dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
auth_rpcgss]
[ 1360.107406] Pid: 8814, comm: mount.nfs4 Tainted: G         I --------------   3.9.0-0.55.el7.x86_64 #1
[ 1360.116771] Call Trace:
[ 1360.119219]  [<ffffffff810610c0>] warn_slowpath_common+0x70/0xa0
[ 1360.125208]  [<ffffffff810611aa>] warn_slowpath_null+0x1a/0x20
[ 1360.131025]  [<ffffffff811af46d>] d_set_d_op+0x8d/0xc0
[ 1360.136159]  [<ffffffffa05a7d6f>] __rpc_lookup_create_exclusive+0x4f/0x80 [sunrpc]
[ 1360.143710]  [<ffffffffa05a8cc6>] rpc_mkpipe_dentry+0x86/0x170 [sunrpc]
[ 1360.150311]  [<ffffffffa062a7b6>] nfs_idmap_new+0x96/0x130 [nfsv4]
[ 1360.156475]  [<ffffffffa062e7cd>] nfs4_init_client+0xad/0x2d0 [nfsv4]
[ 1360.162902]  [<ffffffff812f02df>] ? idr_get_empty_slot+0x16f/0x3c0
[ 1360.169062]  [<ffffffff812f0582>] ? idr_mark_full+0x52/0x60
[ 1360.174615]  [<ffffffff812f0699>] ? idr_alloc+0x79/0xe0
[ 1360.179826]  [<ffffffffa0598081>] ? __rpc_init_priority_wait_queue+0x81/0xc0 [sunrpc]
[ 1360.187635]  [<ffffffffa05980f3>] ? rpc_init_wait_queue+0x13/0x20 [sunrpc]
[ 1360.194493]  [<ffffffffa05d05da>] nfs_get_client+0x27a/0x350 [nfs]
[ 1360.200666]  [<ffffffffa062e438>] nfs4_set_client.isra.8+0x78/0x100 [nfsv4]
[ 1360.207624]  [<ffffffffa062f2f3>] nfs4_create_server+0xf3/0x3a0 [nfsv4]
[ 1360.214222]  [<ffffffffa06284be>] nfs4_remote_mount+0x2e/0x60 [nfsv4]
[ 1360.220644]  [<ffffffff8119ea79>] mount_fs+0x39/0x1b0
[ 1360.225691]  [<ffffffff81153880>] ? __alloc_percpu+0x10/0x20
[ 1360.231348]  [<ffffffff811b7ccf>] vfs_kern_mount+0x5f/0xf0
[ 1360.236822]  [<ffffffffa0628396>] nfs_do_root_mount+0x86/0xc0 [nfsv4]
[ 1360.243246]  [<ffffffffa06287b4>] nfs4_try_mount+0x44/0xc0 [nfsv4]
[ 1360.249410]  [<ffffffffa05d1457>] ? get_nfs_version+0x27/0x80 [nfs]
[ 1360.255659]  [<ffffffffa05db985>] nfs_fs_mount+0x5c5/0xd10 [nfs]
[ 1360.261650]  [<ffffffffa05dc550>] ? nfs_clone_super+0x140/0x140 [nfs]
[ 1360.268074]  [<ffffffffa05da8e0>] ? param_set_portnr+0x60/0x60 [nfs]
[ 1360.274406]  [<ffffffff8119ea79>] mount_fs+0x39/0x1b0
[ 1360.279443]  [<ffffffff81153880>] ? __alloc_percpu+0x10/0x20
[ 1360.285088]  [<ffffffff811b7ccf>] vfs_kern_mount+0x5f/0xf0
[ 1360.290556]  [<ffffffff811b9f5d>] do_mount+0x1fd/0xa00
[ 1360.295677]  [<ffffffff81137dee>] ? __get_free_pages+0xe/0x50
[ 1360.301405]  [<ffffffff811b9be6>] ? copy_mount_options+0x36/0x170
[ 1360.307479]  [<ffffffff811ba7e3>] sys_mount+0x83/0xc0
[ 1360.312515]  [<ffffffff8160ad59>] system_call_fastpath+0x16/0x1b
[ 1360.318503] ---[ end trace 8fa1f4cbc36094a7 ]---

The problem is that we're ending up in __rpc_lookup_create_exclusive
with a negative dentry that already has d_op set. A little debugging
has shown that when we hit this, the d_ops are already set to
simple_dentry_operations.

I believe that what's happening is that during a mount, idmapd is racing
in and doing a lookup of /var/lib/nfs/rpc_pipefs/nfs/clnt???/idmap.
Before that dentry reference is released, the kernel races in to create
that file and finds the new negative dentry, which already has the
d_op set.

This patch just avoids setting the d_op if it's already set.
simple_dentry_operations and rpc_dentry_operations are functionally
equivalent so it shouldn't matter which one it's set to.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-18 13:46:50 -04:00
Antonio Quartulli
52874a5e39 Revert "mac80211: in IBSS use the Auth frame to trigger STA reinsertion"
This reverts commit 6d810f1032

In this way an IBSS station will not use the AUTH messages
to trigger a state reinitialisation anymore.

The behaviour was racy and was not working properly.
It has been introduced to help wpa_supplicant to support
IBSS/RSN, however all the logic is now getting moved into
wpa_s itself which will also be in charge of handling the
AUTH messages thanks to the mgmt frame registration.

If userspace does not register for receiving AUTH frames
then mac80211 will still reply by itself.

At the same time, the auth frame registration counter can be
removed since it is not needed anymore.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
[remove unused variable]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-18 16:36:59 +02:00
Simon Wunderlich
3aede78aad mac80211: change IBSS channel state to chandef
This should make some parts cleaner and is also required for handling
5/10 MHz properly.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-18 16:27:25 +02:00
Simon Wunderlich
0418a44583 mac80211: fix various components for the new 5 and 10 MHz widths
This is a collection of minor fixes:
 * don't allow HT IEs in IBSS for 5/10 MHz
 * don't allow HT IEs in Mesh for 5/10 MHz
 * don't downgrade from/to 5 and 10 MHz channels
 * don't try HT rates for 5 and 10 MHz channels when selecting rates

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-18 16:17:11 +02:00
Simon Wunderlich
2f301ab29e nl80211/cfg80211: add 5 and 10 MHz defines and wiphy flag
Add defines for 5 and 10 MHz channel width and fix channel
handling functions accordingly.

Also check for and report the WIPHY_FLAG_SUPPORTS_5_10_MHZ
capability.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
[fix spelling in comment]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-18 16:06:50 +02:00
Thomas Pedersen
f81a9dedaf mac80211: update mesh beacon on workqueue
Instead of updating the mesh beacon immediately when
requested (which would require the sdata_lock()), defer it
to the mac80211 workqueue.

Fixes yet another deadlock on calling sta_info_flush()
with the sdata_lock() held from ieee80211_stop_mesh(). We
could just drop the sdata_lock() around the
mesh_sta_cleanup() call, but this path is also taken from
several non-locked error paths.

Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
[fix comment position]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-18 15:57:27 +02:00
Simon Wunderlich
a1193be83b nl80211: use attributes to parse beacons
only the attributes are required and not the whole netlink info, as the
function accesses the attributes only anyway. This makes it easier to
parse nested beacon IEs later.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-18 15:55:19 +02:00
Matthias Schiffer
33be081a81 ipv6: ndisc: fix ndisc_send_redirect writing to the wrong skb
Since some refactoring in 5f5a011, ndisc_send_redirect called
ndisc_fill_redirect_hdr_option on the wrong skb, leading to data corruption or
in the worst case a panic when the skb_put failed.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 22:59:12 -07:00
Linus Lüssing
32de868cbc bridge: fix switched interval for MLD Query types
General Queries (the one with the Multicast Address field
set to zero / '::') are supposed to have a Maximum Response Delay
of [Query Response Interval], while for Multicast-Address-Specific
Queries it is [Last Listener Query Interval] - not the other way
round. (see RFC2710, section 7.3+7.8)

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 17:11:22 -07:00
Fernando Luis Vazquez Cao
788dfcacac vlan: restore ethtool ABI to control VLAN hardware acceleration
As part of the push to add 802.1ad server provider tagging support to the
kernel the VLAN features flags were renamed. Unfortunately the kernel name
for the VLAN hardware acceleration features that the kernel shows user space
was included in the rename, which broke ethtool (txvlan and rxvlan options
do not work). This patch restores the original names, i.e. the original ABI.
If we wanted to make clear to users that we are refering to CTAGs we can
always change ethtool's short_name and long_name for these features (for
example something along the lines of txvlan -> txvlan-ctag, tx-vlan-offload ->
tx-vlan-ctag-offload).

Cc: Patrick McHardy <kaber@trash.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 17:09:35 -07:00
Daniel Borkmann
dda9192851 net: sctp: remove SCTP_STATIC macro
SCTP_STATIC is just another define for the static keyword. It's use
is inconsistent in the SCTP code anyway and it was introduced in the
initial implementation of SCTP in 2.5. We have a regression suite in
lksctp-tools, but this is for user space only, so noone makes use of
this macro anymore. The kernel test suite for 2.5 is incompatible with
the current SCTP code anyway.

So simply Remove it, to be more consistent with the rest of the kernel
code.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 17:08:05 -07:00
Daniel Borkmann
939cfa75a0 net: sctp: get rid of t_new macro for kzalloc
t_new rather obfuscates things where everyone else is using actual
function names instead of that macro, so replace it with kzalloc,
which is the function t_new wraps.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 17:08:04 -07:00
David S. Miller
c5b248dd05 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into wireless
John W. Linville says:

====================
This will probably be the last batch of wireless fixes intended
for 3.10.  Many of these are one- or two-liners, and a couple of
others are mostly relocating existing code to avoid races or to
limit the code to effecting specific hardware, etc.

The mac80211 fixes have a couple of exceptions to the above.
Regarding those, Johannes says:

"Following davem's complaint about my patch, here's a new pull request
w/o the patch he was complaining about, but instead with the const
fix rolled into the fix.

I have a fix for radar detection, one for rate control and a workaround
for broken HT APs which is a regression fix because we didn't rely
on them to be correct before."

Johannes also sends some iwlwifi fixes:

"I picked up Nikolay's patch for the chain noise calibration bug
that seems to have been there forever, a fix from Emmanuel for
setting TX flags on BAR frames and a fix of my own to avoid printing
request_module() errors if the kernel isn't even modular. We also
have our own version of Stanislaw's fix for rate control."

Along with those...

Anderson Lizardo fixes a Bluetooth memory corruption bug when an MTU
value is set to too small of a value.

Arend van Spriel sends a revised brcmsmac bug that fixes a regression
caused by a bad return value in an earlier patch.  He also sends a
brcmfmac fix to avoid an oops when loading the driver at boot.

Daniel Drake fixes a race condition in btmrvl that causes hangs on
suspend for OLPC hardware.

Johan Hedberg adds a check to avoid sending a
HCI_Delete_Stored_Link_Key command to devices that don't support them,
avoiding some scary looking log spam.

Stanislaw Gruszka gives us a fix for iwlegacy to be able to use rates
higher than 1Mb/s on older wireless networks.  He also sends an rt2x00
fix to reinstate older tx power handling behavior for some devices
that didn't work well with the current code.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 16:15:51 -07:00
David S. Miller
e00c7f1fad Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
The following patchset contains Netfilter fixes. They are targeted to the
TCP option targets, that have receive some scrinity in the last week. The
changes are:

* Fix TCPOPTSTRIP, it stopped working in the forward chain as tcp_hdr
  uses skb->transport_header, and we cannot use that in the forwarding
  case, from myself.

* Fix default IPv6 MSS in TCPMSS in case of absence of TCP MSS options,
  from Phil Oester.

* Fix missing fragmentation handling again in TCPMSS, from Phil Oester.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 16:13:45 -07:00
Ying Xue
2537af9dca tipc: remove dev_base_lock use from enable_bearer
Convert enable_bearer() to RCU locking with dev_get_by_name().

Based on a similar changeset in commit 840a185d ["aoe: remove
dev_base_lock use from aoecmd_cfg_pkts()"] -- quoting that:

  "dev_base_lock is the legacy way to lock the device list,
   and is planned to disappear. (writers hold RTNL, readers
   hold RCU lock)"

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:53:01 -07:00
Ying Xue
126c052464 tipc: fix wrong return value for link_send_sections_long routine
When skb buffer cannot be allocated in link_send_sections_long(),
-ENOMEM error code instead of -EFAULT should be returned to its
caller.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:53:01 -07:00
Ying Xue
7410f967ba tipc: make tipc_link_send_sections_fast exit earlier
Once message build request function returns invalid code, the
process of sending message cannot continue. So in case of message
build failure, tipc_link_send_sections_fast() should return
immediately.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:53:01 -07:00
Ying Xue
796c75d0d3 tipc: enhance priority of link protocol packet
pfifo_fast is set as default traffic class queueing discipline. This
queue has three so called "bands". Within each band, FIFO rules apply.
However, as long as there are packets waiting in band 0, band 1 won't
be processed.

Now all kind of TIPC type packet priorities are never set, that is,
their priorities are 0, so they are mapped to band 1 of pfifo_fast
qdisc. But, especially during link congestion, if link protocol packet
can be sent out as earlier as possible than other type of packets so
that protocol packet can arrive at peer endpoint in time, the peer
will timely reset its link timeout timer to keep the link alive.
So enhancing the priority of link protocol packets can meet the
specific demand to avoid unnecessary link reset due to a transient
link congestion.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:53:01 -07:00
Paul Gortmaker
ae8509c420 tipc: cosmetic realignment of function arguments
No runtime code changes here.  Just a realign of the function
arguments to start where the 1st one was, and fit as many args
as can be put in an 80 char line.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:53:01 -07:00
Ying Xue
c0fee8aca7 tipc: save sock structure pointer instead of void pointer to tipc_port
Directly save sock structure pointer instead of void pointer to avoid
unnecessary cast conversions.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:53:01 -07:00
Ying Xue
28e5297281 tipc: convert config_lock from spinlock to mutex
As the configuration server is now running under process context,
it's unnecessary for us to have a spinlock serializing the TIPC
configuration process. Instead, we replace it with a mutex lock,
which gives us more freedom. For instance, we can now call
pre-emptable functions within the protected area.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:53:01 -07:00
Ying Xue
3c5db8e4ec tipc: rename tipc_createport_raw to tipc_createport
After the removal of the native API, there is now only one way to
to create a TIPC port instance -- the function tipc_createport_raw().
We make it more readable by renaming it to tipc_createport().

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:53:01 -07:00
Ying Xue
f1733d7580 tipc: remove user_port instance from tipc_port structure
After the native API has been completely removed, the 'user_port'
field in struct tipc_port becomes unused, and can be removed.
As a consequence, the "usrmem" argument in tipc_msg_build() is no
longer needed, and so we remove that one too.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:53:00 -07:00
Ying Xue
198d73b82b tipc: delete code orphaned by new server infrastructure
Having completed the conversion of the topology server and
configuration server to use the new server infrastructure,
the following functions become unused, and can be deleted:

   - tipc_createport()
   - port_wakeup_sh()
   - port_dispatcher()
   - port_dispatcher_sigh()
   - tipc_send_buf_fast()
   - tipc_send_buf2port

Additionally, the following variables become orphaned,
and can be deleted:

   - tipc_msg_err_event
   - tipc_named_msg_err_event
   - tipc_conn_shutdown_event
   - tipc_msg_event
   - tipc_named_msg_event
   - tipc_conn_msg_event
   - tipc_continue_event
   - msg_queue_head
   - msg_queue_tail
   - queue_lock

Deletion is done here in a separate commit in order to allow
the actual conversion changes to be more easily viewed.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:53:00 -07:00
Ying Xue
7d0ab17b74 tipc: convert configuration server to use new server facility
As the new socket-based TIPC server infrastructure has been
introduced, we can now convert the configuration server to use
it.  Then we can take future steps to simplify the configuration
server locking policy.

Some minor reordering of initialization is done, due to the
dependency on having tipc_socket_init completed.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:53:00 -07:00
Ying Xue
13a2e89873 tipc: convert topology server to use new server facility
As the new TIPC server infrastructure has been introduced, we can
now convert the TIPC topology server to it.  We get two benefits
from doing this:

1) It simplifies the topology server locking policy.  In the
original locking policy, we placed one spin lock pointer in the
tipc_subscriber structure to reuse the lock of the subscriber's
server port, controlling access to members of tipc_subscriber
instance.  That is, we only used one lock to ensure both
tipc_port and tipc_subscriber members were safely accessed.

Now we introduce another spin lock for tipc_subscriber structure
only protecting themselves, to get a finer granularity locking
policy.  Moreover, the change will allow us to make the topology
server code more readable and maintainable.

2) It fixes a bug where sent subscription events may be lost when
the topology port is congested.  Using the new service, the
topology server now queues sent events into an outgoing buffer,
and then wakes up a sender process which has been blocked in
workqueue context.  The process will keep picking events from the
buffer and send them to their respective subscribers, using the
kernel socket interface, until the buffer is empty. Even if the
socket is congested during transmission there is no risk that
events may be dropped, since the sender process may block when
needed.

Some minor reordering of initialization is done, since we now
have a scenario where the topology server must be started after
socket initialization has taken place, as the former depends
on the latter.  And overall, we see a simplification of the
TIPC subscriber code in making this changeover.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:53:00 -07:00
Ying Xue
c5fa7b3cf3 tipc: introduce new TIPC server infrastructure
TIPC has two internal servers, one providing a subscription
service for topology events, and another providing the
configuration interface. These servers have previously been running
in BH context, accessing the TIPC-port (aka native) API directly.
Apart from these servers, even the TIPC socket implementation is
partially built on this API.

As this API may simultaneously be called via different paths and in
different contexts, a complex and costly lock policiy is required
in order to protect TIPC internal resources.

To eliminate the need for this complex lock policiy, we introduce
a new, generic service API that uses kernel sockets for message
passing instead of the native API. Once the toplogy and configuration
servers are converted to use this new service, all code pertaining
to the native API can be removed. This entails a significant
reduction in code amount and complexity, and opens up for a complete
rework of the locking policy in TIPC.

The new service also solves another problem:

As the current topology server works in BH context, it cannot easily
be blocked when sending of events fails due to congestion. In such
cases events may have to be silently dropped, something that is
unacceptable. Therefore, the new service keeps a dedicated outbound
queue receiving messages from BH context. Once messages are
inserted into this queue, we will immediately schedule a work from a
special workqueue. This way, messages/events from the topology server
are in reality sent in process context, and the server can block
if necessary.

Analogously, there is a new workqueue for receiving messages. Once a
notification about an arriving message is received in BH context, we
schedule a work from the receive workqueue to do the job of
receiving the message in process context.

As both sending and receive messages are now finished in processes,
subscribed events cannot be dropped any more.

As of this commit, this new server infrastructure is built, but
not actually yet called by the existing TIPC code, but since the
conversion changes required in order to use it are significant,
the addition is kept here as a separate commit.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:53:00 -07:00
Erik Hugne
5d21cb70db tipc: allow implicit connect for stream sockets
TIPC's implied connect feature, aka piggyback connect, allows
applications to save one syscall and all SYN/SYN-ACK signalling
overhead when setting up a connection.  Until now, this has only
been supported for SEQPACKET sockets.  Here, we make it possible
to use this feature even with stream sockets.

At the connecting side, the connection is completed when the
first data message arrives from the accepting peer.  This means
that we must allow the connecting user to call blocking recv()
before the socket has reached state SS_CONNECTED.  So we must must
relax the state machine check at recv_stream(), and allow the
recv() call even if socket is in state SS_CONNECTING.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:53:00 -07:00
Ying Xue
cc79dd1ba9 tipc: change socket buffer overflow control to respect sk_rcvbuf
As per feedback from the netdev community, we change the buffer
overflow protection algorithm in receiving sockets so that it
always respects the nominal upper limit set in sk_rcvbuf.

Instead of scaling up from a small sk_rcvbuf value, which leads to
violation of the configured sk_rcvbuf limit, we now calculate the
weighted per-message limit by scaling down from a much bigger value,
still in the same field, according to the importance priority of the
received message.

To allow for administrative tunability of the socket receive buffer
size, we create a tipc_rmem sysctl variable to allow the user to
configure an even bigger value via sysctl command.  It is a size of
three (min/default/max) to be consistent with things like tcp_rmem.

By default, the value initialized in tipc_rmem[1] is equal to the
receive socket size needed by a TIPC_CRITICAL_IMPORTANCE message.
This value is also set as the default value of sk_rcvbuf.

Originally-by: Jon Maloy <jon.maloy@ericsson.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
[Ying: added sysctl variation to Jon's original patch]
Signed-off-by: Ying Xue <ying.xue@windriver.com>
[PG: don't compile sysctl.c if not config'd; add Documentation]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:53:00 -07:00
Eliezer Tamir
dafcc4380d net: add socket option for low latency polling
adds a socket option for low latency polling.
This allows overriding the global sysctl value with a per-socket one.
Unexport sysctl_net_ll_poll since for now it's not needed in modules.

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:48:14 -07:00
Eliezer Tamir
89bf1b5a68 net: remove NET_LL_RX_POLL config menue
Remove NET_LL_RX_POLL from the config menu.
Change default to y.
Busy polling still needs to be enabled at run time.

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:48:14 -07:00
Eliezer Tamir
9a3c71aa80 net: convert low latency sockets to sched_clock()
Use sched_clock() instead of get_cycles().
We can use sched_clock() because we don't care much about accuracy.
Remove the dependency on X86_TSC

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:48:14 -07:00
Eliezer Tamir
eb6db62282 net: change sysctl_net_ll_poll into an unsigned int
There is no reason for sysctl_net_ll_poll to be an unsigned long.
Change it into an unsigned int.
Fix the proc handler.

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 15:48:13 -07:00
Greg Kroah-Hartman
0e496b8e84 Merge 3.10-rc6 into char-misc-next
We want the fixes in here.
2013-06-17 11:54:25 -07:00
John W. Linville
722a300bf8 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2013-06-17 13:44:01 -04:00
Daniel Borkmann
2e0c9e7911 net: sctp: sctp_association_init: put refs in reverse order
In case we need to bail out for whatever reason during assoc
init, we call sctp_endpoint_put() and then sock_put(), however,
we've hold both refs in reverse, non-symmetric order, so first
sctp_endpoint_hold() and then sock_hold().

Reverse this, so that in an error case we have sock_put() and then
sctp_endpoint_put(). Actually shouldn't matter too much, since both
cleanup paths do the right thing, but that way, it is more consistent
with the rest of the code.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-14 15:38:36 -07:00
Daniel Borkmann
c164b83814 net: sctp: minor: remove variable in sctp_init_sock
It's only used at this one time, so we could remove it as well.
This is valid and also makes it more explicit/obvious that in case
of error the sp->ep is NULL here, i.e. for the sctp_destroy_sock()
check that was recently added.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-14 15:38:36 -07:00
Daniel Borkmann
405426f6ca net: sctp: sctp_sf_do_prm_asoc: do SCTP_CMD_INIT_CHOOSE_TRANSPORT first
While this currently cannot trigger any NULL pointer dereference in
sctp_seq_dump_local_addrs(), better change the order of commands to
prevent a future bug to happen. Although we first add SCTP_CMD_NEW_ASOC
and then set the SCTP_CMD_INIT_CHOOSE_TRANSPORT, it is okay for now,
since this primitive is only called by sctp_connect() or sctp_sendmsg()
with sctp_assoc_add_peer() set first. However, lets do this precaution
and first set the transport and then add it to the association hashlist
to prevent in future something to possibly triggering this.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-14 15:38:36 -07:00
Daniel Borkmann
f9e42b8535 net: sctp: sideeffect: throw BUG if primary_path is NULL
This clearly states a BUG somewhere in the SCTP code as e.g. fixed once
in f28156335 ("sctp: Use correct sideffect command in duplicate cookie
handling"). If this ever happens, throw a trace in the sideeffect engine
where assocs clearly must have a primary_path assigned.

When in sctp_seq_dump_local_addrs() also throw a WARN and bail out since
we do not need to panic for printing this one asterisk. Also, it will
avoid the not so obvious case when primary != NULL test passes and at a
later point in time triggering a NULL ptr dereference caused by primary.
While at it, also fix up the white space.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-14 15:38:36 -07:00
David S. Miller
09ce069dff Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch
Jesse Gross says:

====================
A few miscellaneous improvements and cleanups before the GRE tunnel
integration series. Intended for net-next/3.11.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-14 15:31:22 -07:00
Pravin B Shelar
93d8fd1514 openvswitch: Simplify interface ovs_flow_metadata_from_nlattrs()
This is not functional change, this is just code cleanup.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-06-14 15:09:12 -07:00
Pravin B Shelar
b34df5e805 openvswitch: make skb->csum consistent with rest of networking stack.
Following patch keeps skb->csum correct across ovs.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-06-14 15:09:12 -07:00
Andy Hill
af7841636b openvswitch: Fix misspellings in comments and docs.
Flagged with: https://github.com/lyda/misspell-check
Run with: git ls-files | misspellings -f -

Signed-off-by: Andy Hill <hillad@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-06-14 15:09:11 -07:00
Lorand Jakab
34d94f2102 openvswitch: fix variable names in comment
Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-06-14 15:09:10 -07:00
Pravin B Shelar
91b7514cdf openvswitch: Unify vport error stats handling.
Following patch changes vport->send return type so that vport
layer can do error accounting.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-06-14 15:09:10 -07:00
Jesse Gross
cbd531bebb openvswitch: Remove unused get_config vport op.
The get_config vport op is left over from old compatibility code,
it is neither used nor implemented any more.

Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-06-14 15:09:09 -07:00
Jesse Gross
f44f340883 openvswitch: Immediately exit on error in ovs_vport_cmd_set().
It is an error to try to change the type of a vport using the set
command. However, while we check that this is an error, we still
proceed to allocate memory which then gets freed immediately.
This stops processing after noticing the error, which does not
actually fix a bug but is more correct.

Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-06-14 15:09:09 -07:00
Samuel Ortiz
4ca546e554 NFC: llcp: Fix the well known services endianness
The WKS (Well Known Services) bitmask should be transmitted in big endian
order. Picky implementations will refuse to establish an LLCP link when the
WKS bit 0 is not set to 1. The vast majority of implementations out there
are not that picky though...

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 13:45:10 +02:00
Samuel Ortiz
f768b34017 NFC: llcp: Set the LLC Link Management well known service bit
In order to advertise our LLCP support properly and to follow the LLCP
specs requirements, we need to initialize the WKS (Well-Known Services)
bitfield to 1 as SAP 0 is the only mandatory supported service.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 13:45:09 +02:00
Samuel Ortiz
2635a4bdfa NFC: llcp: Do not send pending Tx frames when the remote is not ready
When we receive a RNR, the remote is busy processing the last received
frame. We set a local flag for that, and we should send a SYMM when it
is set instead of sending any pending frame.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 13:45:08 +02:00
Samuel Ortiz
b4011239a0 NFC: llcp: Fix non blocking sockets connections
Without the new LLCP_CONNECTING state, non blocking sockets will be
woken up with a POLLHUP right after calling connect() because their
state is stuck at LLCP_CLOSED.
That prevents userspace from implementing any proper non blocking
socket based NFC p2p client.

Cc: stable@vger.kernel.org
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 13:45:07 +02:00
Thierry Escande
f1b79dc891 NFC: Fix a potential memory leak
In nfc_llcp_tx_work() the sk_buff is not freed when the llcp_sock
is null and the PDU is an I one.

Signed-off-by: Thierry Escande <thierry.escande@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 13:45:06 +02:00
Thierry Escande
17f7ae16ae NFC: Keep socket alive until the DISC PDU is actually sent
This patch keeps the socket alive and therefore does not remove
it from the sockets list in the local until the DISC PDU has been
actually sent. Otherwise we would reply with DM PDUs before sending
the DISC one.

Signed-off-by: Thierry Escande <thierry.escande@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 13:45:05 +02:00
Thierry Escande
58e3dd1558 NFC: Rename nfc_llcp_disconnect() to nfc_llcp_send_disconnect()
nfc_llcp_send_disconnect() already exists but is not used.
nfc_llcp_disconnect() naming is not consistent with other PDU
sending functions.
This patch removes nfc_llcp_send_disconnect() and renames
nfc_llcp_disconnect()

Signed-off-by: Thierry Escande <thierry.escande@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 13:45:04 +02:00
Samuel Ortiz
be0856535c NFC: Add secure element enablement netlink API
Enabling or disabling an NFC accessible secure element through netlink
requires giving both an NFC controller and a secure element indexes.
Once enabled the secure element will handle card emulation once polling
starts.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 13:45:02 +02:00
Samuel Ortiz
c531c9ec29 NFC: Add secure element enablement internal API
Called via netlink, this API will enable or disable a specific secure
element. When a secure element is enabled, it will handle card emulation
and more generically ISO-DEP target mode, i.e. all target mode cases
except for p2p target mode.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 13:45:01 +02:00
Samuel Ortiz
ee656e9d09 NFC: Remove and free all SEs when releasing an NFC device
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 13:45:00 +02:00
Samuel Ortiz
2757c3723c NFC: Send netlink events for secure elements additions and removals
When an NFC driver or host controller stack discovers a secure element,
it will call nfc_add_se(). In order for userspace applications to use
these secure elements, a netlink event will then be sent with the SE
index and its type. With that information userspace applications can
decide wether or not to enable SEs, through their indexes.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 13:44:59 +02:00
Samuel Ortiz
fed7c25ec0 NFC: Add secure elements addition and removal API
This API will allow NFC drivers to add and remove the secure elements
they know about or detect. Typically this should be called (asynchronously
or not) from the driver or the host interface stack detect_se hook.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 13:44:58 +02:00
Samuel Ortiz
0a946301c2 NFC: Extend and fix the internal secure element API
Secure elements need to be discovered after enabling the NFC controller.
This is typically done by the NCI core and the HCI drivers (HCI does not
specify how to discover SEs, it is left to the specific drivers).
Also, the SE enable/disable API explicitely takes a SE index as its
argument.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 13:44:53 +02:00
Samuel Ortiz
0b456c418a NFC: Remove the static supported_se field
Supported secure elements are typically found during a discovery process
initiated when the NFC controller is up and running. For a given NFC
chipset there can be many configurations (embedded SE or not, with or
without a SIM card wired to the NFC controller SWP interface, etc...) and
thus driver code will never know before hand which SEs are available.
So we remove this field, it will be replaced by a real SE discovery
mechanism.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 13:44:19 +02:00
Frederic Danis
391d8a2da7 NFC: Add NCI over SPI receive
Before any operation, driver interruption is de-asserted to prevent
race condition between TX and RX.

Transaction starts by emitting "Direct read" and acknowledged mode
bytes. Then packet length is read allowing to allocate correct NCI
socket buffer. After that payload is retrieved.

A delay after the transaction can be added.
This delay is determined by the driver during nci_spi_allocate_device()
call and can be 0.

If acknowledged mode is set:
- CRC of header and payload is checked
- if frame reception fails (CRC error): NACK is sent
- if received frame has ACK or NACK flag: unblock nci_spi_send()

Payload is passed to NCI module.

At the end, driver interruption is re asserted.

Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 13:44:16 +02:00
Frederic Danis
ee9596d467 NFC: Add NCI over SPI send
Before any operation, driver interruption is de-asserted to prevent
race condition between TX and RX.

The NCI over SPI header is added in front of NCI packet.
If acknowledged mode is set, CRC-16-CCITT is added to the packet.
Then the packet is forwarded to SPI module to be sent.

A delay after the transaction is added.
This delay is determined by the driver during nci_spi_allocate_device()
call and can be 0.

After data has been sent, driver interruption is re-asserted.

If acknowledged mode is set, nci_spi_send will block until
acknowledgment is received.

Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 13:44:15 +02:00
Frederic Danis
8a00a61b0e NFC: Add basic NCI over SPI
The NFC Forum defines a transport interface based on
Serial Peripheral Interface (SPI) for the NFC Controller
Interface (NCI).

This module implements the SPI transport of NCI, calling SPI module
directly to read/write data to NFC controller (NFCC).

NFCC driver should provide functions performing device open and close.
It should also provide functions asserting/de-asserting interruption
to prevent TX/RX race conditions.
NFCC driver can also fix a delay between transactions if needed by
the hardware.

Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 13:44:03 +02:00
Neil Horman
c5c7774d7e sctp: fully initialize sctp_outq in sctp_outq_init
In commit 2f94aabd9f
(refactor sctp_outq_teardown to insure proper re-initalization)
we modified sctp_outq_teardown to use sctp_outq_init to fully re-initalize the
outq structure.  Steve West recently asked me why I removed the q->error = 0
initalization from sctp_outq_teardown.  I did so because I was operating under
the impression that sctp_outq_init would properly initalize that value for us,
but it doesn't.  sctp_outq_init operates under the assumption that the outq
struct is all 0's (as it is when called from sctp_association_init), but using
it in __sctp_outq_teardown violates that assumption. We should do a memset in
sctp_outq_init to ensure that the entire structure is in a known state there
instead.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: "West, Steve (NSN - US/Fort Worth)" <steve.west@nsn.com>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: netdev@vger.kernel.org
CC: davem@davemloft.net
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 18:05:24 -07:00
Rony Efraim
1d8faf48c7 net/core: Add VF link state control
Add netlink directives and ndo entry to allow for controling
VF link, which can be in one of three states:

Auto - VF link state reflects the PF link state (default)

Up - VF link state is up, traffic from VF to VF works even if
the actual PF link is down

Down - VF link state is down, no traffic from/to this VF, can be of
use while configuring the VF

Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 17:51:04 -07:00
Eric Dumazet
ca4ec90b31 htb: reorder struct htb_class fields for performance
htb_class structures are big, and source of false sharing on SMP.

By carefully splitting them in two parts, we can improve performance.

I got 9 % performance increase on a 24 threads machine, with 200
concurrent netperf in TCP_RR mode, using a HTB hierarchy of 4 classes.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 17:17:02 -07:00
Willem de Bruijn
5f121b9a83 net-rps: fixes for rps flow limit
Caught by sparse:
- __rcu: missing annotation to sd->flow_limit
- __user: direct access in cpumask_scnprintf

Also
- add endline character when printing bitmap if room in buffer
- avoid bucket overflow by reducing FLOW_LIMIT_HISTORY

The last item warrants some explanation. The hashtable buckets are
subject to overflow if FLOW_LIMIT_HISTORY is larger than or equal
to bucket size, since all packets may end up in a single bucket. The
current (rather arbitrary) history value of 256 happens to match the
buffer size (u8).

As a result, with a single flow, the first 128 packets are accepted
(correct), the second 128 packets dropped (correct) and then the
history[] array has filled, so that each subsequent new packet
causes an increment in the bucket for new_flow plus a decrement
for old_flow: a steady state.

This is fine if packets are dropped, as the steady state goes away
as soon as a mix of traffic reappears. But, because the 256th packet
overflowed the bucket to 0: no packets are dropped.

Instead of explicitly adding an overflow check, this patch changes
FLOW_LIMIT_HISTORY to never be able to overflow a single bucket.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
(first item)

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 17:13:05 -07:00
Samuel Ortiz
a395298c9c NFC: HCI: Follow a positive code path in the HCI ops implementations
Exiting on the error case is more typical to the kernel coding style.

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 00:26:10 +02:00
Eric Lapuyade
9a695d23aa NFC: HCI: Implement fw_upload ops
This is a simple forward to the HCI driver. When driver is done with the
operation, it shall directly notify NFC Core by calling
nfc_fw_upload_done().

Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 00:26:09 +02:00
Eric Lapuyade
9674da8759 NFC: Add firmware upload netlink command
As several NFC chipsets can have their firmwares upgraded and
reflashed, this patchset adds a new netlink command to trigger
that the driver loads or flashes a new firmware. This will allows
userspace triggered firmware upgrade through netlink.
The firmware name or hint is passed as a parameter, and the driver
will eventually fetch the firmware binary through the request_firmware
API.
The cmd can only be executed when the nfc dev is not in use. Actual
firmware loading/flashing is an asynchronous operation. Result of the
operation shall send a new event up to user space through the nfc dev
multicast socket. During operation, the nfc dev is not openable and
thus not usable.

Signed-off-by: Eric Lapuyade <eric.lapuyade@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 00:26:08 +02:00
Frederic Danis
1095e69f47 NFC: NCI: Fix skb->dev usage
skb->dev is used for carrying a net_device pointer and not
an nci_dev pointer.

Remove usage of skb-dev to carry nci_dev and replace it by parameter
in nci_recv_frame(), nci_send_frame() and driver send() functions.

NfcWilink driver is also updated to use those functions.

Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2013-06-14 00:25:53 +02:00
Johan Hedberg
59f45d576a Bluetooth: Fix conditions for HCI_Delete_Stored_Link_Key
Even though the HCI_Delete_Stored_Link_Key command is mandatory for 1.1
and later controllers some controllers do not seem to support it
properly as was witnessed by one Broadcom based controller:

< HCI Command: Delete Stored Link Key (0x03|0x0012) plen 7
    bdaddr 00:00:00:00:00:00 all 1
> HCI Event: Command Complete (0x0e) plen 4
    Delete Stored Link Key (0x03|0x0012) ncmd 1
    status 0x11 deleted 0
    Error: Unsupported Feature or Parameter Value

Luckily this same controller also doesn't list the command in its
supported commands bit mask (counting from 0 bit 7 of octet 6):

< HCI Command: Read Local Supported Commands (0x04|0x0002) plen 0
> HCI Event: Command Complete (0x0e) plen 68
    Read Local Supported Commands (0x04|0x0002) ncmd 1
    status 0x00
    Commands: ffffffffffff1ffffffffffff30fffff3f

Therefore, it makes sense to move sending of HCI_Delete_Stored_Link_Key
to after receiving the supported commands response and to only send it
if its respective bit in the mask is set. The downside of this is that
we no longer send the HCI_Delete_Stored_Link_Key command for Bluetooth
1.1 controllers since HCI_Read_Local_Supported_Command was introduced in
version 1.2, but this is an acceptable penalty as the command in
question shouldn't affect critical behavior.

Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Tested-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-13 13:05:40 -04:00
Anderson Lizardo
300b962e52 Bluetooth: Fix crash in l2cap_build_cmd() with small MTU
If a too small MTU value is set with ioctl(HCISETACLMTU) or by a bogus
controller, memory corruption happens due to a memcpy() call with
negative length.

Fix this crash on either incoming or outgoing connections with a MTU
smaller than L2CAP_HDR_SIZE + L2CAP_CMD_HDR_SIZE:

[   46.885433] BUG: unable to handle kernel paging request at f56ad000
[   46.888037] IP: [<c03d94cd>] memcpy+0x1d/0x40
[   46.888037] *pdpt = 0000000000ac3001 *pde = 00000000373f8067 *pte = 80000000356ad060
[   46.888037] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
[   46.888037] Modules linked in: hci_vhci bluetooth virtio_balloon i2c_piix4 uhci_hcd usbcore usb_common
[   46.888037] CPU: 0 PID: 1044 Comm: kworker/u3:0 Not tainted 3.10.0-rc1+ #12
[   46.888037] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[   46.888037] Workqueue: hci0 hci_rx_work [bluetooth]
[   46.888037] task: f59b15b0 ti: f55c4000 task.ti: f55c4000
[   46.888037] EIP: 0060:[<c03d94cd>] EFLAGS: 00010212 CPU: 0
[   46.888037] EIP is at memcpy+0x1d/0x40
[   46.888037] EAX: f56ac1c0 EBX: fffffff8 ECX: 3ffffc6e EDX: f55c5cf2
[   46.888037] ESI: f55c6b32 EDI: f56ad000 EBP: f55c5c68 ESP: f55c5c5c
[   46.888037]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[   46.888037] CR0: 8005003b CR2: f56ad000 CR3: 3557d000 CR4: 000006f0
[   46.888037] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[   46.888037] DR6: ffff0ff0 DR7: 00000400
[   46.888037] Stack:
[   46.888037]  fffffff8 00000010 00000003 f55c5cac f8c6a54c ffffffff f8c69eb2 00000000
[   46.888037]  f4783cdc f57f0070 f759c590 1001c580 00000003 0200000a 00000000 f5a88560
[   46.888037]  f5ba2600 f5a88560 00000041 00000000 f55c5d90 f8c6f4c7 00000008 f55c5cf2
[   46.888037] Call Trace:
[   46.888037]  [<f8c6a54c>] l2cap_send_cmd+0x1cc/0x230 [bluetooth]
[   46.888037]  [<f8c69eb2>] ? l2cap_global_chan_by_psm+0x152/0x1a0 [bluetooth]
[   46.888037]  [<f8c6f4c7>] l2cap_connect+0x3f7/0x540 [bluetooth]
[   46.888037]  [<c019b37b>] ? trace_hardirqs_off+0xb/0x10
[   46.888037]  [<c01a0ff8>] ? mark_held_locks+0x68/0x110
[   46.888037]  [<c064ad20>] ? mutex_lock_nested+0x280/0x360
[   46.888037]  [<c064b9d9>] ? __mutex_unlock_slowpath+0xa9/0x150
[   46.888037]  [<c01a118c>] ? trace_hardirqs_on_caller+0xec/0x1b0
[   46.888037]  [<c064ad08>] ? mutex_lock_nested+0x268/0x360
[   46.888037]  [<c01a125b>] ? trace_hardirqs_on+0xb/0x10
[   46.888037]  [<f8c72f8d>] l2cap_recv_frame+0xb2d/0x1d30 [bluetooth]
[   46.888037]  [<c01a0ff8>] ? mark_held_locks+0x68/0x110
[   46.888037]  [<c064b9d9>] ? __mutex_unlock_slowpath+0xa9/0x150
[   46.888037]  [<c01a118c>] ? trace_hardirqs_on_caller+0xec/0x1b0
[   46.888037]  [<f8c754f1>] l2cap_recv_acldata+0x2a1/0x320 [bluetooth]
[   46.888037]  [<f8c491d8>] hci_rx_work+0x518/0x810 [bluetooth]
[   46.888037]  [<f8c48df2>] ? hci_rx_work+0x132/0x810 [bluetooth]
[   46.888037]  [<c0158979>] process_one_work+0x1a9/0x600
[   46.888037]  [<c01588fb>] ? process_one_work+0x12b/0x600
[   46.888037]  [<c015922e>] ? worker_thread+0x19e/0x320
[   46.888037]  [<c015922e>] ? worker_thread+0x19e/0x320
[   46.888037]  [<c0159187>] worker_thread+0xf7/0x320
[   46.888037]  [<c0159090>] ? rescuer_thread+0x290/0x290
[   46.888037]  [<c01602f8>] kthread+0xa8/0xb0
[   46.888037]  [<c0656777>] ret_from_kernel_thread+0x1b/0x28
[   46.888037]  [<c0160250>] ? flush_kthread_worker+0x120/0x120
[   46.888037] Code: c3 90 8d 74 26 00 e8 63 fc ff ff eb e8 90 55 89 e5 83 ec 0c 89 5d f4 89 75 f8 89 7d fc 3e 8d 74 26 00 89 cb 89 c7 c1 e9 02 89 d6 <f3> a5 89 d9 83 e1 03 74 02 f3 a4 8b 5d f4 8b 75 f8 8b 7d fc 89
[   46.888037] EIP: [<c03d94cd>] memcpy+0x1d/0x40 SS:ESP 0068:f55c5c5c
[   46.888037] CR2: 00000000f56ad000
[   46.888037] ---[ end trace 0217c1f4d78714a9 ]---

Signed-off-by: Anderson Lizardo <anderson.lizardo@openbossa.org>
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-13 13:05:39 -04:00
Eric Dumazet
d3b6f61418 ip_tunnel: remove __net_init/exit from exported functions
If CONFIG_NET_NS is not set then __net_init is the same as __init and
__net_exit is the same as __exit. These functions will be removed from
memory after the module loads or is removed. Functions that are exported
for use by other functions should never be labeled for removal.

Bug introduced by commit c544193214
("GRE: Refactor GRE tunneling code.")

Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 03:00:59 -07:00
Ilan Peer
38745c7414 mac80211: Fix VHT bandwidth change event
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-13 11:58:47 +02:00
Alexander Bondar
817cee7675 mac80211: track AP's beacon rate and give it to the driver
Track the AP's beacon rate in the scan BSS data and in the
interface configuration to let the drivers know which rate
the AP is using. This information may be used by drivers,
in our case to let the firmware optimise beacon RX.

Signed-off-by: Alexander Bondar <alexander.bondar@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-13 11:58:47 +02:00
Saurabh Mohan
baafc77b32 net/ipv4: ip_vti clear skb cb before tunneling.
If users apply shaper to vti tunnel then it will cause a kernel crash. The
problem seems to be due to the vti_tunnel_xmit function not clearing
skb->opt field before passing the packet to xfrm tunneling code.

Signed-off-by: Saurabh Mohan <saurabh@vyatta.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 02:47:46 -07:00
Yuchung Cheng
85f16525a2 tcp: properly send new data in fast recovery in first RTT
Linux sends new unset data during disorder and recovery state if all
(suspected) lost packets have been retransmitted ( RFC5681, section
3.2 step 1 & 2, RFC3517 section 4, NexSeg() Rule 2).  One requirement
is to keep the receive window about twice the estimated sender's
congestion window (tcp_rcv_space_adjust()), assuming the fast
retransmits repair the losses in the next round trip.

But currently it's not the case on the first round trip in either
normal or Fast Open connection, beucase the initial receive window
is identical to (expected) sender's initial congestion window. The
fix is to double it.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 02:46:29 -07:00
Guillaume Nault
a6f79d0f26 l2tp: Fix sendmsg() return value
PPPoL2TP sockets should comply with the standard send*() return values
(i.e. return number of bytes sent instead of 0 upon success).

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 02:39:04 -07:00
Guillaume Nault
55b92b7a11 l2tp: Fix PPP header erasure and memory leak
Copy user data after PPP framing header. This prevents erasure of the
added PPP header and avoids leaking two bytes of uninitialised memory
at the end of skb's data buffer.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 02:39:04 -07:00
Joe Perches
fe2c6338fd net: Convert uses of typedef ctl_table to struct ctl_table
Reduce the uses of this unnecessary typedef.

Done via perl script:

$ git grep --name-only -w ctl_table net | \
  xargs perl -p -i -e '\
	sub trim { my ($local) = @_; $local =~ s/(^\s+|\s+$)//g; return $local; } \
        s/\b(?<!struct\s)ctl_table\b(\s*\*\s*|\s+\w+)/"struct ctl_table " . trim($1)/ge'

Reflow the modified lines that now exceed 80 columns.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 02:36:09 -07:00
Flavio Leitner
194f4a6df2 net: make all team port device link events urgent
Since team functionality relies heavily on userspace daemon, we need to
deliver event to userspace via Netlink as quick as possible. So make all
team port device link events urgent.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 02:31:41 -07:00
Daniel Borkmann
2dc85bf323 packet: packet_getname_spkt: make sure string is always 0-terminated
uaddr->sa_data is exactly of size 14, which is hard-coded here and
passed as a size argument to strncpy(). A device name can be of size
IFNAMSIZ (== 16), meaning we might leave the destination string
unterminated. Thus, use strlcpy() and also sizeof() while we're
at it. We need to memset the data area beforehand, since strlcpy
does not padd the remaining buffer with zeroes for user space, so
that we do not possibly leak anything.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 01:38:36 -07:00
Wu Fengguang
a06a2d378d net: ping_check_bind_addr() etc. can be static
net/ipv4/ping.c:286:5: sparse: symbol 'ping_check_bind_addr' was not declared. Should it be static?
net/ipv4/ping.c:355:6: sparse: symbol 'ping_set_saddr' was not declared. Should it be static?
net/ipv4/ping.c:370:6: sparse: symbol 'ping_clear_saddr' was not declared. Should it be static?

net/ipv6/ping.c:60:5: sparse: symbol 'dummy_ipv6_recv_error' was not declared. Should it be static?
net/ipv6/ping.c:64:5: sparse: symbol 'dummy_ip6_datagram_recv_ctl' was not declared. Should it be static?
net/ipv6/ping.c:69:5: sparse: symbol 'dummy_icmpv6_err_convert' was not declared. Should it be static?
net/ipv6/ping.c:73:6: sparse: symbol 'dummy_ipv6_icmp_error' was not declared. Should it be static?
net/ipv6/ping.c:75:5: sparse: symbol 'dummy_ipv6_chk_addr' was not declared. Should it be static?
net/ipv6/ping.c:201:5: sparse: symbol 'ping_v6_seq_show' was not declared. Should it be static?

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 01:36:41 -07:00
Ben Greear
e562078a19 mac80211: Ensure tid_start_tx is protected by sta->lock
All accesses of the tid_start_tx lock should be protected
by sta->lock if there is any chance that another thread
could still be accessing the sta object.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-13 10:27:07 +02:00
David S. Miller
e86c986137 Included change:
- fix "rtnl locked" concurrent executions by using rtnl_lock instead of
   rtnl_trylock. This fix enables batman-adv initialisation to do not fail just
   because somewhere else in the system another code path is holding the rtnl
   lock. It is easy to see the problem when batman-adv is trying to start
   together with other networking components.
 - fix the routing protocol forwarding policy by enhancing the duplicate control
   packet detection. When the right circumstances trigger the issue, some nodes in
   the network become totally unreachable, so breaking the mesh connectivity.
 - fix the Bridge Loop Avoidance component by not running the originator address
   change handling routine when the component is disabled. The routine was
   generating useless packets that were sent over the network.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABCAAGBQJRtkFtAAoJEADl0hg6qKeOBHUP/1Ni8juzvoQwR/gYjq0p2lHM
 LvkZ/KI74QP96rfDog+nhP9fcOgjSt2MF+sSa6is92RavrKFZQyO2J/GKkrHu2HA
 gaLKxN6S28Hi2qVbVXgSqT9RQ3XvpzIaojtNvb0tC1onzGzLtd6V5FURIq0FRHvN
 RvP4w+1HwH2CsQlgjQq1OPwUllVqTUzGYH0fl/U+0mw7h+q0ZWCA1IZln/t08xjl
 ViCCydbD1Th2tgK7uzFg8X3EJZN1CkrBWflb7X3YK5zeps1NC+l4OuUOK+K2L+fx
 vCMu603FXKi+SjM24d+eGJx6kQPCapYThIrp1qy43SLkNazRIAmgbZpndme0QP/8
 eSUozWAusWIESJI3Krneh3i70agMeg2MK4nAp51z54j52urDlOGURyNf7TkieaT4
 Vti5QG0poXncIb1XQ+yaKDCORwkn18QjmtfNmCCgT2YF91pOSYCrlgONi65K6DIs
 F4eDk7sTgHAIgYO/XEet/V5p06SO86ksF/C13Dqug64s3rkw9ejqgLZBEy3OH1AF
 IFgws3qE6GiSiXLMiiheplBYD51au+V1Jihqvw/lo2JzlOw4PRNRYsaQgVaUH/MJ
 jupEjA8V0swMtDIi6ixcPE/P60OJR41VuT8gVGWbrTKnHZ0yyIfgZwPcLZaQ2X0e
 EIlTJdtS7lVpleZ2C/H1
 =oGCV
 -----END PGP SIGNATURE-----

Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge

Included change:
- fix "rtnl locked" concurrent executions by using rtnl_lock instead of
  rtnl_trylock. This fix enables batman-adv initialisation to do not fail just
  because somewhere else in the system another code path is holding the rtnl
  lock. It is easy to see the problem when batman-adv is trying to start
  together with other networking components.
- fix the routing protocol forwarding policy by enhancing the duplicate control
  packet detection. When the right circumstances trigger the issue, some nodes in
  the network become totally unreachable, so breaking the mesh connectivity.
- fix the Bridge Loop Avoidance component by not running the originator address
  change handling routine when the component is disabled. The routine was
  generating useless packets that were sent over the network.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 01:26:54 -07:00
Simon Horman
2c928e0e8d sctp: Correct byte order of access to skb->{network, transport}_header
Corrects an byte order conflict introduced by
158874cac6
("sctp: Correct access to skb->{network, transport}_header").
The values in question are host byte order.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 01:21:50 -07:00
Gao feng
ca15febfe9 netlink: make compare exist all the time
Commit da12c90e09
"netlink: Add compare function for netlink_table"
only set compare at the time we create kernel netlink,
and reset compare to NULL at the time we finially
release netlink socket, but netlink_lookup wants
the compare exist always.

So we should set compare after we allocate nl_table,
and never reset it. make comapre exist all the time.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-13 00:45:48 -07:00
Linus Torvalds
26e04462c8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking update from David Miller:

 1) Fix dump iterator in nfnl_acct_dump() and ctnl_timeout_dump() to
    dump all objects properly, from Pablo Neira Ayuso.

 2) xt_TCPMSS must use the default MSS of 536 when no MSS TCP option is
    present.  Fix from Phil Oester.

 3) qdisc_get_rtab() looks for an existing matching rate table and uses
    that instead of creating a new one.  However, it's key matching is
    incomplete, it fails to check to make sure the ->data[] array is
    identical too.  Fix from Eric Dumazet.

 4) ip_vs_dest_entry isn't fully initialized before copying back to
    userspace, fix from Dan Carpenter.

 5) Fix ubuf reference counting regression in vhost_net, from Jason
    Wang.

 6) When sock_diag dumps a socket filter back to userspace, we have to
    translate it out of the kernel's internal representation first.
    From Nicolas Dichtel.

 7) davinci_mdio holds a spinlock while calling pm_runtime, which
    sleeps.  Fix from Sebastian Siewior.

 8) Timeout check in sh_eth_check_reset is off by one, from Sergei
    Shtylyov.

 9) If sctp socket init fails, we can NULL deref during cleanup.  Fix
    from Daniel Borkmann.

10) netlink_mmap() does not propagate errors properly, from Patrick
    McHardy.

11) Disable powersave and use minstrel by default in ath9k.  From Sujith
    Manoharan.

12) Fix a regression in that SOCK_ZEROCOPY is not set on tuntap sockets
    which prevents vhost from being able to use zerocopy.  From Jason
    Wang.

13) Fix race between port lookup and TX path in team driver, from Jiri
    Pirko.

14) Missing length checks in bluetooth L2CAP packet parsing, from Johan
    Hedberg.

15) rtlwifi fails to connect to networking using any encryption method
    other than WPA2.  Fix from Larry Finger.

16) Fix iwlegacy build due to incorrect CONFIG_* ifdeffing for power
    management stuff.  From Yijing Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits)
  b43: stop format string leaking into error msgs
  ath9k: Use minstrel rate control by default
  Revert "ath9k_hw: Update rx gain initval to improve rx sensitivity"
  ath9k: Disable PowerSave by default
  net: wireless: iwlegacy: fix build error for il_pm_ops
  rtlwifi: Fix a false leak indication for PCI devices
  wl12xx/wl18xx: scan all 5ghz channels
  wl12xx: increase minimum singlerole firmware version required
  wl12xx: fix minimum required firmware version for wl127x multirole
  rtlwifi: rtl8192cu: Fix problem in connecting to WEP or WPA(1) networks
  mwifiex: debugfs: Fix out of bounds array access
  Bluetooth: Fix mgmt handling of power on failures
  Bluetooth: Fix missing length checks for L2CAP signalling PDUs
  Bluetooth: btmrvl: support Marvell Bluetooth device SD8897
  Bluetooth: Fix checks for LE support on LE-only controllers
  team: fix checks in team_get_first_port_txable_rcu()
  team: move add to port list before port enablement
  team: check return value of team_get_port_by_index_rcu() for NULL
  tuntap: set SOCK_ZEROCOPY flag during open
  netlink: fix error propagation in netlink_mmap()
  ...
2013-06-12 17:18:29 -07:00
Johannes Berg
661eb3811d mac80211: fix TX aggregation TID struct leak
Ben reports that kmemleak is saying TX aggregation TID
structs are leaked. Given his workload, I suspect that
they're leaked because stations are destroyed before
their aggregation sessions get a chance to start. Fix
this by simply freeing structs that are not used yet.

Reported-by: Ben Greear <greearb@candelatech.com>
Tested-by:  Ben Greear <greearb@candelatech.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-13 00:16:37 +02:00
Eric Dumazet
7c0cadc69c udp: fix two sparse errors
commit ba418fa357 ("soreuseport: UDP/IPv4 implementation")
added following sparse errors :

net/ipv4/udp.c:433:60: warning: cast from restricted __be16
net/ipv4/udp.c:433:60: warning: incorrect type in argument 1 (different base types)
net/ipv4/udp.c:433:60:    expected unsigned short [unsigned] [usertype] val
net/ipv4/udp.c:433:60:    got restricted __be16 [usertype] sport
net/ipv4/udp.c:433:60: warning: cast from restricted __be16
net/ipv4/udp.c:433:60: warning: cast from restricted __be16
net/ipv4/udp.c:514:60: warning: cast from restricted __be16
net/ipv4/udp.c:514:60: warning: incorrect type in argument 1 (different base types)
net/ipv4/udp.c:514:60:    expected unsigned short [unsigned] [usertype] val
net/ipv4/udp.c:514:60:    got restricted __be16 [usertype] sport
net/ipv4/udp.c:514:60: warning: cast from restricted __be16
net/ipv4/udp.c:514:60: warning: cast from restricted __be16

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12 15:03:24 -07:00
Eric Dumazet
5b9b626377 gro: remove a sparse error
Fix following sparse error :

net/ipv4/af_inet.c:1410:59: warning: restricted __be16 degrades to
integer

added in commit db8caf3dbc
("gro: should aggregate frames without DF")

Reported-by: kbuild test robot <fengguang.wu@intel.com>
From: Eric Dumazet <edumazet@google.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12 15:03:24 -07:00
David S. Miller
4a2e667ac1 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
This pull request is intended for the 3.11 stream...

One big highlight is the cw1200 driver the ST-E CW1100 & CW1200
WLAN chipsets.  This one has been lingering for a while, lacking
some review comments.  Once started getting pulled into linux-next,
it got a bit more attention and a number of improvements were made
over the initial cut.  No doubt there will be more changes ahead,
but I think it is looking alright at this point.

Along with that, there is the usual flurry of updates to the mac80211
core and the iwlwifi, mwifiex, ath9k, rt2x00, wil6210, and other
drivers.  A few of the highlights are some rt2x00 refactoring/cleanup
by Gabor Juhos, some rt2800 hardware support enhancements by Stanislaw
Gruszka, some iwlwifi power management updates from Alexander Bondar,
some enhanced bcma SPROM support from Rafał Miłecki, and a variety
of other things here and there.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12 14:23:41 -07:00
Eric Dumazet
c70eba7453 igmp: fix new sparse errors
Fix following sparse errors :

net/ipv4/igmp.c:1222:25: warning: cast from restricted __be32
net/ipv4/igmp.c🔢31: warning: incorrect type in assignment (different address spaces)
net/ipv4/igmp.c🔢31:    expected struct ip_mc_list [noderef] <asn:4>*next_hash
net/ipv4/igmp.c🔢31:    got struct ip_mc_list *<noident>
net/ipv4/igmp.c:1250:31: warning: incorrect type in assignment (different address spaces)
net/ipv4/igmp.c:1250:31:    expected struct ip_mc_list [noderef] <asn:4>*next_hash
net/ipv4/igmp.c:1250:31:    got struct ip_mc_list *<noident>
net/ipv4/igmp.c:2380:37: warning: cast from restricted __be32

These were added by commit e989707135
("igmp: hash a hash table to speedup ip_check_mc_rcu()")

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12 14:14:55 -07:00
John W. Linville
812fd64596 Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Conflicts:
	drivers/net/wireless/iwlwifi/mvm/mac80211.c
2013-06-12 15:39:05 -04:00
John W. Linville
861bca265e Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
Conflicts:
	drivers/net/wireless/ath/ath9k/Kconfig
	net/mac80211/iface.c
2013-06-12 14:35:23 -04:00
John W. Linville
6bb7aabf73 Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 2013-06-12 14:26:48 -04:00
Linus Torvalds
8d7a8fe2ce Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull ceph fixes from Sage Weil:
 "There is a pair of fixes for double-frees in the recent bundle for
  3.10, a couple of fixes for long-standing bugs (sleep while atomic and
  an endianness fix), and a locking fix that can be triggered when osds
  are going down"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  rbd: fix cleanup in rbd_add()
  rbd: don't destroy ceph_opts in rbd_add()
  ceph: ceph_pagelist_append might sleep while atomic
  ceph: add cpu_to_le32() calls when encoding a reconnect capability
  libceph: must hold mutex for reset_changed_osds()
2013-06-12 08:28:19 -07:00
John W. Linville
42d887a680 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2013-06-12 10:57:04 -04:00
Johan Hedberg
96570ffcca Bluetooth: Fix mgmt handling of power on failures
If hci_dev_open fails we need to ensure that the corresponding
mgmt_set_powered command gets an appropriate response. This patch fixes
the missing response by adding a new mgmt_set_powered_failed function
that's used to indicate a power on failure to mgmt. Since a situation
with the device being rfkilled may require special handling in user
space the patch uses a new dedicated mgmt status code for this.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Cc: stable@vger.kernel.org
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-12 10:20:55 -04:00
Johan Hedberg
cb3b3152b2 Bluetooth: Fix missing length checks for L2CAP signalling PDUs
There has been code in place to check that the L2CAP length header
matches the amount of data received, but many PDU handlers have not been
checking that the data received actually matches that expected by the
specific PDU. This patch adds passing the length header to the specific
handler functions and ensures that those functions fail cleanly in the
case of an incorrect amount of data.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-12 10:20:54 -04:00
Johan Hedberg
757aee0f71 Bluetooth: Fix checks for LE support on LE-only controllers
LE-only controllers do not support extended features so any kind of host
feature bit checks do not make sense for them. This patch fixes code
used for both single-mode (LE-only) and dual-mode (BR/EDR/LE) to use the
HCI_LE_ENABLED flag instead of the "Host LE supported" feature bit for
LE support tests.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-06-12 10:20:54 -04:00
Phil Oester
b396966c46 netfilter: xt_TCPMSS: Fix missing fragmentation handling
Similar to commit bc6bcb59 ("netfilter: xt_TCPOPTSTRIP: fix
possible mangling beyond packet boundary"), add safe fragment
handling to xt_TCPMSS.

Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-12 11:06:19 +02:00
Phil Oester
70d19f805f netfilter: xt_TCPMSS: Fix IPv6 default MSS too
As a followup to commit 409b545a ("netfilter: xt_TCPMSS: Fix violation
of RFC879 in absence of MSS option"), John Heffner points out that IPv6
has a higher MTU than IPv4, and thus a higher minimum MSS. Update TCPMSS
target to account for this, and update RFC comment.

While at it, point to more recent reference RFC1122 instead of RFC879.

Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-12 11:04:41 +02:00
Daniel Borkmann
7a6e288d27 pktgen: ipv6: numa: consolidate skb allocation to pktgen_alloc_skb
We currently allow for numa-node aware skb allocation only within the
fill_packet_ipv4() path, but not in fill_packet_ipv6(). Consolidate that
code to a common allocation helper to enable numa-node aware skb
allocation for ipv6, and use it in both paths. This also makes both
functions a bit more readable.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12 00:47:25 -07:00
Daniel Borkmann
da5bab079f net: udp4: move GSO functions to udp_offload
Similarly to TCP offloading and UDPv6 offloading, move all related
UDPv4 functions to udp_offload.c to make things more explicit. Also,
by this, we can make those functions static.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12 00:47:25 -07:00
Shawn Bohrer
946d3bd723 igmp: remove unnecessary in_device member zeroing
ip_mc_init_dev() is passed a freshly kzalloc'd in_device so it is
unnecessary to explicitly zero out the members.

Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12 00:41:15 -07:00
Eric Dumazet
e989707135 igmp: hash a hash table to speedup ip_check_mc_rcu()
After IP route cache removal, multicast applications using
a lot of multicast addresses hit a O(N) behavior in ip_check_mc_rcu()

Add a per in_device hash table to get faster lookup.

This hash table is created only if the number of items in mc_list is
above 4.

Reported-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12 00:25:23 -07:00
Eric Dumazet
64153ce0a7 net_sched: htb: do not setup default rate estimators
With a thousand htb classes, est_timer() spends ~5 million cpu cycles
and throws out cpu cache, because each htb class has a default
rate estimator (est 4sec 16sec).

Most users do not use default rate estimators, so switch htb
to not setup ones.

Add a module parameter (htb_rate_est) so that users relying
on this default rate estimator can revert the behavior.

echo 1 >/sys/module/sch_htb/parameters/htb_rate_est

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-12 00:14:21 -07:00
Simon Wunderlich
795d855d56 mac80211: Fix rate control mask matching call
The order of parameters was mixed up, introduced in commit
"mac80211: improve the rate control API"

Cc: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-12 09:12:43 +02:00
Simon Wunderlich
a6b368f6ca mac80211: abort CAC in stop_ap()
When a CAC is running and stop_ap is called (e.g. when hostapd is killed
while performing CAC), the CAC must be aborted immediately.
Otherwise ieee80211_stop_ap() will try to stop it when it's too late -
wdev->channel is already NULL and the abort event can not be generated.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-12 09:12:43 +02:00
Johannes Berg
35d865afbb mac80211: work around broken APs not including HT info
There are some APs, notably 2G/3G/4G Wifi routers, specifically the
"Onda PN51T", "Vodafone PocketWiFi 2", "ZTE MF60" and a similar
T-Mobile branded device [1] that erroneously don't include all the
needed information in (re)association response frames. Work around
this by assuming the information is the same as it was in the
beacon or probe response and using the data from there instead.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=58881.

[1] https://bbs.archlinux.org/viewtopic.php?pid=1277305

Note that this requires marking the first ieee802_11_parse_elems()
argument const, otherwise we'd get a compiler warning.

Cc: stable@vger.kernel.org
Reported-and-tested-by: Michal Zajac <manwe@manwe.pl>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-12 09:11:54 +02:00
Eric Dumazet
130d3d68b5 net_sched: psched_ratecfg_precompute() improvements
Before allowing 64bits bytes rates, refactor
psched_ratecfg_precompute() to get better comments
and increased accuracy.

rate_bps field is renamed to rate_bytes_ps, as we only
have to worry about bytes per second.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11 22:39:47 -07:00
John W. Linville
3899ba90a4 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
Conflicts:
	drivers/net/wireless/ath/ath9k/debug.c
	net/mac80211/iface.c
2013-06-11 14:48:32 -04:00
Johannes Berg
940d0ac9db cfg80211: fix rtnl leak in wiphy dump error cases
In two wiphy dump error cases, most often when the dump allocation
must be increased, the RTNL is leaked. This quickly results in a
complete system lockup. Release the RTNL correctly.

Reported-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11 16:52:39 +02:00
Antonio Quartulli
ea141b75ae nl80211: allow sending CMD_FRAME without specifying any frequency
Users may want to send a frame on the current channel
without specifying it.

This is particularly useful for the correct implementation
of the IBSS/RSN support in wpa_supplicant which requires to
receive and send AUTH frames.

Make mgmt_tx pass a NULL channel to the driver if none has
been specified by the user.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11 15:01:36 +02:00
Antonio Quartulli
f7aeb6fb1a mac80211: make mgmt_tx accept a NULL channel
cfg80211 passes a NULL channel to mgmt_tx if the frame has
to be sent on the one currently in use by the device.
Make the implementation of mgmt_tx correctly handle this
case. Fail if offchan is required.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
[fix RCU locking]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11 15:01:24 +02:00
Jouni Malinen
3d124ea27a cfg80211: fix VHT TDLS peer AID verification
I (Johannes) accidentally applied the first version of the patch
("Allow TDLS peer AID to be configured for VHT"). Now apply just
the changes between v1 and v2 to get the AID verification and
prefer the new attribute over the old one.

Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11 14:36:43 +02:00
Ashok Nagarajan
ffb3cf3000 {nl,mac,cfg}80211: Allow user to configure basic rates for mesh
Currently mesh uses mandatory rates as the default basic rates. Allow basic
rates to be configured during mesh join. Basic rates are applied only if
channel is also provided with mesh join command.

Signed-off-by: Ashok Nagarajan <ashok@cozybit.com>
[some whitespace fixes, refuse basic rates w/o channel]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11 14:24:36 +02:00
Colleen Twitty
66de671374 mac80211: expire mesh peers based on mesh configuration
The time it takes to see the peer link expire may differ
by a minute since sta_expire() is run once a minute as a
mesh housekeeping task.

Signed-off-by: Colleen Twitty <colleen@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11 14:16:29 +02:00
Colleen Twitty
8e7c053853 {nl,cfg}80211: make peer link expiration time configurable
If a STA has a peer that it hasn't seen any tx activity
from for a certain length of time, the peer link is
expired. This means the inactive STA is removed from the
list of peers and that STA is not considered a peer again
unless it re-peers.  Previously, this inactivity time was
always 30 minutes.  Now, add it to the mesh configuration
and allow it to be configured.  Retain 30 minutes as a
default value.

Signed-off-by: Colleen Twitty <colleen@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11 14:16:29 +02:00
Thomas Pedersen
ecccd072b0 mac80211: fix mesh deadlock
The patch "cfg80211/mac80211: use cfg80211 wdev mutex in
mac80211" introduced several deadlocks by converting the
ifmsh->mtx to wdev->mtx. Solve these by:

1. drop the cancel_work_sync() in ieee80211_stop_mesh().
   Instead make the mesh work conditional on whether the mesh
   is running or not.
2. lock the mesh work with sdata_lock() to protect beacon
   updates and prevent races with wdev->mesh_id_len or
   cfg80211.

Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-11 13:14:42 +02:00
Patrick McHardy
7cdbac71f9 netlink: fix error propagation in netlink_mmap()
Return the error if something went wrong instead of unconditionally
returning 0.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11 02:52:47 -07:00
Eric Dumazet
45203a3b38 net_sched: add 64bit rate estimators
struct gnet_stats_rate_est contains u32 fields, so the bytes per second
field can wrap at 34360Mbit.

Add a new gnet_stats_rate_est64 structure to get 64bit bps/pps fields,
and switch the kernel to use this structure natively.

This structure is dumped to user space as a new attribute :

TCA_STATS_RATE_EST64

Old tc command will now display the capped bps (to 34360Mbit), instead
of wrapped values, and updated tc command will display correct
information.

Old tc command output, after patch :

eric:~# tc -s -d qd sh dev lo
qdisc pfifo 8001: root refcnt 2 limit 1000p
 Sent 80868245400 bytes 1978837 pkt (dropped 0, overlimits 0 requeues 0)
 rate 34360Mbit 189696pps backlog 0b 0p requeues 0

This patch carefully reorganizes "struct Qdisc" layout to get optimal
performance on SMP.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11 02:51:03 -07:00
Daniel Borkmann
1abd165ed7 net: sctp: fix NULL pointer dereference in socket destruction
While stress testing sctp sockets, I hit the following panic:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
PGD 7cead067 PUD 7ce76067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: sctp(F) libcrc32c(F) [...]
CPU: 7 PID: 2950 Comm: acc Tainted: GF            3.10.0-rc2+ #1
Hardware name: Dell Inc. PowerEdge T410/0H19HD, BIOS 1.6.3 02/01/2011
task: ffff88007ce0e0c0 ti: ffff88007b568000 task.ti: ffff88007b568000
RIP: 0010:[<ffffffffa0490c4e>]  [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
RSP: 0018:ffff88007b569e08  EFLAGS: 00010292
RAX: 0000000000000000 RBX: ffff88007db78a00 RCX: dead000000200200
RDX: ffffffffa049fdb0 RSI: ffff8800379baf38 RDI: 0000000000000000
RBP: ffff88007b569e18 R08: ffff88007c230da0 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff880077990d00 R14: 0000000000000084 R15: ffff88007db78a00
FS:  00007fc18ab61700(0000) GS:ffff88007fc60000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 000000007cf9d000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffff88007b569e38 ffff88007db78a00 ffff88007b569e38 ffffffffa049fded
 ffffffff81abf0c0 ffff88007db78a00 ffff88007b569e58 ffffffff8145b60e
 0000000000000000 0000000000000000 ffff88007b569eb8 ffffffff814df36e
Call Trace:
 [<ffffffffa049fded>] sctp_destroy_sock+0x3d/0x80 [sctp]
 [<ffffffff8145b60e>] sk_common_release+0x1e/0xf0
 [<ffffffff814df36e>] inet_create+0x2ae/0x350
 [<ffffffff81455a6f>] __sock_create+0x11f/0x240
 [<ffffffff81455bf0>] sock_create+0x30/0x40
 [<ffffffff8145696c>] SyS_socket+0x4c/0xc0
 [<ffffffff815403be>] ? do_page_fault+0xe/0x10
 [<ffffffff8153cb32>] ? page_fault+0x22/0x30
 [<ffffffff81544e02>] system_call_fastpath+0x16/0x1b
Code: 0c c9 c3 66 2e 0f 1f 84 00 00 00 00 00 e8 fb fe ff ff c9 c3 66 0f
      1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 <48>
      8b 47 20 48 89 fb c6 47 1c 01 c6 40 12 07 e8 9e 68 01 00 48
RIP  [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
 RSP <ffff88007b569e08>
CR2: 0000000000000020
---[ end trace e0d71ec1108c1dd9 ]---

I did not hit this with the lksctp-tools functional tests, but with a
small, multi-threaded test program, that heavily allocates, binds,
listens and waits in accept on sctp sockets, and then randomly kills
some of them (no need for an actual client in this case to hit this).
Then, again, allocating, binding, etc, and then killing child processes.

This panic then only occurs when ``echo 1 > /proc/sys/net/sctp/auth_enable''
is set. The cause for that is actually very simple: in sctp_endpoint_init()
we enter the path of sctp_auth_init_hmacs(). There, we try to allocate
our crypto transforms through crypto_alloc_hash(). In our scenario,
it then can happen that crypto_alloc_hash() fails with -EINTR from
crypto_larval_wait(), thus we bail out and release the socket via
sk_common_release(), sctp_destroy_sock() and hit the NULL pointer
dereference as soon as we try to access members in the endpoint during
sctp_endpoint_free(), since endpoint at that time is still NULL. Now,
if we have that case, we do not need to do any cleanup work and just
leave the destruction handler.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11 02:49:05 -07:00
Peter Pan(潘卫平)
b41abb42bf net: pass correct parameter to skb_headers_offset_update()
Since commit 1a37e412a022(net: Use 16bits for *_headers fields of struct
skbuff), skb->*_header are relative to skb->head,
so copy_skb_header() should not call skb_headers_offset_update() now,
and we should pass correct parameter to skb_headers_offset_update() in
pskb_expand_head() and skb_copy_expand().

Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11 02:48:06 -07:00
Gao feng
da12c90e09 netlink: Add compare function for netlink_table
As we know, netlink sockets are private resource of
net namespace, they can communicate with each other
only when they in the same net namespace. this works
well until we try to add namespace support for other
subsystems which use netlink.

Don't like ipv4 and route table.., it is not suited to
make these subsytems belong to net namespace, Such as
audit and crypto subsystems,they are more suitable to
user namespace.

So we must have the ability to make the netlink sockets
in same user namespace can communicate with each other.

This patch adds a new function pointer "compare" for
netlink_table, we can decide if the netlink sockets can
communicate with each other through this netlink_table
self-defined compare function.

The behavior isn't changed if we don't provide the compare
function for netlink_table.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11 02:39:42 -07:00
Vlad Yasevich
867a59436f bridge: Add a flag to control unicast packet flood.
Add a flag to control flood of unicast traffic.  By default, flood is
on and the bridge will flood unicast traffic if it doesn't know
the destination.  When the flag is turned off, unicast traffic
without an FDB will not be forwarded to the specified port.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11 02:04:32 -07:00
Vlad Yasevich
9ba18891f7 bridge: Add flag to control mac learning.
Allow user to control whether mac learning is enabled on the port.
By default, mac learning is enabled.  Disabling mac learning will
cause new dynamic FDB entries to not be created for a particular port.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11 02:04:32 -07:00
Nicolas Dichtel
ed13998c31 sock_diag: fix filter code sent to userspace
Filters need to be translated to real BPF code for userland, like SO_GETFILTER.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10 22:23:32 -07:00
Cong Wang
30f3a40f9a net: remove last caller of skb_tail_offset() and itself
Similar to the following commits:

commit 00f97da17a (netpoll: fix position of network header)
commit 525cebedb3 (pktgen: Fix position of ip and udp header)

using skb_tail_offset() seems not correct since the offset
is based on head pointer.

With the last caller removed, skb_tail_offset() can be killed
finally.

Cc: Thomas Graf <tgraf@suug.ch>
Cc: Daniel Borkmann <dborkmann@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10 22:22:23 -07:00
Eliezer Tamir
d30e383bb8 tcp: add low latency socket poll support.
Adds low latency socket poll support for TCP.
In tcp_v[46]_rcv() add a call to sk_mark_ll() to copy the napi_id
from the skb to the sk.
In tcp_recvmsg(), when there is no data in the socket we busy-poll.
This is a good example of how to add busy-poll support to more protocols.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Tested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10 21:22:36 -07:00
Eliezer Tamir
a5b50476f7 udp: add low latency socket poll support
Add upport for busy-polling on UDP sockets.
In __udp[46]_lib_rcv add a call to sk_mark_ll() to copy the napi_id
from the skb into the sk.
This is done at the earliest possible moment, right after we identify
which socket this skb is for.
In __skb_recv_datagram When there is no data and the user
tries to read we busy poll.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Tested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10 21:22:36 -07:00
Eliezer Tamir
0602129286 net: add low latency socket poll
Adds an ndo_ll_poll method and the code that supports it.
This method can be used by low latency applications to busy-poll
Ethernet device queues directly from the socket code.
sysctl_net_ll_poll controls how many microseconds to poll.
Default is zero (disabled).
Individual protocol support will be added by subsequent patches.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Tested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10 21:22:35 -07:00
Eliezer Tamir
af12fa6e46 net: add napi_id and hash
Adds a napi_id and a hashing mechanism to lookup a napi by id.
This will be used by subsequent patches to implement low latency
Ethernet device polling.
Based on a code sample by Eric Dumazet.

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10 21:22:35 -07:00
Linus Torvalds
1b79821fe7 zero copy error fix
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 Comment: GPGTools - http://gpgtools.org
 
 iQIcBAABAgAGBQJRtjfAAAoJEDZk62b0Tg6xEHcP/jtZbupCJBVmjvI5zBDjn4NT
 RnOoaZLz4ks1OTj2WkhkdMe7/dNh+Dlq/w9F9nw4jOZhkSN90cgDZJqK/otHqh3R
 GpXEwNS5NLysTu9qlA2Kd49C77KynafEUtFR4k0zSB/q8Uh+g8D480bps1xBeIUx
 ATlBSuYPy6SZM1PWqgllnP7IzATQtaHJSQ8uXCkfSneG+dj+13Fq9LngH7DCpKA6
 T4bU8Mx8kXlR0RcVtjjqFAf7TmOe05XG/WhgsaX5gGcpxuZ5FajCaIZeuD1rQ63K
 jLKA56VxUGa5FU6ZBZH0I+52xhdceByJdFcTW8X8fhXU6m82NYhKnMv64QZ+UjT0
 F2vNlzW7SkYvnMkjvxor/JLyi0OKIrMfElPYcuMW8pU5AVV9Hzf0M1NHLCeM2CNn
 JoghVzOPcUSistYldAJkivCcGi9sEgNadTsBrU1TM8FwkaHboUXprNf8xn2SRm1V
 iD/25/voPqbsVPtDSO9YsVEi6NldLx07yQwJy34jJ1D5Qw4MPtfPlUTLT17ikCc4
 kjY1fiov+HOcEeCse1X3IxqEB3RK/KTR6NUqOk1yY2DlLxU1ZhJfrpelOTDV9iXc
 5Ra71ebsBYb2gQqk6+do/IUxKMiLBGrt5l2KLq2ZdhWYn8Fr8O7B5gaX9BH8aOwH
 L5ROAeZxkgSJBjuKMcsi
 =+vsA
 -----END PGP SIGNATURE-----

Merge tag '9p-3.10-bug-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs

Pull net/9p bug fix from Eric Van Hensbergen:
 "zero copy error fix"

* tag '9p-3.10-bug-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  net/9p: Handle error in zero copy request correctly for 9p2000.u
2013-06-10 17:35:25 -07:00
Pablo Neira Ayuso
ed82c43732 netfilter: xt_TCPOPTSTRIP: don't use tcp_hdr()
In (bc6bcb5 netfilter: xt_TCPOPTSTRIP: fix possible mangling beyond
packet boundary), the use of tcp_hdr was introduced. However, we
cannot assume that skb->transport_header is set for non-local packets.

Cc: Florian Westphal <fw@strlen.de>
Reported-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-11 01:55:07 +02:00
David S. Miller
d88210910a Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
The following patchset contains four fixes for Netfilter and one fix
for IPVS, they are:

* Fix data leak to user-space via getsockopt IP_VS_SO_GET_DESTS, from
  Dan Carpenter.

* Fix xt_TCPMSS if no TCP MSS is specified in syn packets, to avoid the
  violation of RFC879, from Phil Oester.

* Fix incomplete dump of objects via nfnetlink_acct and nfnetlink_cttimeout,
  from myself.

* Fix missing HW protocol in packets passed to user-space via NFQUEUE,
  from myself.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-10 13:30:33 -07:00
Dan Carpenter
a8241c6351 ipvs: info leak in __ip_vs_get_dest_entries()
The entry struct has a 2 byte hole after ->port and another 4 byte
hole after ->stats.outpkts.  You must have CAP_NET_ADMIN in your
namespace to hit this information leak.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-10 14:53:00 +02:00
Simon Wunderlich
d5b4c93e67 batman-adv: Don't handle address updates when bla is disabled
The bridge loop avoidance has a hook to handle address updates of the
originator. These should not be handled when bridge loop avoidance is
disabled - it might send some bridge loop avoidance packets which should
not appear if bla is disabled.

Signed-off-by: Simon Wunderlich <simon@open-mesh.com>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2013-06-10 08:42:18 +02:00
Simon Wunderlich
7c24bbbeab batman-adv: forward late OGMs from best next hop
When a packet is received from another node first and later from the
best next hop, this packet is dropped. However the first OGM was sent
with the BATADV_NOT_BEST_NEXT_HOP flag and thus dropped by neighbors.
The late OGM from the best neighbor is then dropped because it is a
duplicate.

If this situation happens constantly, a node might end up not forwarding
the "valid" OGMs anymore, and nodes behind will starve from not getting
valid OGMs.

Fix this by refining the duplicate checking behaviour: The actions
should depend on whether it was a duplicate for a neighbor only or for
the originator. OGMs which are not duplicates for a specific neighbor
will now be considered in batadv_iv_ogm_forward(), but only actually
forwarded for the best next hop. Therefore, late OGMs from the best
next hop are forwarded now and not dropped as duplicates anymore.

Signed-off-by: Simon Wunderlich <simon@open-mesh.com>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2013-06-10 08:42:17 +02:00
Matthias Schiffer
ac16d1484e batman-adv: wait for rtnl in batadv_store_mesh_iface instead of failing if it is taken
The rtnl_lock in batadv_store_mesh_iface has been converted to a rtnl_trylock
some time ago to avoid a possible deadlock between rtnl and s_active on removal
of the sysfs nodes.

The behaviour introduced by that was quite confusing as it could lead to the
sysfs store to fail, making batman-adv setup scripts unreliable. As recently the
sysfs removal was postponed to a worker not running with the rtnl taken, the
deadlock can't occur any more and it is safe to change the trylock back to a
lock to make the sysfs store reliable again.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Reviewed-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2013-06-10 08:42:16 +02:00
Greg Kroah-Hartman
38a4671cad Merge 3.10-rc5 into char-misc-next 2013-06-08 22:34:53 -07:00
Pablo Neira Ayuso
c05cdb1b86 netlink: allow large data transfers from user-space
I can hit ENOBUFS in the sendmsg() path with a large batch that is
composed of many netlink messages. Here that limit is 8 MBytes of
skbuff data area as kmalloc does not manage to get more than that.

While discussing atomic rule-set for nftables with Patrick McHardy,
we decided to put all rule-set updates that need to be applied
atomically in one single batch to simplify the existing approach.
However, as explained above, the existing netlink code limits us
to a maximum of ~20000 rules that fit in one single batch without
hitting ENOBUFS. iptables does not have such limitation as it is
using vmalloc.

This patch adds netlink_alloc_large_skb() which is only used in
the netlink_sendmsg() path. It uses alloc_skb if the memory
requested is <= one memory page, that should be the common case
for most subsystems, else vmalloc for higher memory allocations.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-07 16:26:34 -07:00
Eric Dumazet
40edeff6e1 net_sched: qdisc_get_rtab() must check data[] array
qdisc_get_rtab() should check not only the keys in struct tc_ratespec,
but also the full data[] array.

"tc ... linklayer atm " only perturbs values in the 256 slots array.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-07 15:24:04 -07:00
Daniel Borkmann
28850dc7c7 net: tcp: move GRO/GSO functions to tcp_offload
Would be good to make things explicit and move those functions to
a new file called tcp_offload.c, thus make this similar to tcpv6_offload.c.
While moving all related functions into tcp_offload.c, we can also
make some of them static, since they are only used there. Also, add
an explicit registration function.

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-07 14:39:05 -07:00
Daniel Borkmann
5ee9859157 net: minor: tcp: use tcp_skb_mss helper in tcp_tso_segment
We have the minimal inline helper tcp_skb_mss to access
skb_shinfo(skb)->gso_size, so also use it here to get mss.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-07 14:39:05 -07:00
Pablo Neira Ayuso
7b8dfe289f netfilter: nfnetlink_queue: fix missing HW protocol
Locally generated IPv4 and IPv6 traffic gets skb->protocol unset,
thus passing zero.

ip6tables -I OUTPUT -j NFQUEUE
libmnl/examples/netfilter# ./nf-queue 0 &
ping6 ::1
packet received (id=1 hw=0x0000 hook=3)
                         ^^^^^^

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-07 18:55:20 +02:00
David S. Miller
93a306aef5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Merge 'net' into 'net-next' to get the MSG_CMSG_COMPAT
regression fix.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-06 23:39:26 -07:00
Trond Myklebust
9ec2ef53b9 SUNRPC: Remove redundant call to rpc_set_running() in __rpc_execute()
The RPC_TASK_RUNNING flag will always have been set in rpc_make_runnable()
once we get past the test for out_of_line_wait_on_bit() returning
ERESTARTSYS.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-06 16:24:40 -04:00
Trond Myklebust
0053a8e65c SUNRPC: Remove unused function rpc_queue_empty
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-06 16:24:39 -04:00
Trond Myklebust
a76580fbf0 SUNRPC: Fix a potential race in rpc_execute
If the rpc_task is asynchronous, it could theoretically finish executing
on the workqueue it was assigned by rpc_make_runnable() before we get
round to testing RPC_IS_ASYNC() in rpc_execute.

In practice, however, all the existing callers hold a reference to the
rpc_task, so this can't happen today...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-06 16:24:38 -04:00
Andy Lutomirski
a7526eb5d0 net: Unbreak compat_sys_{send,recv}msg
I broke them in this commit:

    commit 1be374a051
    Author: Andy Lutomirski <luto@amacapital.net>
    Date:   Wed May 22 14:07:44 2013 -0700

        net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg

This patch adds __sys_sendmsg and __sys_sendmsg as common helpers that accept
MSG_CMSG_COMPAT and blocks MSG_CMSG_COMPAT at the syscall entrypoints.  It
also reverts some unnecessary checks in sys_socketcall.

Apparently I was suffering from underscore blindness the first time around.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Tested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-06 11:52:14 -07:00
David S. Miller
143554ace8 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Conflicts:
	net/netfilter/nf_log.c

The conflict in nf_log.c is that in 'net' we added CONFIG_PROC_FS
protection around foo_proc_entry() calls to fix a build failure,
whereas in Pablo's tree a guard if() test around a call is
remove_proc_entry() was removed.  Trivially resolved.

Pablo Neira Ayuso says:

====================
The following patchset contains the first batch of
Netfilter/IPVS updates for your net-next tree, they are:

* Three patches with improvements and code refactorization
  for nfnetlink_queue, from Florian Westphal.

* FTP helper now parses replies without brackets, as RFC1123
  recommends, from Jeff Mahoney.

* Rise a warning to tell everyone about ULOG deprecation,
  NFLOG has been already in the kernel tree for long time
  and supersedes the old logging over netlink stub, from
  myself.

* Don't panic if we fail to load netfilter core framework,
  just bail out instead, from myself.

* Add cond_resched_rcu, used by IPVS to allow rescheduling
  while walking over big hashtables, from Simon Horman.

* Change type of IPVS sysctl_sync_qlen_max sysctl to avoid
  possible overflow, from Zhang Yanfei.

* Use strlcpy instead of strncpy to skip zeroing of already
  initialized area to write the extension names in ebtables,
  from Chen Gang.

* Use already existing per-cpu notrack object from xt_CT,
  from Eric Dumazet.

* Save explicit socket lookup in xt_socket now that we have
  early demux, also from Eric Dumazet.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-06 01:03:06 -07:00
Fan Du
4c4d41f200 xfrm: add LINUX_MIB_XFRMACQUIREERROR statistic counter
When host ping its peer, ICMP echo request packet triggers IPsec
policy, then host negotiates SA secret with its peer. After IKE
installed SA for OUT direction, but before SA for IN direction
installed, host get ICMP echo reply from its peer. At the time
being, the SA state for IN direction could be XFRM_STATE_ACQ,
then the received packet will be dropped after adding
LINUX_MIB_XFRMINSTATEINVALID statistic.

Adding a LINUX_MIB_XFRMACQUIREERROR statistic counter for such
scenario when SA in larval state is much clearer for user than
LINUX_MIB_XFRMINSTATEINVALID which indicates the SA is totally
bad.

Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-06-06 06:45:55 +02:00
David S. Miller
6bc19fb82d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Merge 'net' bug fixes into 'net-next' as we have patches
that will build on top of them.

This merge commit includes a change from Emil Goode
(emilgoode@gmail.com) that fixes a warning that would
have been introduced by this merge.  Specifically it
fixes the pingv6_ops method ipv6_chk_addr() to add a
"const" to the "struct net_device *dev" argument and
likewise update the dummy_ipv6_chk_addr() declaration.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-05 16:37:30 -07:00
Andy Shevchenko
4cd5773a2a net: core: move mac_pton() to lib/net_utils.c
Since we have at least one user of this function outside of CONFIG_NET
scope, we have to provide this function independently. The proposed
solution is to move it under lib/net_utils.c with corresponding
configuration variable and select wherever it is needed.

Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-05 12:00:27 -07:00
Phil Oester
409b545ac1 netfilter: xt_TCPMSS: Fix violation of RFC879 in absence of MSS option
The clamp-mss-to-pmtu option of the xt_TCPMSS target can cause issues
connecting to websites if there was no MSS option present in the
original SYN packet from the client. In these cases, it may add a
MSS higher than the default specified in RFC879. Fix this by never
setting a value > 536 if no MSS option was specified by the client.

This closes netfilter's bugzilla #662.

Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-05 13:59:22 +02:00
Florian Westphal
7f87712c01 netfilter: nfnetlink_queue: only add CAP_LEN attr when needed
CAP_LEN contains the size of the network packet we're queueing to
userspace, i.e. normally it is the same as the NFQA_PAYLOAD attribute len.

Include it only in the unlikely case when NFQA_PAYLOAD is truncated due
to copy_range limitations.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-05 12:40:54 +02:00
Florian Westphal
9cefbbc9c8 netfilter: nfnetlink_queue: cleanup copy_range usage
For every packet queued, we check if configured copy_range
is 0, and treat that as 'copy entire packet'.

We can move this check to the queue configuration, and can
set copy_range appropriately.

Also, convert repetitive '0xffff - NLA_HDRLEN' to a macro.

[ queue initialization still used 0xffff, although its harmless
  since the initial setting is overwritten on queue config ]

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-05 12:40:28 +02:00
Pablo Neira Ayuso
37bc4f8dfa netfilter: nfnetlink_cttimeout: fix incomplete dumping of objects
Fix broken incomplete object dumping if the list of objects does not
fit into one single netlink message.

Reported-by: Gabriel Lazar <Gabriel.Lazar@com.utcluj.ro>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-05 12:36:37 +02:00
Pablo Neira Ayuso
991a6b735f netfilter: nfnetlink_acct: fix incomplete dumping of objects
Fix broken incomplete object dumping if the list of objects does not
fit into one single netlink message.

Reported-by: Gabriel Lazar <Gabriel.Lazar@com.utcluj.ro>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-06-05 12:36:36 +02:00
Linus Torvalds
4d3797d7e1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix timeouts with direct mode authentication in mac80211, from
    Stanislaw Gruszka.

 2) Aggregation sessions can deadlock in ath9k, from Felix Fietkau.

 3) Netfilter's xt_addrtype doesn't work with ipv6 due to route lookups
    creating undesirable cache entries, from Florian Westphal.

 4) Fix netfilter's ipt_ULOG from generating non-NULL terminated
    strings.

 5) Fix netdev transmit queue crashes in mac80211, from Johannes Berg.

 6) Fix copy and paste error in 802.11 stack that broke reporting of
    64-bit station tx statistics, from Felix Fietkau.

 7) When qlge_probe fails, it leaks the netdev.  Fix from Wei Yongjun.

 8) SKB control block (where we store the IP options information,
    amongst other things) must be cleared properly otherwise ICMP
    sending can crash for IP tunnels.  Fix from Eric Dumazet.

 9) Verification of Energy Efficient Ether support was coded wrongly,
    the test was inversed.  Fix from Giuseppe CAVALLARO.

10) TCP handles redirects improperly because the wrong flow key is used
    for the route lookup.  From Michal Kubecek.

11) Don't interpret MSG_CMSG_COMPAT from userspace, fix from Andy
    Lutomirski.

12) The new AF_VSOCK was missing from the lockdep string table, fix from
    Federico Vaga.

13) be2net doesn't handle checksumming of IP fragments properly, from
    Somnath Kotur.

14) Fix several bugs in the device address list code that lead to
    crashes and other misbehaviors.  From Jay Vosburgh.

15) Fix ipv6 segmentation handling of fragmented GRE tunnel traffic,
    from Pravin B Shalr.

16) Fix usage of stale policies in IPSEC layer, from Paul Moore.

17) Fix team driver dump of ports when there are a large number of them,
    from Jiri Pirko.

18) Fix softlockups in UDP ipv4 socket lookup causes by and error in the
    hlist_nulls_for_each_entry_rcu() macro.  From Eric Dumazet.

19) Fix several regressions added by the high rate accuracy changes to
    the htb packet scheduler.  From Eric Dumazet.

20) Fix DMA'ing onto the stack in esd_usb2 and peak_usb CAN drivers,
    from Olivier Sobrie and Marc Kleine-Budde.

21) Fix unremovable network devices due to missing route pointer
    installation in the per-device ipv6 address list entries.  From Gao
    feng.

22) Apply the tg3 5719 DMA workaround on 5720 chips as well, otherwise
    we get stalls.  From Nithin Sujir.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (68 commits)
  net_sched: htb: do not mix 1ns and 64ns time units
  net: fix sk_buff head without data area
  tg3: Add read dma workaround for 5720
  net: ethernet: xilinx_emaclite: set protocol selector bits when writing ANAR
  bnx2x: Fix bridged GSO for 57710/57711 chips
  net: fec: add fallback to random MAC address
  bnx2x: fix TCP offload for tunneling ipv4 over ipv6
  ipv6: assign rt6_info to inet6_ifaddr in init_loopback
  net/mlx4_core: Keep VF assigned MAC in the PF admin table
  net/mlx4_en: Handle unassigned VF MAC address correctly
  net/mlx4_core: Return -EPROBE_DEFER when a VF is probed before PF is sufficiently initialized
  net/mlx4_en: Fix adaptive moderation cq update
  net: can: peak_usb: Do not do dma on the stack
  net: can: esd_usb2: Do not do dma on the stack
  net: can: kvaser_usb: fix reception on "USBcan Pro" and "USBcan R" type hardware.
  net_sched: restore "overhead xxx" handling
  net: force a reload of first item in hlist_nulls_for_each_entry_rcu
  hyperv: Fix vlan_proto setting in netvsc_recv_callback()
  team: fix port list dump for big number of ports
  list: introduce list_first_entry_or_null
  ...
2013-06-05 19:19:04 +09:00
Johannes Berg
780b40df12 wireless: fix kernel-doc
Some kernel-doc fixes for forgotten fields and renamed things.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-05 09:33:32 +02:00
Alexander Bondar
989c6505cd mac80211: Use suitable semantics for beacon availability indication
Currently beacon availability upon association is marked by have_beacon
flag of assoc_data structure that becomes unavailable when association
completes. However beacon availability indication is required also after
association to inform a driver. Currently dtim_period parameter is used
for this purpose. Move have_beacon flag to another structure, persistant
throughout a interface's life cycle. Use suitable sematics for beacon
availability indication.

Signed-off-by: Alexander Bondar <alexander.bondar@intel.com>
[fix another instance of BSS_CHANGED_DTIM_PERIOD in docs]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-05 09:12:20 +02:00
Alexander Bondar
482a9c74fa mac80211: fix powersave bug and clean up ieee80211_rx_bss_info
ieee80211_rx_bss_info() deals with dtim_period setting and PS update
when associated. Move all these to another locations cleaning this
function. Also, the current implementation is buggy because when it
calls ieee80211_recalc_ps() bss_conf->dtim_period is notset properly
yet and thus nothing will happen.

Signed-off-by: Alexander Bondar <alexander.bondar@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-05 08:52:47 +02:00
Eric Dumazet
5343a7f8be net_sched: htb: do not mix 1ns and 64ns time units
commit 56b765b79 ("htb: improved accuracy at high rates") added another
regression for low rates, because it mixes 1ns and 64ns time units.

So the maximum delay (mbuffer) was not 60 second, but 937 ms.

Lets convert all time fields to 1ns as 64bit arches are becoming the
norm.

Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 17:44:07 -07:00
Amerigo Wang
00f97da17a netpoll: fix position of network header
Similar to the problem in pktgen, netpoll uses skb_tail_offset()
too, as the code is copied from pktgen.

Also use return values of skb_put() directly, this will simiplify
the code.

Reported-by: Thomas Graf <tgraf@suug.ch>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Daniel Borkmann <dborkmann@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 17:37:04 -07:00
Thomas Graf
525cebedb3 pktgen: Fix position of ip and udp header
skb_set_network_header() expects an offset based on the data pointer
whereas skb_tail_offset() also includes the headroom. This resulted
in the ip header being written in a wrong location.

Use return values of skb_put() directly and rely on skb->len to
set mac, network, and transport header.

Cc: Simon Horman <horms@verge.net.au>
Cc: Daniel Borkmann <dborkmann@redhat.com>
Assisted-by: Daniel Borkmann <dborkmann@redhat.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Daniel Borkmann <dborkmann@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 17:35:45 -07:00
Pablo Neira
5e71d9d77c net: fix sk_buff head without data area
Eric Dumazet spotted that we have to check skb->head instead
of skb->data as skb->head points to the beginning of the
data area of the skbuff. Similarly, we have to initialize the
skb->head pointer, not skb->data in __alloc_skb_head.

After this fix, netlink crashes in the release path of the
sk_buff, so let's fix that as well.

This bug was introduced in (0ebd0ac net: add function to
allocate sk_buff head without data area).

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 17:26:49 -07:00
Yan Burman
600fed5e97 net/ethtool: Fix comment regarding location of dev_ethtool() call
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 17:10:03 -07:00
Cong Wang
c26d6b46da ping: always initialize ->sin6_scope_id and ->sin6_flowinfo
If we don't need scope id, we should initialize it to zero.
Same for ->sin6_flowinfo.

Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 16:58:42 -07:00
Gao feng
534c877928 ipv6: assign rt6_info to inet6_ifaddr in init_loopback
Commit 25fb6ca4ed
"net IPv6 : Fix broken IPv6 routing table after loopback down-up"
forgot to assign rt6_info to the inet6_ifaddr.
When disable the net device, the rt6_info which allocated
in init_loopback will not be destroied in __ipv6_ifa_notify.

This will trigger the waring message below
[23527.916091] unregister_netdevice: waiting for tap0 to become free. Usage count = 1

Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 16:57:41 -07:00
Baruch Siach
430f03cde2 net: mark netdev_create_hash __net_init
netdev_create_hash() is only called from netdev_init() which is marked
__net_init.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 16:47:27 -07:00
Jean Sacren
4960c2c6fa Kconfig: remove dangling references to the deleted file
Commit 202dc3fc59 (Documentation: remove
obsolete networking/multicast.txt file) deleted the obsolete file. After
the file has been removed, clean up a couple of places where references
to the deleted file were made so that users wouldn't be confused when
they consult the Help menu.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 15:17:39 -07:00
Jean Sacren
ebd4687af7 xfrm: simplify the exit path of xfrm_output_one()
Clean up unnecessary assignment and jump. While there, fix up the label
name.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 15:17:38 -07:00
Johannes Berg
9b881963c1 cfg80211: make wiphy index start at 0 again
The change to use atomic_inc_return() for assigning the wiphy
index made the first wiphy index 1 instead of 0. This is fine,
but we all habitually type "phy0" when we're testing, so make
it go back to 0 instead of 1 by subtracting 1 from the index.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-04 22:28:23 +02:00
Johannes Berg
256c90dedf cfg80211: fix potential deadlock regression
My big locking cleanups caused a problem by registering the
rfkill instance with the RTNL held, while the callback also
acquires the RTNL. This potentially causes a deadlock since
the two locks used (rfkill mutex and RTNL) can be acquired
in two different orders. Fix this by (un)registering rfkill
without holding the RTNL. This needs to be done after the
device struct is registered, but that can also be done w/o
holding the RTNL.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-04 22:22:41 +02:00
Lorenzo Colitti
d862e54614 net: ipv6: Implement /proc/net/icmp6.
The format is based on /proc/net/icmp and /proc/net/{udp,raw}6.

Compiles and displays reasonable results with CONFIG_IPV6={n,m,y}
Couldn't figure out how to test without CONFIG_PROC_FS enabled.

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 12:56:14 -07:00
Lorenzo Colitti
8cc785f6f4 net: ipv4: make the ping /proc code AF-independent
Introduce a ping_seq_afinfo structure (similar to its UDP
equivalent) and use it to make some of the ping /proc functions
address-family independent. Rename the remaining ping /proc
functions from ping_* to ping_v4_*.

Compiles and displays reasonable results with CONFIG_IPV6={n,m,y}

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 12:56:14 -07:00
Lorenzo Colitti
17ef66afc0 net: ipv6: Unify {raw,udp}6_sock_seq_show.
udp6_sock_seq_show and raw6_sock_seq_show are identical, except
the UDP version displays ports and the raw version displays the
protocol. Refactor most of the code in these two functions into
a new common ip6_dgram_sock_seq_show function, in preparation
for using it to display ICMPv6 sockets as well.

Also reduce the indentation in parts of include/net/transp_v6.h
to improve readability.

Compiles and displays reasonable results with CONFIG_IPV6={n,m,y}

Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-04 12:56:14 -07:00
Johannes Berg
3430140ad9 regulatory: use proper enum return value
get_reg_request_treatment() returns 0 in one case but is
defined to return an enum, use the proper value REG_REQ_OK.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-04 14:35:07 +02:00
Johannes Berg
ceca7b7121 cfg80211: separate internal SME implementation
The current internal SME implementation in cfg80211 is
very mixed up with the MLME handling, which has been
causing issues for a long time. There are three things
that the implementation has to provide:
 * a basic SME implementation for nl80211's connect()
   call (for drivers implementing auth/assoc, which is
   really just mac80211) and wireless extensions
 * MLME events for the userspace SME
 * SME events (connected, disconnected etc.) for all
   different SME implementation possibilities (driver,
   cfg80211 and userspace)

To achieve these goals it isn't necessary to track the
software SME's connection status outside of it's state
(which is the part that caused many issues.) Instead,
track it only in the SME data (wdev->conn) and in the
general case only track whether the wdev is connected
or not (via wdev->current_bss.)

Also separate the internal implementation to not have
callbacks from the SME events, but rather call it from
the API functions that the driver (or rather mac80211)
calls. This separates the code better.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-04 13:03:11 +02:00
Johannes Berg
6ff57cf888 cfg80211/mac80211: clean up cfg80211 SME APIs
Do some cleanups in the cfg80211 SME APIs, which are
only used by mac80211.

Most of these functions get a frame passed, and there
isn't really any reason to export multiple functions
as cfg80211 can check the frame type instead, do that.

Additionally, the API functions have confusing names
like cfg80211_send_...() which was meant to indicate
that it sends an event to userspace, but gets a bit
confusing when there's both TX and RX and they're not
all clearly labeled.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-04 13:03:10 +02:00
Pontus Fuchs
ff40b425f0 mac80211: set IEEE80211_TX_CTL_REQ_TX_STATUS on nullframes
The connection monitor needs to know the tx status of
nullframes to work properly.

Signed-off-by: Pontus Fuchs <pontus.fuchs@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-04 12:52:39 +02:00
Johannes Berg
9c90a9f64c nl80211: remove bogus genlmsg_end() error checking
genlmsg_end() can't return an error since it returns the
skb length so remove checks treating the return value as
an error code.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-04 12:47:08 +02:00
Felix Fietkau
d6d23de278 mac80211: add a tx control flag to indicate PS-Poll/uAPSD response
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-04 12:41:36 +02:00
Johannes Berg
964dc9e2c3 cfg80211: take WoWLAN support information out of wiphy struct
There's no need to take up the space for devices that don't
support WoWLAN, and most drivers can even make the support
data static const (except where it's modified at runtime.)

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-03 18:43:34 +02:00
Jacob Minshall
e05ecccdf7 mac80211: set mesh formation field properly
Cap max peerings at 63 in accordance with IEEE-2012 8.4.2.100.7.
Triggers a beacon regeneration every time the number of peerings changes.
Previously this would only happen if the "accepting peerings" bit changed.

Signed-off-by: Jacob Minshall <jacob@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-03 17:03:18 +02:00
Thomas Pedersen
866403a7bd mac80211: don't check local mesh TTL on TX
nl80211 has already verified the mesh TTL on setting the
mesh config, so no need to check it again in mac80211.

Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-03 16:53:51 +02:00
Johannes Berg
ed405be5cb mac80211: fix sdata locking around __ieee80211_request_smps
My cfg80211/mac80211 locking unification broke the sdata
locking in ieee80211_set_power_mgmt, it needs to acquire
the lock for __ieee80211_request_smps(). Add the locking.

Reported-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2013-06-03 13:51:59 +02:00
Cong Wang
9a99d4a50c icmp: avoid allocating large struct on stack
struct icmp_bxm is a large struct, reduce stack usage
by allocating it on heap.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-03 00:28:44 -07:00
Rami Rosen
08578d8d4e ] icmp: fix icmp_unreach() comment.
ICMP_PARAMETERPROB is handled by icmp_unreach(); This patch adds
ICMP_PARAMETERPROB to the list of ICMP message types handled by icmp_unreach().

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-03 00:27:15 -07:00
Timo Teräs
5aad1de5ea ipv4: use separate genid for next hop exceptions
commit 13d82bf5 (ipv4: Fix flushing of cached routing informations)
added the support to flush learned pmtu information.

However, using rt_genid is quite heavy as it is bumped on route
add/change and multicast events amongst other places. These can
happen quite often, especially if using dynamic routing protocols.

While this is ok with routes (as they are just recreated locally),
the pmtu information is learned from remote systems and the icmp
notification can come with long delays. It is worthy to have separate
genid to avoid excessive pmtu resets.

Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-03 00:07:43 -07:00
Timo Teräs
f016229e30 ipv4: rate limit updating of next hop exceptions with same pmtu
The tunnel devices call update_pmtu for each packet sent, this causes
contention on the fnhe_lock. Ignore the pmtu update if pmtu is not
actually changed, and there is still plenty of time before the entry
expires.

Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-03 00:07:43 -07:00