Commit Graph

37373 Commits

Author SHA1 Message Date
WANG Cong
4cbcdf2b6c fou: always use be16 for port
udp_config.local_udp_port is be16. And iproute2 passes
network order for FOU_ATTR_PORT.

This doesn't fix any bug, just for consistency.

Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12 21:25:13 -04:00
WANG Cong
67270636a8 fou: exit early when parsing config fails
Not a big deal, just for corretness.

Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12 21:25:13 -04:00
WANG Cong
9272f04872 fou: avoid calling udp_del_offload() twice
This fixes the following harmless warning:

./ip/ip fou del port 7777
[  122.907516] udp_del_offload: didn't find offload for port 7777

Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12 21:25:13 -04:00
Eric Dumazet
52db70dca5 tcp: do not cache align timewait sockets
With recent adoption of skc_cookie in struct sock_common,
struct tcp_timewait_sock size increased from 192 to 200 bytes
on 64bit arches. SLAB rounds then to 256 bytes.

It is time to drop SLAB_HWCACHE_ALIGN constraint for twsk_slab.

This saves about 12 MB of memory on typical configuration reaching
262144 timewait sockets, and has no noticeable impact on performance.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12 21:16:05 -04:00
David S. Miller
4e78eb0dbf There isn't much left, but we have
* new mac80211 internal software queue to allow drivers to have
    shorter hardware queues and pull on-demand
  * use rhashtable for mac80211 station table
  * minstrel rate control debug improvements and some refactoring
  * fix noisy message about TX power reduction
  * fix continuous message printing and activity if CRDA doesn't respond
  * fix VHT-related capabilities with "iw connect" or "iwconfig ..."
  * fix Kconfig for cfg80211 wireless extensions compatibility
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJVJ7CPAAoJEDBSmw7B7bqr8+IQAKCAbUyd6PFRT5tcz9kW5GCW
 /ibb+n1e14yWKgNEe1gddUGKG/L3HGCBXNkCYzR2M8mlL7dLPqspaBcGHS4dx8F4
 D0AuikqvtXIxfAXi0zmU2uo7rH7u2X34R2LtS8AlKByD+jmFvxMiPPvxNFgzJu/7
 63UQm73p2pnu/KdXLW1OQEcpZtZJ9+N/uBiq9zbVdX3A8T84ME0oMyy+EAQqCZdK
 CcsTXHCnAgmmXWJlu1JRdopr1bd38mSGB70eXduFtPqDdmtQRnoaCQ9e+tJDA4j4
 svEw0yDmsc4WG1EKLKKCRd3uFOZsng+lcXrHfpm5wlSPpCOItfQ9BzT3x1u6Y5JU
 Z1WMOMkkEce+95U7/RLoXwC/2RS3XelUXTde4cGIRMvO5drOrU58P0gdn3J+yKbv
 6v+2GGKy/39tdXUOxIl3EZT/huIl+h1UNO8C2hyaEwdXK+X1zl31/u6kk1Ns18Wr
 YPEJixxHx0zR8jaZgDC7OlWLuqn4Ay+Ls9yCyIesdHzKpizJKqn83PntYnpJmxoA
 9hlIyRDWnqH44KxzB85ni1C2Qudec3mcCWIWV7M+UoSC1Cgs/LxDzH7kRejR2ZIl
 vRhg5pqyr53L0h2lq5DO4Cj4UzbXb7YioKJRxjyKloNOlRrCZtK/VEsHbdsKEcIp
 d/wHj1AyFZeQfuhk8Qqr
 =mtuo
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2015-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
There isn't much left, but we have
 * new mac80211 internal software queue to allow drivers to have
   shorter hardware queues and pull on-demand
 * use rhashtable for mac80211 station table
 * minstrel rate control debug improvements and some refactoring
 * fix noisy message about TX power reduction
 * fix continuous message printing and activity if CRDA doesn't respond
 * fix VHT-related capabilities with "iw connect" or "iwconfig ..."
 * fix Kconfig for cfg80211 wireless extensions compatibility
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-12 20:43:46 -04:00
Al Viro
5d5d568975 make new_sync_{read,write}() static
All places outside of core VFS that checked ->read and ->write for being NULL or
called the methods directly are gone now, so NULL {read,write} with non-NULL
{read,write}_iter will do the right thing in all cases.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:29:40 -04:00
Al Viro
dcdbd7b269 net/9p: remove (now-)unused helpers
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:28:29 -04:00
Al Viro
21c9f5ccb1 p9_client_attach(): set fid->uid correctly
it's almost always equal to current_fsuid(), but there's an exception -
if the first writeback fid is opened by non-root *and* that happens before
root has done any lookups in /, we end up doing attach for root.  The
current code leaves the resulting FID owned by root from the server POV
and by non-root from the client one.  Unfortunately, it means that e.g.
massive dcache eviction will leave that user buggered - they'll end
up redoing walks from / *and* picking that FID every time.  As soon as
they try to create something, the things will get nasty.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:28:28 -04:00
Al Viro
e1200fe68f 9p: switch p9_client_read() to passing struct iov_iter *
... and make it loop

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:28:27 -04:00
Al Viro
070b3656cf 9p: switch p9_client_write() to passing it struct iov_iter *
... and make it loop until it's done

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:28:25 -04:00
Al Viro
4f3b35c157 net/9p: switch the guts of p9_client_{read,write}() to iov_iter
... and have get_user_pages_fast() mapping fewer pages than requested
to generate a short read/write.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 22:28:25 -04:00
Al Viro
c0fec3a98b Merge branch 'iocb' into for-next 2015-04-11 22:24:41 -04:00
Al Viro
01e97e6517 new helper: msg_data_left()
convert open-coded instances

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 15:53:35 -04:00
Al Viro
a2dd3793a1 Merge remote-tracking branch 'dh/afs' into for-davem 2015-04-11 15:51:09 -04:00
Al Viro
d8725c86ae get rid of the size argument of sock_sendmsg()
it's equal to iov_iter_count(&msg->msg_iter) in all cases

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 15:27:37 -04:00
Vlad Zolotarov
01a3d79681 if_link: Add an additional parameter to ifla_vf_info for RSS querying
Add configuration setting for drivers to allow/block an RSS Redirection
Table and a Hash Key querying for discrete VFs.

On some devices VF share the mentioned above information with PF and
querying it may adduce a theoretical security risk. We want to let a
system administrator to decide if he/she wants to take this risk or not.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-10 21:57:22 -07:00
Thomas Graf
78ebb0d00b rtnetlink: Mark name argument of rtnl_create_link() const
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-10 12:42:40 -07:00
David S. Miller
c3d0dac693 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2015-04-09

We've had enough new patches during the past week (especially from
Marcel) that it'd be good to still get these queued for 4.1.

The majority of the changes are from Marcel with lots of cleanup &
refactoring patches for the HCI UART driver. Marcel also split out some
Broadcom & Intel vendor specific functionality into two new btintel &
btbcm modules.

In addition to the HCI driver changes there's the completion of our
local OOB data interface for pairing, added support for requesting
remote LE features when connecting, as well as a couple of minor fixes
for mac802154.

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09 18:31:50 -04:00
Eric Dumazet
b52e69217b tcp: md5: fix a typo in tcp_v4_md5_lookup()
Lookup key for tcp_md5_do_lookup() has to be taken
from addr_sk, not sk (which can be the listener)

Fixes: fd3a154a00 ("tcp: md5: get rid of tcp_v[46]_reqsk_md5_lookup()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09 18:11:16 -04:00
Hubert Sokolowski
1e53d5bb88 net: Pass VLAN ID to rtnl_fdb_notify.
When an FDB entry is added or deleted the information about VLAN
is not passed to listening applications like 'bridge monitor fdb'.
With this patch VLAN ID is passed if it was set in the original
netlink message.

Also remove an unused bdev variable.

Signed-off-by: Hubert Sokolowski <hubert.sokolowski@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09 17:30:58 -04:00
Eric Dumazet
b50edd7812 tcp: tcp_make_synack() should clear skb->tstamp
I noticed tcpdump was giving funky timestamps for locally
generated SYNACK messages on loopback interface.

11:42:46.938990 IP 127.0.0.1.48245 > 127.0.0.2.23850: S
945476042:945476042(0) win 43690 <mss 65495,nop,nop,sackOK,nop,wscale 7>

20:28:58.502209 IP 127.0.0.2.23850 > 127.0.0.1.48245: S
3160535375:3160535375(0) ack 945476043 win 43690 <mss
65495,nop,nop,sackOK,nop,wscale 7>

This is because we need to clear skb->tstamp before
entering lower stack, otherwise net_timestamp_check()
does not set skb->tstamp.

Fixes: 7faee5c0d5 ("tcp: remove TCP_SKB_CB(skb)->when")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09 17:28:57 -04:00
Jesse Gross
b736a623bd udptunnels: Call handle_offloads after inserting vlan tag.
handle_offloads() calls skb_reset_inner_headers() to store
the layer pointers to the encapsulated packet. However, we
currently push the vlag tag (if there is one) onto the packet
afterwards. This changes the MAC header for the encapsulated
packet but it is not reflected in skb->inner_mac_header, which
breaks GSO and drivers which attempt to use this for encapsulation
offloads.

Fixes: 1eaa8178 ("vxlan: Add tx-vlan offload support.")
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09 14:56:32 -04:00
David S. Miller
ca69d7102f Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next tree.
They are:

* nf_tables set timeout infrastructure from Patrick Mchardy.

1) Add support for set timeout support.

2) Add support for set element timeouts using the new set extension
   infrastructure.

4) Add garbage collection helper functions to get rid of stale elements.
   Elements are accumulated in a batch that are asynchronously released
   via RCU when the batch is full.

5) Add garbage collection synchronization helpers. This introduces a new
   element busy bit to address concurrent access from the netlink API and the
   garbage collector.

5) Add timeout support for the nft_hash set implementation. The garbage
   collector peridically checks for stale elements from the workqueue.

* iptables/nftables cgroup fixes:

6) Ignore non full-socket objects from the input path, otherwise cgroup
   match may crash, from Daniel Borkmann.

7) Fix cgroup in nf_tables.

8) Save some cycles from xt_socket by skipping packet header parsing when
   skb->sk is already set because of early demux. Also from Daniel.

* br_netfilter updates from Florian Westphal.

9) Save frag_max_size and restore it from the forward path too.

10) Use a per-cpu area to restore the original source MAC address when traffic
    is DNAT'ed.

11) Add helper functions to access physical devices.

12) Use these new physdev helper function from xt_physdev.

13) Add another nf_bridge_info_get() helper function to fetch the br_netfilter
    state information.

14) Annotate original layer 2 protocol number in nf_bridge info, instead of
    using kludgy flags.

15) Also annotate the pkttype mangling when the packet travels back and forth
    from the IP to the bridge layer, instead of using a flag.

* More nf_tables set enhancement from Patrick:

16) Fix possible usage of set variant that doesn't support timeouts.

17) Avoid spurious "set is full" errors from Netlink API when there are pending
    stale elements scheduled to be released.

18) Restrict loop checks to set maps.

19) Add support for dynamic set updates from the packet path.

20) Add support to store optional user data (eg. comments) per set element.

BTW, I have also pulled net-next into nf-next to anticipate the conflict
resolution between your okfn() signature changes and Florian's br_netfilter
updates.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09 14:46:04 -04:00
David S. Miller
9399bdcbb5 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2015-04-09

1) Prohibit the use/abuse of the xfrm netlink interface on
   32/64 bit compatibility tasks. We need a full compat
   layer before we can allow this. From Fan Du.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09 14:41:47 -04:00
David S. Miller
634d8ee4de Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2015-04-09

1) We dereferenced the xfrm outer_mode too early, larval
   SAs don't have it set. Move the dereference of the
   outer mode below the larval SA check to fix it.
   From Alexey Dobriyan.

2) Fix vti6 tunnel uninit on namespace crosssing.
   From Yao Xiwei.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09 14:39:37 -04:00
Marcel Holtmann
0fe29fd1cd Bluetooth: Read LE remote features during connection establishment
When establishing a Bluetooth LE connection, read the remote used
features mask to determine which features are supported. This was
not really needed with Bluetooth 4.0, but since Bluetooth 4.1 and
also 4.2 have introduced new optional features, this becomes more
important.

This works the same as with BR/EDR where the connection enters the
BT_CONFIG stage and hci_connect_cfm call is delayed until the remote
features have been retrieved. Only after successfully receiving the
remote features, the connection enters the BT_CONNECTED state.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-04-09 08:36:54 +03:00
Al Viro
6aa248145a switch kernel_sendmsg() and kernel_recvmsg() to iov_iter_kvec()
For kernel_sendmsg() that eliminates the need to play with setfs();
for kernel_recvmsg() it does *not* - a couple of callers are using
it with non-NULL ->msg_control, which would be treated as userland
address on recvmsg side of things.

In all cases we are really setting a kvec-backed iov_iter, though.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-09 00:02:34 -04:00
Al Viro
da18428498 net: switch importing msghdr from userland to {compat_,}import_iovec()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-09 00:02:26 -04:00
Al Viro
602bd0e90e net: switch sendto() and recvfrom() to import_single_range()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-09 00:02:21 -04:00
Al Viro
237dae8890 Merge branch 'iocb' into for-davem
trivial conflict in net/socket.c and non-trivial one in crypto -
that one had evaded aio_complete() removal.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-09 00:01:38 -04:00
Eric Dumazet
dd929c1b3d tcp: do not rearm rsk_timer on FastOpen requests
FastOpen requests are not like other regular request sockets.

They do not yet use rsk_timer : tcp_fastopen_queue_check()
simply manually removes one expired request from fastopenq->rskq_rst
list.

Therefore, tcp_check_req() must not call mod_timer_pending(),
otherwise we crash because rsk_timer was not initialized.

Fixes: fa76ce7328 ("inet: get rid of central tcp/dccp listener timer")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08 20:19:42 -04:00
David S. Miller
82740b9ac2 NFC: 4.1 pull request
This is the NFC pull request for 4.1.
 
 This is a shorter one than usual, as the Intel Field Peak NFC
 driver could not make it in time.
 
 We have:
 
 - A new driver for NXP NCI based chipsets, like e.g. the NPC100 or
   the PN7150. It currently only supports an i2c physical layer, but
   could easily be extended to work on top of e.g. SPI.
   This driver also includes support for user space triggered firmware
   updates.
 
 - A few minor st21nfc[ab] fixes, cleanups, and comments improvements.
 
 - A pn533 error return fix.
 
 - A few NFC related logs formatting cleanups.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVJV81AAoJEIqAPN1PVmxKM3sP/R6fKXMxtVxXsiPzBMk+SpBI
 onNXbCx27rp1lHTFznE16JAaaSfcQEwSQmYx7xa6KXqRYVlDfRfC+5R+rPGrCjfH
 kV/eLBNnYpmPw0ViVT7dsWK0b4me0k5pr9ki9mze3YxvuMbA5vdv0tvuRFz5IRF/
 hl+WI5pntGuRtnIyBKIauAMylylUYVvCBGhmHnveiX0Dp8noJLBSV0wvdzujm51S
 +Uio/jHlUEV2lotrQBOfWNtEkwonXSwzZWSzimBCyEGLAwTx4lGXHmQftOtPd3zE
 sOT7Gw77ZCsulHoHiJyC0KpDSDS0NYVrtTI5BiuGgAivGi0YZw2XD3CYRIdy0m2Z
 lMoodYdiCgsxmrU6I6Rd/7DQBxD0Vhc+CGAyk41f7AwU4Fwq105kpSLupTtU+NzT
 ZpdvSXeirU8sxt+3WDOgv8esyYGZVVD8GuBbofMZQZ0vLq+FcDpYAW4w3LKvvi6X
 C+WN8f7SCI0kqpd4leyl6EG3SoQKFyPWobu0IlV520R9b76iBcyqooTIpvVa4Yk7
 az6fKhi9gK/T6FHW78y6fnkczd47JKrC924m+g6P/hhTD5zQ956ferp0uFTjkPtF
 8kNgRT7OIRi1JO783cnQx1uA61axC3GX1HFzsD9paVkTwRNtJ3eFsxtcYt7Nfv8D
 WGNWRQp5LmBsD/SqFBfk
 =MzOJ
 -----END PGP SIGNATURE-----

Merge tag 'nfc-next-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next

Samuel Ortiz says:

====================
NFC: 4.1 pull request

This is the NFC pull request for 4.1.

This is a shorter one than usual, as the Intel Field Peak NFC
driver could not make it in time.

We have:

- A new driver for NXP NCI based chipsets, like e.g. the NPC100 or
  the PN7150. It currently only supports an i2c physical layer, but
  could easily be extended to work on top of e.g. SPI.
  This driver also includes support for user space triggered firmware
  updates.

- A few minor st21nfc[ab] fixes, cleanups, and comments improvements.

- A pn533 error return fix.

- A few NFC related logs formatting cleanups.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08 17:12:50 -04:00
Andi Kleen
5eeb292215 fou: Don't use const __read_mostly
const __read_mostly is a senseless combination. If something
is already const it cannot be __read_mostly. Remove the bogus
__read_mostly in the fou driver.

This fixes section conflicts with LTO.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08 15:29:08 -04:00
David Miller
c1f8667677 netfilter: Fix switch statement warnings with recent gcc.
More recent GCC warns about two kinds of switch statement uses:

1) Switching on an enumeration, but not having an explicit case
   statement for all members of the enumeration.  To show the
   compiler this is intentional, we simply add a default case
   with nothing more than a break statement.

2) Switching on a boolean value.  I think this warning is dumb
   but nevertheless you get it wholesale with -Wswitch.

This patch cures all such warnings in netfilter.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 15:20:50 -04:00
Sowmini Varadhan
443be0e5af RDS: make sure not to loop forever inside rds_send_xmit
If a determined set of concurrent senders keep the send queue full,
we can loop forever inside rds_send_xmit.  This fix has two parts.

First we are dropping out of the while(1) loop after we've processed a
large batch of messages.

Second we add a generation number that gets bumped each time the
xmit bit lock is acquired.  If someone else has jumped in and
made progress in the queue, we skip our goto restart.

Original patch by Chris Mason.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08 15:17:32 -04:00
Sowmini Varadhan
1789b2c077 RDS: only use passive connections when addresses match
Passive connections were added for the case where one loopback IB
connection between identical addresses needs another connection to store
the second QP.  Unfortunately, they were also created in the case where
the addesses differ and we already have both QPs.

This lead to a message reordering bug.

- two different IB interfaces and addresses on a machine: A B
- traffic is sent from A to B
- connection from A-B is created, connect request sent
- listening accepts connect request, B-A is created
- traffic flows, next_rx is incremented
- unacked messages exist on the retrans list
- connection A-B is shut down, new connect request sent
- listen sees existing loopback B-A, creates new passive B-A
- retrans messages are sent and delivered because of 0 next_rx

The problem is that the second connection request saw the previously
existing parent connection.  Instead of using it, and using the existing
next_rx_seq state for the traffic between those IPs, it mistakenly
thought that it had to create a passive connection.

We fix this by only using passive connections in the special case where
laddr and faddr match.  In this case we'll only ever have one parent
sending connection requests and one passive connection created as the
listening path sees the existing parent connection which initiated the
request.

Original patch by Zach Brown

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08 15:17:23 -04:00
Pablo Neira Ayuso
aadd51aa71 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Resolve conflicts between 5888b93 ("Merge branch 'nf-hook-compress'") and
Florian Westphal br_netfilter works.

Conflicts:
        net/bridge/br_netfilter.c

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 18:30:21 +02:00
Hannes Frederic Sowa
1b11287118 ipv6: call iptunnel_xmit with NULL sock pointer if no tunnel sock is available
Fixes: 79b16aadea ("udp_tunnel: Pass UDP socket down through udp_tunnel{, 6}_xmit_skb().")
Reported-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08 12:09:43 -04:00
Hannes Frederic Sowa
926a882f69 ipv4: ip_tunnel: use net namespace from rtable not socket
The socket parameter might legally be NULL, thus sock_net is sometimes
causing a NULL pointer dereference. Using net_device pointer in dst_entry
is more reliable.

Fixes: b6a7719aed ("ipv4: hash net ptr into fragmentation bucket selection")
Reported-by: Rick Jones <rick.jones2@hp.com>
Cc: Rick Jones <rick.jones2@hp.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08 12:09:42 -04:00
Patrick McHardy
68e942e88a netfilter: nf_tables: support optional userdata for set elements
Add an userdata set extension and allow the user to attach arbitrary
data to set elements. This is intended to hold TLV encoded data like
comments or DNS annotations that have no meaning to the kernel.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:58:27 +02:00
Patrick McHardy
22fe54d5fe netfilter: nf_tables: add support for dynamic set updates
Add a new "dynset" expression for dynamic set updates.

A new set op ->update() is added which, for non existant elements,
invokes an initialization callback and inserts the new element.
For both new or existing elements the extenstion pointer is returned
to the caller to optionally perform timer updates or other actions.

Element removal is not supported so far, however that seems to be a
rather exotic need and can be added later on.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:58:27 +02:00
Patrick McHardy
11113e190b netfilter: nf_tables: support different set binding types
Currently a set binding is assumed to be related to a lookup and, in
case of maps, a data load.

In order to use bindings for set updates, the loop detection checks
must be restricted to map operations only. Add a flags member to the
binding struct to hold the set "action" flags such as NFT_SET_MAP,
and perform loop detection based on these.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:58:27 +02:00
Patrick McHardy
3dd0673ac3 netfilter: nf_tables: prepare set element accounting for async updates
Use atomic operations for the element count to avoid races with async
updates.

To properly handle the transactional semantics during netlink updates,
deleted but not yet committed elements are accounted for seperately and
are treated as being already removed. This means for the duration of
a netlink transaction, the limit might be exceeded by the amount of
elements deleted. Set implementations must be prepared to handle this.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:58:27 +02:00
Patrick McHardy
4a8678efbe netfilter: nf_tables: fix set selection when timeouts are requested
The NFT_SET_TIMEOUT flag is ignore in nft_select_set_ops, which may
lead to selection of a set implementation that doesn't actually
support timeouts.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:58:26 +02:00
Florian Westphal
a1e67951e6 netfilter: bridge: make BRNF_PKT_TYPE flag a bool
nf_bridge_info->mask is used for several things, for example to
remember if skb->pkt_type was set to OTHER_HOST.

For a bridge, OTHER_HOST is expected case. For ip forward its a non-starter
though -- routing expects PACKET_HOST.

Bridge netfilter thus changes OTHER_HOST to PACKET_HOST before hook
invocation and then un-does it after hook traversal.

This information is irrelevant outside of br_netfilter.

After this change, ->mask now only contains flags that need to be
known outside of br_netfilter in fast-path.

Future patch changes mask into a 2bit state field in sk_buff, so that
we can remove skb->nf_bridge pointer for good and consider all remaining
places that access nf_bridge info content a not-so fastpath.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:49:12 +02:00
Florian Westphal
3eaf402502 netfilter: bridge: start splitting mask into public/private chunks
->mask is a bit info field that mixes various use cases.

In particular, we have flags that are mutually exlusive, and flags that
are only used within br_netfilter while others need to be exposed to
other parts of the kernel.

Remove BRNF_8021Q/PPPoE flags.  They're mutually exclusive and only
needed within br_netfilter context.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:49:11 +02:00
Florian Westphal
383307838d netfilter: bridge: add and use nf_bridge_info_get helper
Don't access skb->nf_bridge directly, this pointer will be removed soon.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:49:10 +02:00
Florian Westphal
a99074ae1f netfilter: physdev: use helpers
Avoid skb->nf_bridge accesses where possible.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:49:09 +02:00
Florian Westphal
c737b7c451 netfilter: bridge: add helpers for fetching physin/outdev
right now we store this in the nf_bridge_info struct, accessible
via skb->nf_bridge.  This patch prepares removal of this pointer from skb:

Instead of using skb->nf_bridge->x, we use helpers to obtain the in/out
device (or ifindexes).

Followup patches to netfilter will then allow nf_bridge_info to be
obtained by a call into the br_netfilter core, rather than keeping a
pointer to it in sk_buff.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:49:08 +02:00
Florian Westphal
e70deecbf8 netfilter: bridge: don't use nf_bridge_info data to store mac header
br_netfilter maintains an extra state, nf_bridge_info, which is attached
to skb via skb->nf_bridge pointer.

Amongst other things we use skb->nf_bridge->data to store the original
mac header for every processed skb.

This is required for ip refragmentation when using conntrack
on top of bridge, because ip_fragment doesn't copy it from original skb.

However there is no need anymore to do this unconditionally.

Move this to the one place where its needed -- when br_netfilter calls
ip_fragment().

Also switch to percpu storage for this so we can handle fragmenting
without accessing nf_bridge meta data.

Only user left is neigh resolution when DNAT is detected, to hold
the original source mac address (neigh resolution builds new mac header
using bridge mac), so rename ->data and reduce its size to whats needed.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:49:07 +02:00
Daniel Borkmann
d64d80a2cd netfilter: x_tables: don't extract flow keys on early demuxed sks in socket match
Currently in xt_socket, we take advantage of early demuxed sockets
since commit 00028aa370 ("netfilter: xt_socket: use IP early demux")
in order to avoid a second socket lookup in the fast path, but we
only make partial use of this:

We still unnecessarily parse headers, extract proto, {s,d}addr and
{s,d}ports from the skb data, accessing possible conntrack information,
etc even though we were not even calling into the socket lookup via
xt_socket_get_sock_{v4,v6}() due to skb->sk hit, meaning those cycles
can be spared.

After this patch, we only proceed the slower, manual lookup path
when we have a skb->sk miss, thus time to match verdict for early
demuxed sockets will improve further, which might be i.e. interesting
for use cases such as mentioned in 681f130f39 ("netfilter: xt_socket:
add XT_SOCKET_NOWILDCARD flag").

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:47:49 +02:00
Johannes Berg
6d00ec0514 cfg80211: don't allow disabling WEXT if it's required
The change to only export WEXT symbols when required could break
the build if CONFIG_CFG80211_WEXT was explicitly disabled while
a driver like orinoco selected it.

Fix this by hiding the symbol when it's required so it can't be
disabled in that case.

Fixes: 2afe38d15c ("cfg80211-wext: export symbols only when needed")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Jim Davis <jim.epost@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-04-08 09:19:29 +02:00
Sheng Yong
8bc0034cf6 net: remove extra newlines
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-07 22:24:37 -04:00
Daniel Lee
2646c831c0 tcp: RFC7413 option support for Fast Open client
Fast Open has been using an experimental option with a magic number
(RFC6994). This patch makes the client by default use the RFC7413
option (34) to get and send Fast Open cookies.  This patch makes
the client solicit cookies from a given server first with the
RFC7413 option. If that fails to elicit a cookie, then it tries
the RFC6994 experimental option. If that also fails, it uses the
RFC7413 option on all subsequent connect attempts.  If the server
returns a Fast Open cookie then the client caches the form of the
option that successfully elicited a cookie, and uses that form on
later connects when it presents that cookie.

The idea is to gradually obsolete the use of experimental options as
the servers and clients upgrade, while keeping the interoperability
meanwhile.

Signed-off-by: Daniel Lee <Longinus00@gmail.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-07 18:36:39 -04:00
Daniel Lee
7f9b838b71 tcp: RFC7413 option support for Fast Open server
Fast Open has been using the experimental option with a magic number
(RFC6994) to request and grant Fast Open cookies. This patch enables
the server to support the official IANA option 34 in RFC7413 in
addition.

The change has passed all existing Fast Open tests with both
old and new options at Google.

Signed-off-by: Daniel Lee <Longinus00@gmail.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-07 18:36:39 -04:00
Beshay, Joseph
0ad2a83659 netem: Fixes byte backlog accounting for the first of two chained netem instances
Fixes byte backlog accounting for the first of two chained netem instances.
Bytes backlog reported now corresponds to the number of queued packets.

When two netem instances are chained, for instance to apply rate and queue
limitation followed by packet delay, the number of backlogged bytes reported
by the first netem instance is wrong. It reports the sum of bytes in the queues
of the first and second netem. The first netem reports the correct number of
backlogged packets but not bytes. This is shown in the example below.

Consider a chain of two netem schedulers created using the following commands:

$ tc -s qdisc replace dev veth2 root handle 1:0 netem rate 10000kbit limit 100
$ tc -s qdisc add dev veth2 parent 1:0 handle 2: netem delay 50ms

Start an iperf session to send packets out on the specified interface and
monitor the backlog using tc:

$ tc -s qdisc show dev veth2

Output using unpatched netem:
	qdisc netem 1: root refcnt 2 limit 100 rate 10000Kbit
	 Sent 98422639 bytes 65434 pkt (dropped 123, overlimits 0 requeues 0)
	 backlog 172694b 73p requeues 0
	qdisc netem 2: parent 1: limit 1000 delay 50.0ms
	 Sent 98422639 bytes 65434 pkt (dropped 0, overlimits 0 requeues 0)
	 backlog 63588b 42p requeues 0

The interface used to produce this output has an MTU of 1500. The output for
backlogged bytes behind netem 1 is 172694b. This value is not correct. Consider
the total number of sent bytes and packets. By dividing the number of sent
bytes by the number of sent packets, we get an average packet size of ~=1504.
If we divide the number of backlogged bytes by packets, we get ~=2365. This is
due to the first netem incorrectly counting the 63588b which are in netem 2's
queue as being in its own queue. To verify this is the case, we subtract them
from the reported value and divide by the number of packets as follows:
	172694 - 63588 = 109106 bytes actualled backlogged in netem 1
	109106 / 73 packets ~= 1494 bytes (which matches our MTU)

The root cause is that the byte accounting is not done at the
same time with packet accounting. The solution is to update the backlog value
every time the packet queue is updated.

Signed-off-by: Joseph D Beshay <joseph.beshay@utdallas.edu>
Acked-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-07 18:34:24 -04:00
Johan Hedberg
40f66c05c3 Bluetooth: Add local SSP OOB data to OOB ext data mgmt command
The Read Local Out Of Band Extended Data mgmt command is specified to
return the SSP values when given a BR/EDR address type as input
parameter. The returned values may include either the 192-bit variants
of C and R, or their 256-bit variants, or both, depending on the status
of Secure Connections and Secure Connections Only modes. If SSP is not
enabled the command will only return the Class of Device value (like it
has done so far).

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-04-07 23:31:20 +02:00
Nicolas Dichtel
a143c40c32 netns: allow to dump netns ids
Which this patch, it's possible to dump the list of ids allocated for peer
netns.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-07 17:29:41 -04:00
Nicolas Dichtel
9a9634545c netns: notify netns id events
With this patch, netns ids that are created and deleted are advertised into the
group RTNLGRP_NSID.

Because callers of rtnl_net_notifyid() already know the id of the peer, there is
no need to call __peernet2id() in rtnl_net_fill().

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-07 17:29:41 -04:00
Nicolas Dichtel
b111e4e111 netns: minor cleanup in rtnl_net_getid()
No need to initialize err, it will be overridden by the value of nlmsg_parse().

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-07 17:29:41 -04:00
David Miller
79b16aadea udp_tunnel: Pass UDP socket down through udp_tunnel{, 6}_xmit_skb().
That was we can make sure the output path of ipv4/ipv6 operate on
the UDP socket rather than whatever random thing happens to be in
skb->sk.

Based upon a patch by Jiri Pirko.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
2015-04-07 15:29:08 -04:00
David Miller
7026b1ddb6 netfilter: Pass socket pointer down through okfn().
On the output paths in particular, we have to sometimes deal with two
socket contexts.  First, and usually skb->sk, is the local socket that
generated the frame.

And second, is potentially the socket used to control a tunneling
socket, such as one the encapsulates using UDP.

We do not want to disassociate skb->sk when encapsulating in order
to fix this, because that would break socket memory accounting.

The most extreme case where this can cause huge problems is an
AF_PACKET socket transmitting over a vxlan device.  We hit code
paths doing checks that assume they are dealing with an ipv4
socket, but are actually operating upon the AF_PACKET one.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-07 15:25:55 -04:00
David Miller
1c984f8a5d netfilter: Add socket pointer to nf_hook_state.
It is currently always set to NULL, but nf_queue is adjusted to be
prepared for it being set to a real socket by taking and releasing a
reference to that socket when necessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-07 15:25:55 -04:00
Marcel Holtmann
2d7cc19eeb Bluetooth: Remove hci_recv_stream_fragment function
The hci_recv_stream_fragment function should have never been introduced
in the first place. The Bluetooth core does not need to know anything
about the HCI transport protocol.

With all transport protocol specific detailed moved back into the
drivers where they belong (mainly generic USB and UART drivers), this
function can now be removed.

This reduces the size of hci_dev structure and also removes an exported
symbol from the Bluetooth core module.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-04-07 18:47:10 +02:00
Marcel Holtmann
5c7d2dd285 Bluetooth: Make data pointer of hci_recv_stream_fragment const
The data pointer provided to hci_recv_stream_fragment function should
have been marked const. The function has no business in modifying the
original data. So fix this now.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-04-07 18:47:09 +02:00
Steven Rostedt (Red Hat)
1bc1e4d048 mac80211: Move message tracepoints to their own header
Every tracing file must have its own TRACE_SYSTEM defined.
The mac80211 tracepoint header broke this and add in the middle
of the file had:

 #undef TRACE_SYSTEM
 #define TRACE_SYSTEM mac80211_msg

Unfortunately, this broke new code in the ftrace infrastructure.
Moving the mac80211_msg into its own trace file with its own
TRACE_SYSTEM defined fixes the issue.

Link: http://lkml.kernel.org/r/1428389938.1841.1.camel@sipsolutions.net

Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-04-07 12:32:09 -04:00
Ilya Dryomov
6d7fdb0ab3 Revert "libceph: use memalloc flags for net IO"
This reverts commit 89baaa570a.

Dirty page throttling should be sufficient for us in the general case
so there is no need to use __GFP_MEMALLOC - it would be needed only in
the swap-over-rbd case, which we currently don't support.  (It would
probably take approximately the commit that is being reverted to add
that support, but we would also need the "swap" option to distinguish
from the general case and make sure swap ceph_client-s aren't shared
with anything else.)  See ceph-devel threads [1] and [2] for the
details of why enabling pfmemalloc reserves for all cases is a bad
thing.

On top of potential system lockups related to drained emergency
reserves, this turned out to cause ceph lockups in case peers are on
the same host and communicating via loopback due to sk_filter()
dropping pfmemalloc skbs on the receiving side because the receiving
loopback socket is not tagged with SOCK_MEMALLOC.

[1] "SOCK_MEMALLOC vs loopback"
    http://www.spinics.net/lists/ceph-devel/msg22998.html
[2] "[PATCH] libceph: don't set memalloc flags in loopback case"
    http://www.spinics.net/lists/ceph-devel/msg23392.html

Conflicts:
	net/ceph/messenger.c [ context: tcp_nodelay option ]

Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Sage Weil <sage@redhat.com>
Cc: stable@vger.kernel.org # 3.18+, needs backporting
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Mel Gorman <mgorman@suse.de>
2015-04-07 19:08:35 +03:00
David S. Miller
7abccdba25 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:

====================
pull request: bluetooth-next 2015-04-04

Here's what's probably the last bluetooth-next pull request for 4.1:

 - Fixes for LE advertising data & advertising parameters
 - Fix for race condition with HCI_RESET flag
 - New BNEPGETSUPPFEAT ioctl, needed for certification
 - New HCI request callback type to get the resulting skb
 - Cleanups to use BIT() macro wherever possible
 - Consolidate Broadcom device entries in the btusb HCI driver
 - Check for valid flags in CMTP, HIDP & BNEP
 - Disallow local privacy & OOB data combo to prevent a potential race
 - Expose SMP & ECDH selftest results through debugfs
 - Expose current Device ID info through debugfs

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-07 11:47:52 -04:00
Johannes Berg
46b9d18014 cfg80211: send extended capabilities IE in connect
If the connect request from userspace didn't include an extended
capabilities IE, create one using the driver capabilities. This
fixes VHT associations, since those need to set the operating mode
notification capability.

Reviewed-by: Gregory Greenman <gregory.greenman@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-04-07 13:56:45 +02:00
Johannes Berg
29464ccc78 cfg80211: move IE split utilities here from mac80211
As the next patch will require the IE splitting utility functions
in cfg80211, move them there from mac80211.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-04-07 13:56:41 +02:00
Yao Xiwei
092a29a40b vti6: fix uninit when using x-netns
When the kernel deleted a vti6 interface, this interface was not removed from
the tunnels list. Thus, when the ip6_vti module was removed, this old interface
was found and the kernel tried to delete it again. This was leading to a kernel
panic.

Fixes: 61220ab349 ("vti6: Enable namespace changing")
Signed-off-by: Yao Xiwei <xiwei.yao@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2015-04-07 07:52:28 +02:00
Alexey Dobriyan
68c11e98ef xfrm: fix xfrm_input/xfrm_tunnel_check oops
https://bugzilla.kernel.org/show_bug.cgi?id=95211

Commit 70be6c91c8
("xfrm: Add xfrm_tunnel_skb_cb to the skb common buffer") added check
which dereferences ->outer_mode too early but larval SAs don't have
this pointer set (yet). So check for tunnel stuff later.

Mike Noordermeer reported this bug and patiently applied all the debugging.

Technically this is remote-oops-in-interrupt-context type of thing.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000034
IP: [<ffffffff8150dca2>] xfrm_input+0x3c2/0x5a0
	...
[<ffffffff81500fc6>] ? xfrm4_esp_rcv+0x36/0x70
[<ffffffff814acc9a>] ? ip_local_deliver_finish+0x9a/0x200
[<ffffffff81471b83>] ? __netif_receive_skb_core+0x6f3/0x8f0
	...

RIP  [<ffffffff8150dca2>] xfrm_input+0x3c2/0x5a0
Kernel panic - not syncing: Fatal exception in interrupt

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2015-04-07 07:52:27 +02:00
David S. Miller
c85d6975ef Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/mellanox/mlx4/cmd.c
	net/core/fib_rules.c
	net/ipv4/fib_frontend.c

The fib_rules.c and fib_frontend.c conflicts were locking adjustments
in 'net' overlapping addition and removal of code in 'net-next'.

The mlx4 conflict was a bug fix in 'net' happening in the same
place a constant was being replaced with a more suitable macro.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-06 22:34:15 -04:00
Pavel Nakonechny
303038135a net: dsa: fix filling routing table from OF description
According to description in 'include/net/dsa.h', in cascade switches
configurations where there are more than one interconnected devices,
'rtable' array in 'dsa_chip_data' structure is used to indicate which
port on this switch should be used to send packets to that are destined
for corresponding switch.

However, dsa_of_setup_routing_table() fills 'rtable' with port numbers
of the _target_ switch, but not current one.

This commit removes redundant devicetree parsing and adds needed port
number as a function argument. So dsa_of_setup_routing_table() now just
looks for target switch number by parsing parent of 'link' device node.

To remove possible misunderstandings with the way of determining target
switch number, a corresponding comment was added to the source code and
to the DSA device tree bindings documentation file.

This was tested on a custom board with two Marvell 88E6095 switches with
following corresponding routing tables: { -1, 10 } and { 8, -1 }.

Signed-off-by: Pavel Nakonechny <pavel.nakonechny@skitlab.ru>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-06 17:31:37 -04:00
WANG Cong
67e04c29ec l2tp: unregister l2tp_net_ops on failure path
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-06 16:53:02 -04:00
Alexei Starovoitov
91bc4822c3 tc: bpf: add checksum helpers
Commit 608cd71a9c ("tc: bpf: generalize pedit action") has added the
possibility to mangle packet data to BPF programs in the tc pipeline.
This patch adds two helpers bpf_l3_csum_replace() and bpf_l4_csum_replace()
for fixing up the protocol checksums after the packet mangling.

It also adds 'flags' argument to bpf_skb_store_bytes() helper to avoid
unnecessary checksum recomputations when BPF programs adjusting l3/l4
checksums and documents all three helpers in uapi header.

Moreover, a sample program is added to show how BPF programs can make use
of the mangle and csum helpers.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-06 16:42:35 -04:00
hannes@stressinduktion.org
f60e5990d9 ipv6: protect skb->sk accesses from recursive dereference inside the stack
We should not consult skb->sk for output decisions in xmit recursion
levels > 0 in the stack. Otherwise local socket settings could influence
the result of e.g. tunnel encapsulation process.

ipv6 does not conform with this in three places:

1) ip6_fragment: we do consult ipv6_npinfo for frag_size

2) sk_mc_loop in ipv6 uses skb->sk and checks if we should
   loop the packet back to the local socket

3) ip6_skb_dst_mtu could query the settings from the user socket and
   force a wrong MTU

Furthermore:
In sk_mc_loop we could potentially land in WARN_ON(1) if we use a
PF_PACKET socket ontop of an IPv6-backed vxlan device.

Reuse xmit_recursion as we are currently only interested in protecting
tunnel devices.

Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-06 16:12:49 -04:00
David S. Miller
b85c3dc9bd netfilter: Pass nf_hook_state through arpt_do_table().
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-04 13:26:52 -04:00
David S. Miller
073bfd5686 netfilter: Pass nf_hook_state through nft_set_pktinfo*().
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-04 12:54:27 -04:00
David S. Miller
8f8a37152d netfilter: Pass nf_hook_state through ip6t_do_table().
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-04 12:52:06 -04:00
David S. Miller
8fe22382d1 netfilter: Pass nf_hook_state through nf_nat_ipv6_{in,out,fn,local_fn}().
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-04 12:48:08 -04:00
David S. Miller
1c491ba259 netfilter: Pass nf_hook_state through ipt_do_table().
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-04 12:47:04 -04:00
David S. Miller
d7cf4081ed netfilter: Pass nf_hook_state through nf_nat_ipv4_{in,out,fn,local_fn}().
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-04 12:45:19 -04:00
David S. Miller
238e54c9cb netfilter: Make nf_hookfn use nf_hook_state.
Pass the nf_hook_state all the way down into the hook
functions themselves.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-04 12:31:38 -04:00
David S. Miller
1d1de89b9a netfilter: Use nf_hook_state in nf_queue_entry.
That way we don't have to reinstantiate another nf_hook_state
on the stack of the nf_reinject() path.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-04 12:25:22 -04:00
David S. Miller
cfdfab3146 netfilter: Create and use nf_hook_state.
Instead of passing a large number of arguments down into the nf_hook()
entry points, create a structure which carries this state down through
the hook processing layers.

This makes is so that if we want to change the types or signatures of
any of these pieces of state, there are less places that need to be
changed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-04 12:17:40 -04:00
Marcel Holtmann
38c8af6004 Bluetooth: Fix location of TX power field in LE advertising data
The TX power field in the LE advertising data should be placed last
since it needs to be possible to enable kernel controlled TX power,
but still allow for userspace provided flags field.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-04-04 08:50:20 +03:00
Marcel Holtmann
fd6413d882 Bluetooth: hidp: Use BIT(x) instead of (1 << x)
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-04-04 08:50:20 +03:00
Marcel Holtmann
b2ddeb1173 Bluetooth: cmtp: Use BIT(x) instead of (1 << x)
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-04-04 08:50:20 +03:00
Grzegorz Kolodziejczyk
836a061b19 Bluetooth: bnep: Handle BNEP connection setup request
With this patch kernel will be able to handle setup request. This is
needed if we would like to handle control mesages with extension
headers. User space will be only resposible for reading setup data and
checking if scenario is conformance to specification (dst and src device
bnep role). In case of new user space, setup data must be leaved(peek
msg) on queue. New bnep session will be responsible for handling this
data.

Signed-off-by: Grzegorz Kolodziejczyk <grzegorz.kolodziejczyk@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-04-03 23:21:34 +02:00
Grzegorz Kolodziejczyk
bf8b9a9cb7 Bluetooth: bnep: Add support to extended headers of control frames
Handling extended headers of control frames is required BNEP
functionality. This patch refractor bnep rx frame handling function.
Extended header for control frames shouldn't be omitted as it was
previously done. Every control frame should be checked if it contains
extended header and then every extension should be parsed separately.

Signed-off-by: Grzegorz Kolodziejczyk <grzegorz.kolodziejczyk@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-04-03 23:21:34 +02:00
Grzegorz Kolodziejczyk
0477e2e868 Bluetooth: bnep: Add support for get bnep features via ioctl
This is needed if user space wants to know supported bnep features
by kernel, e.g. if kernel supports sending response to bnep setup
control message. By now there is no possibility to know supported
features by kernel in case of bnep. Ioctls allows only to add connection,
delete connection, get connection list, get connection info. Adding
connection if it's possible (establishing network device connection) is
equivalent to starting bnep session. Bnep session handles data queue of
transmit, receive messages over bnep channel. It means that if we add
connection the received/transmitted data will be parsed immediately. In
case of get bnep features we want to know before session start, if we
should leave setup data on socket queue and let kernel to handle with it,
or in case of no setup handling support, if we should pull this message
and handle setup response within user space.

Signed-off-by: Grzegorz Kolodziejczyk <grzegorz.kolodziejczyk@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-04-03 23:21:34 +02:00
Daniel Borkmann
bcad571824 ebpf: add skb->priority to offset map for usage in {cls, act}_bpf
This adds the ability to read out the skb->priority from an eBPF
program, so that it can be taken into account from a tc filter
or action for the use-case where the priority is not being used
to directly override the filter classification in a qdisc, but
to tag traffic otherwise for the classifier; the priority can be
assigned from various places incl. user space, in future we may
also mangle it from an eBPF program.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-03 14:59:15 -04:00
Grzegorz Kolodziejczyk
e0fdbab169 Bluetooth: bnep: Return err value while sending cmd is not understood
Send command not understood response should be verified if it was
successfully sent, like all send responses.

Signed-off-by: Grzegorz Kolodziejczyk <grzegorz.kolodziejczyk@tieto.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-04-03 19:52:35 +02:00
Nicolas Dichtel
576b7cd2f6 netns: don't allocate an id for dead netns
First, let's explain the problem.
Suppose you have an ipip interface that stands in the netns foo and its link
part in the netns bar (so the netns bar has an nsid into the netns foo).
Now, you remove the netns bar:
 - the bar nsid into the netns foo is removed
 - the netns exit method of ipip is called, thus our ipip iface is removed:
   => a netlink message is built in the netns foo to advertise this deletion
   => this netlink message requests an nsid for bar, thus a new nsid is
      allocated for bar and never removed.

This patch adds a check in peernet2id() so that an id cannot be allocated for
a netns which is currently destroyed.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-03 12:36:31 -04:00
Nicolas Dichtel
6d458f5b4e Revert "netns: don't clear nsid too early on removal"
This reverts
commit 4217291e59 ("netns: don't clear nsid too early on removal").

This is not the right fix, it introduces races.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-03 12:36:31 -04:00
Ian Morris
00db41243e ipv4: coding style: comparison for inequality with NULL
The ipv4 code uses a mixture of coding styles. In some instances check
for non-NULL pointer is done as x != NULL and sometimes as x. x is
preferred according to checkpatch and this patch makes the code
consistent by adopting the latter form.

No changes detected by objdiff.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-03 12:11:15 -04:00
Ian Morris
51456b2914 ipv4: coding style: comparison for equality with NULL
The ipv4 code uses a mixture of coding styles. In some instances check
for NULL pointer is done as x == NULL and sometimes as !x. !x is
preferred according to checkpatch and this patch makes the code
consistent by adopting the latter form.

No changes detected by objdiff.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-03 12:11:15 -04:00
WANG Cong
7ba0c47c34 ip6mr: call del_timer_sync() in ip6mr_free_table()
We need to wait for the flying timers, since we
are going to free the mrtable right after it.

Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 20:52:35 -04:00
WANG Cong
419df12fb5 net: move fib_rules_unregister() under rtnl lock
We have to hold rtnl lock for fib_rules_unregister()
otherwise the following race could happen:

fib_rules_unregister():	fib_nl_delrule():
...				...
...				ops = lookup_rules_ops();
list_del_rcu(&ops->list);
				list_for_each_entry(ops->rules) {
fib_rules_cleanup_ops(ops);	  ...
  list_del_rcu();		  list_del_rcu();
				}

Note, net->rules_mod_lock is actually not needed at all,
either upper layer netns code or rtnl lock guarantees
we are safe.

Cc: Alexander Duyck <alexander.h.duyck@redhat.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 20:52:34 -04:00
WANG Cong
ed785309c9 ipv4: take rtnl_lock and mark mrt table as freed on namespace cleanup
This is the IPv4 part for commit 905a6f96a1
(ipv6: take rtnl_lock and mark mrt6 table as freed on namespace cleanup).

Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 20:52:34 -04:00
Neal Cardwell
666b805150 tcp: fix FRTO undo on cumulative ACK of SACKed range
On processing cumulative ACKs, the FRTO code was not checking the
SACKed bit, meaning that there could be a spurious FRTO undo on a
cumulative ACK of a previously SACKed skb.

The FRTO code should only consider a cumulative ACK to indicate that
an original/unretransmitted skb is newly ACKed if the skb was not yet
SACKed.

The effect of the spurious FRTO undo would typically be to make the
connection think that all previously-sent packets were in flight when
they really weren't, leading to a stall and an RTO.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Fixes: e33099f96d ("tcp: implement RFC5682 F-RTO")
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:35:58 -04:00
Jon Paul Maloy
ed193ece26 tipc: simplify link mtu negotiation
When a link is being established, the two endpoints advertise their
respective interface MTU in the transmitted RESET and ACTIVATE messages.
If there is any difference, the lower of the two MTUs will be selected
for use by both endpoints.

However, as a remnant of earlier attempts to introduce TIPC level
routing. there also exists an MTU discovery mechanism. If an intermediate
node has a lower MTU than the two endpoints, they will discover this
through a bisectional approach, and finally adopt this MTU for common use.

Since there is no TIPC level routing, and probably never will be,
this mechanism doesn't make any sense, and only serves to make the
link level protocol unecessarily complex.

In this commit, we eliminate the MTU discovery algorithm,and fall back
to the simple MTU advertising approach. This change is fully backwards
compatible.

Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:27:12 -04:00
Jon Paul Maloy
dff29b1a88 tipc: eliminate delayed link deletion at link failover
When a bearer is disabled manually, all its links have to be reset
and deleted. However, if there is a remaining, parallel link ready
to take over a deleted link's traffic, we currently delay the delete
of the removed link until the failover procedure is finished. This
is because the remaining link needs to access state from the reset
link, such as the last received packet number, and any partially
reassembled buffer, in order to perform a successful failover.

In this commit, we do instead move the state data over to the new
link, so that it can fulfill the procedure autonomously, without
accessing any data on the old link. This means that we can now
proceed and delete all pertaining links immediately when a bearer
is disabled. This saves us from some unnecessary complexity in such
situations.

We also choose to change the confusing definitions CHANGEOVER_PROTOCOL,
ORIGINAL_MSG and DUPLICATE_MSG to the more descriptive TUNNEL_PROTOCOL,
FAILOVER_MSG and SYNCH_MSG respectively.

Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:27:12 -04:00
Jon Paul Maloy
2da7142516 tipc: drop tunneled packet duplicates at reception
In commit 8b4ed8634f
("tipc: eliminate race condition at dual link establishment")
we introduced a parallel link synchronization mechanism that
guarentees sequential delivery even for users switching from
an old to a newly established link. The new mechanism makes it
unnecessary to deliver the tunneled duplicate packets back to
the old link, as we are currently doing. It is now sufficient
to use the last tunneled packet's inner sequence number as
synchronization point between the two parallel links, whereafter
it can be dropped.

In this commit, we drop the duplicate packets arriving on the new
link, after updating the synchronization point at each new arrival.

Although it would now have been sufficient for the other endpoint
to only tunnel the last packet in its send queue, and not the
entire queue, we must still do this to maintain compatibility
with older nodes.

This commit makes it possible to get rid if some complex
interaction between the two parallel links.

Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:27:12 -04:00
David S. Miller
9f0d34bc34 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/asix_common.c
	drivers/net/usb/sr9800.c
	drivers/net/usb/usbnet.c
	include/linux/usb/usbnet.h
	net/ipv4/tcp_ipv4.c
	net/ipv6/tcp_ipv6.c

The TCP conflicts were overlapping changes.  In 'net' we added a
READ_ONCE() to the socket cached RX route read, whilst in 'net-next'
Eric Dumazet touched the surrounding code dealing with how mini
sockets are handled.

With USB, it's a case of the same bug fix first going into net-next
and then I cherry picked it back into net.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:16:53 -04:00
Marcel Holtmann
e213568ad6 Bluetooth: Disallow LE local out-of-band data when LE privacy is used
When the LE pivacy feature is used, then pairing has to happen based
on resolvable random addresses (RPA), but currently there is no clean
way to retrieve the correct RPA. So instead of returning an outdated
RPA, just disallow this command when LE privacy is in use.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-04-02 22:18:58 +03:00
Linus Torvalds
8172ba51e2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix use-after-free with mac80211 RX A-MPDU reorder timer, from
    Johannes Berg.

 2) iwlwifi leaks memory every module load/unload cycles, fix from Larry
    Finger.

 3) Need to use for_each_netdev_safe() in rtnl_group_changelink()
    otherwise we can crash, from WANG Cong.

 4) mlx4 driver does register_netdev() too early in the probe sequence,
    from Ido Shamay.

 5) Don't allow router discovery hop limit to decrease the interface's
    hop limit, from D.S. Ljungmark.

 6) tx_packets and tx_bytes improperly accounted for certain classes of
    USB network devices, fix from Ben Hutchings.

 7) ip{6}mr_rules_init() mistakenly use plain kfree to release the ipmr
    tables in the error path, they must instead use ip{6}mr_free_table().
    Fix from WANG Cong.

 8) cxgb4 doesn't properly quiesce all RX activity before unregistering
    the netdevice.  Fix from Hariprasad Shenai.

 9) Fix hash corruptions in ipvlan driver, from Jiri Benc.

10) nla_memcpy(), like a real memcpy, should fully initialize the
    destination buffer, even if the source attribute is smaller.  Fix
    from Jiri Benc.

11) Fix wrong error code returned from iucv_sock_sendmsg().  We should
    use whatever sock_alloc_send_skb() put into 'err'.  From Eugene
    Crosser.

12) Fix slab object leak on module unload in TIPC, from Ying Xue.

13) Need a READ_ONCE() when reading the cached RX socket route in
    tcp_v{4,6}_early_demux().  From Michal Kubecek.

14) Still too many problems with TPC support in the ath9k driver, so
    disable it for now.  From Felix Fietkau.

15) When in AP mode the rtlwifi driver can leak DMA mappings, fix from
    Larry Finger.

16) Missing kzalloc() failure check in gs_usb CAN driver, from Colin Ian
    King.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
  cxgb4: Fix to dump devlog, even if FW is crashed
  cxgb4: Firmware macro changes for fw verison 1.13.32.0
  bnx2x: Fix kdump when iommu=on
  bnx2x: Fix kdump on 4-port device
  mac80211: fix RX A-MPDU session reorder timer deletion
  MAINTAINERS: Update Intel Wired Ethernet Driver info
  tipc: fix a slab object leak
  net/usb/r8152: add device id for Lenovo TP USB 3.0 Ethernet
  af_iucv: fix AF_IUCV sendmsg() errno
  openvswitch: Return vport module ref before destruction
  netlink: pad nla_memcpy dest buffer with zeroes
  bonding: Bonding Overriding Configuration logic restored.
  ipvlan: fix check for IP addresses in control path
  ipvlan: do not use rcu operations for address list
  ipvlan: protect against concurrent link removal
  ipvlan: fix addr hash list corruption
  net: fec: setup right value for mdio hold time
  net: tcp6: fix double call of tcp_v6_fill_cb()
  cxgb4vf: Fix sparse warnings
  netns: don't clear nsid too early on removal
  ...
2015-04-02 11:09:41 -07:00
Nicolas Dichtel
e1622baf54 dev: set iflink to 0 for virtual interfaces
Virtual interfaces are supposed to set an iflink value != of their ifindex.
It was not the case for some of them, like vxlan, bond or bridge.
Let's set iflink to 0 when dev->rtnl_link_ops is set.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 14:05:01 -04:00
Nicolas Dichtel
7a66bbc96c net: remove iflink field from struct net_device
Now that all users of iflink have the ndo_get_iflink handler available, it's
possible to remove this field.

By default, dev_get_iflink() returns the ifindex of the interface.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 14:05:01 -04:00
Nicolas Dichtel
abd2be00d4 dsa: implement ndo_get_iflink
Don't use dev->iflink anymore.

CC: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 14:05:01 -04:00
Nicolas Dichtel
2dbf6b5058 vlan: implement ndo_get_iflink
Don't use dev->iflink anymore.

CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 14:05:00 -04:00
Nicolas Dichtel
ee9b9596a8 ipmr,ip6mr: implement ndo_get_iflink
Don't use dev->iflink anymore.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 14:05:00 -04:00
Nicolas Dichtel
1e99584b91 ipip,gre,vti,sit: implement ndo_get_iflink
Don't use dev->iflink anymore.

CC: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 14:05:00 -04:00
Nicolas Dichtel
ecf2c06a88 ip6tnl,gre6,vti6: implement ndo_get_iflink
Don't use dev->iflink anymore.

CC: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 14:04:59 -04:00
Nicolas Dichtel
a54acb3a6f dev: introduce dev_get_iflink()
The goal of this patch is to prepare the removal of the iflink field. It
introduces a new ndo function, which will be implemented by virtual interfaces.

There is no functional change into this patch. All readers of iflink field
now call dev_get_iflink().

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 14:04:59 -04:00
Johan Hedberg
1b9441f8ec Bluetooth: Convert local OOB data reading to use HCI request
Now that there's a HCI request API available where the callback receives
the resulting skb, we can convert the local OOB data reading to use this
new API. This patch does the necessary update in mgmt.c (which also
requires moving the callback higher up since it's now a static function)
and removes the custom calls from hci_event.c that are no-longer
necessary.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-04-02 16:09:29 +02:00
Johan Hedberg
757aa0b56d Bluetooth: Move hci_get_cmd_complete() to hci_event.c
To make the hci_req_run_skb() API consistent with hci_cmd_sync_ev()
the callback should receive the cmd_complete parameters in the 'normal'
case and the full HCI event if a special event was expected. This patch
moves the hci_get_cmd_complete() function from hci_core.c to hci_event.c
where it's used to strip the skb from the needed headers before passing
it on to the callback.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-04-02 16:09:28 +02:00
Johan Hedberg
abe66a4d03 Bluetooth: Remove unused hci_req_pending() function
The hci_req_pending() function has no users anymore, so simply remove
it.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-04-02 16:09:28 +02:00
Johan Hedberg
f7d9e97592 Bluetooth: Remove unneeded recv_event variable
Now that the synchronous HCI requests use the new API and a new private
variable the recv_evt member of hci_dev is no-longer needed. This patch
removes it.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-04-02 16:09:27 +02:00
Johan Hedberg
f60cb30579 Bluetooth: Convert hci_req_sync family of function to new request API
Now that there's an API in place that allows passing the resulting skb
to the request callback we can conveniently convert the hci_req_sync and
related functions to use it. Since we still need to get the skb from the
async callback into the sleeping _sync() function the patch adds another
req_skb variable to hci_dev where the sync request state is tracked.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-04-02 16:09:27 +02:00
Johan Hedberg
e621448749 Bluetooth: Add second hci_request callback option for full skb
This patch adds a second possible callback for HCI requests where the
callback will receive the full skb of the last successfully completed
HCI command. This API is useful for cases where we want to use a request
to read some data and the existing hci_event.c handlers do not store it
e.g. in the hci_dev struct.

The reason the patch is a bit bigger than just adding the new API is
because the hci_req_cmd_complete() functions required some refactoring
to enable it: now hci_req_cmd_complete() is simply used to request the
callback pointers if any, and the actual calling of them happens from a
single place at the end of hci_event_packet(). The reason for this is
that we need to pass the original skb (without any skb_pull, etc
modifications done to it) and it's simplest to keep track of it within
the hci_event_packet() function.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-04-02 16:09:27 +02:00
Johan Hedberg
444c6dd54d Bluetooth: Add clarifying comment to command status handling
When dealing with HCI command status events, the reasoning for trying to
mark a request as complete if no specific event is being waited for and
status was success is not self-evident. This patch adds a clarifying
comment above the if-statement.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-04-02 16:09:27 +02:00
Florian Westphal
0b67c43ce3 netfilter: bridge: really save frag_max_size between PRE and POST_ROUTING
We also need to save/store in forward, else br_parse_ip_options call
will zero frag_max_size as well.

Fixes: 93fdd47e5 ('bridge: Save frag_max_size between PRE_ROUTING and POST_ROUTING')
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-02 10:41:14 +02:00
Marcel Holtmann
64dd374eac Bluetooth: Export SMP selftest result in debugfs
When SMP selftest is enabled, then besides printing the result into the
kernel message buffer, also create a debugfs file that allows retrieving
the same information.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-04-02 08:47:40 +03:00
Marcel Holtmann
6de50f9fdb Bluetooth: Export ECDH selftest result in debugfs
When ECDH selftest is enabled, then besides printing the result into the
kernel message buffer, also create a debugfs file that allows retrieving
the same information.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-04-02 08:47:38 +03:00
Marcel Holtmann
0151e426b1 Bluetooth: Restrict BNEP flags to only valid ones
The BNEP flags should be clearly restricted to valid ones. So this puts
extra checks in place to ensure this.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-04-02 08:44:02 +03:00
Marcel Holtmann
5f5da99f1d Bluetooth: Restrict HIDP flags to only valid ones
The HIDP flags should be clearly restricted to valid ones. So this puts
extra checks in place to ensure this.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-04-02 08:43:11 +03:00
Marcel Holtmann
8bf17a3619 Bluetooth: Restrict CMTP flags to only valid ones
The CMTP flags should be clearly restricted to valid ones. So this puts
extra checks in place to ensure this.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-04-02 08:42:21 +03:00
Marcel Holtmann
c3370de64d Bluetooth: Expose current Device ID information via debugfs
For debugging purposes it is good to be able to read the current
configured Device ID details.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2015-04-02 08:40:35 +03:00
Simon Horman
05e8bb860b pkt_sched: fq: correct spelling of locally
Correct spelling of locally.

Also remove extra space before tab character in struct fq_flow.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-01 22:52:29 -04:00
Felix Fietkau
ba8c3d6f16 mac80211: add an intermediate software queue implementation
This allows drivers to request per-vif and per-sta-tid queues from which
they can pull frames. This makes it easier to keep the hardware queues
short, and to improve fairness between clients and vifs.

The task of scheduling packet transmission is left up to the driver -
queueing is controlled by mac80211. Drivers can only dequeue packets by
calling ieee80211_tx_dequeue. This makes it possible to add active queue
management later without changing drivers using this code.

This can also be used as a starting point to implement A-MSDU
aggregation in a way that does not add artificially induced latency.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
[resolved minor context conflict, minor changes, endian annotations]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-04-01 20:44:34 +02:00
John Linville
cef2fc1ce4 mac80211: reduce log spam from ieee80211_handle_pwr_constr
This changes a couple of messages from sdata_info to sdata_dbg.
This should reduce some log spam, as reported here:

	https://bugzilla.redhat.com/show_bug.cgi?id=1206468

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-04-01 20:44:34 +02:00
Thomas Huehn
5f919abc76 mac80211: add standard deviation to Minstrel stats
This patch adds the statistical descriptor "standard deviation"
to better describe the current properties of Minstrel and
Minstrel-HTs success probability distribution. The standard
deviation (SD) is calculated as exponential weighted moving
standard deviation (EWMSD) and its current value is added as
new column in all rc_stats (in debugfs).

Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-04-01 20:44:33 +02:00
Thomas Huehn
ade6d4a2ec mac80211: reduce calculation costs of EWMA
This patch reduces the calculation costs of the EWMA macro from
"2x multiplication and 1 addition" down to "1x multiplication and
2x additions". This slightly improves performance depending on the
CPU architecture.

Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-04-01 20:44:33 +02:00
Thomas Huehn
50e55a8ea7 mac80211: add max lossless throughput per rate
This patch adds the new statistic "maximum possible lossless
throughput" to Minstrels and Minstrel-HTs rc_stats (in debugfs). This
enables comprehensive comparison between current per-rate throughput
and max. achievable per-rate throughput.

Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-04-01 20:44:32 +02:00
Thomas Huehn
6a27b2c40b mac80211: restructure per-rate throughput calculation into function
This patch moves Minstrels and Minstrel-HTs per-rate throughput
calculation (EWMA(thr)) into a dedicated function to be called.
Therefore the variable "unsigned int cur_tp" within struct
"minstrel_rate_stats" becomes obsolete.  and is removed to free
up its space.

Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-04-01 20:44:32 +02:00
Thomas Huehn
9134073bc6 mac80211: improve Minstrel variable & function naming
This patch ensures a consistent usage of variable names for type
"minstrel_rate_stats" to be used as "mrs" and from type minstrel_rate
as "mr" across both Minstrel & Minstrel-HT. In addition some
variable and function names got changed to more meaningful ones.

Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-04-01 20:44:31 +02:00
Thomas Huehn
f62838bcc5 mac80211: unify Minstrel & Minstrel-HTs calculation of rate statistics
This patch unifies the calculation of Minstrels and Minstrel-HTs
per-rate statistic. The new common function minstrel_calc_rate_stats()
is called when a statistic update is performed.

Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-04-01 20:44:31 +02:00
Thomas Huehn
2cae0b6a70 mac80211: add new Minstrel-HT statistic output via csv
This patch adds a new debugfs file "rc_stats_csv" to output
Minstrel-HTs statistics in a common csv format that is easy
to parse.

Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Signed-off-by: Stefan Venz <ikstream86@gmail.com>
Acked-by: Felix Fietkau <nbd@openwrt.org>
[remove printing current time of day]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-04-01 20:44:26 +02:00
Thomas Huehn
6d48851779 mac80211: add new Minstrel statistic output via csv
This patch adds a new debugfs file "rc_stats_csv" to output Minstrels
statistics in a common csv format that is easy to parse.

Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Signed-off-by: Stefan Venz <ikstream86@gmail.com>
Acked-by: Felix Fietkau <nbd@openwrt.org>
[remove printing current time of day]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-04-01 20:41:43 +02:00
David S. Miller
af3e09e666 This contains just a single fix for a crash I happened to randomly
run into today during testing. It's clearly been around for a while,
 but is pretty hard to trigger, even when I tried explicitly (and
 modified the code to make it more likely) it rarely did.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJVG/WSAAoJEDBSmw7B7bqrzmgQALygsyo0GrmHHorg0wkO+PBK
 l6kVknRwlsMil+vmB6mPTnwgUaGnEWJeRXl4zPJeA4Z+Xr58K1lyTFcMC0nu2MVb
 MNcTJycRmF4Lqyycd52zFaA+1vMsgG7AZb6vXLYppchUFTmbLVNX/IWhJGNOAsKZ
 HjYdvLr2kbJunIRofc/0PCC/8J4qFQj2ZphF5WfMckhNrh8+SQjIqmscFdRIqcgj
 R+LtyxscyKWspQ97J6OdoTsKpxTKSZ8mi9IvohdJhkJVnzBEkJx+Krf6PNs+Xnh1
 Z4x1Dkn1RoM8+7cakq2tTwwxokEw7de/3v/s90W+yVuZWNKsaiKHcnU+KGHu6t0J
 2766PXv6hC3KSWN6XrfTxYP7CBsT46Vf5+7FSlDsck1tUbn3W55c2kPFMBKk79yi
 CHjz1O82wzx4bAVNdaKMpR8rz6bSyhZijmduMuYxxrvnKkl5BSHype9IAUo+etVz
 KxPnwGq9yJjf0RYdr9tttxiwXJaADD6/R/bO21SIi1JeKa5sCyoAA5nLDyJfXwyt
 KtlrzzM9NWqoQUi2SGmurHHEIBBmgg9RBBWvq+MNM0Ik7d9kIawCBlvPerVc4IH4
 bUvIXEnrQNOEwxmY/9nsUShQvkzuQbkwR3rpEYA3XemO0qW1t0Tkdp/3UY3TY6Sq
 EOTi9LkOquZnKd9GB7yl
 =Imm5
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-davem-2015-04-01' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
This contains just a single fix for a crash I happened to randomly
run into today during testing. It's clearly been around for a while,
but is pretty hard to trigger, even when I tried explicitly (and
modified the code to make it more likely) it rarely did.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-01 14:19:22 -04:00
Linus Torvalds
1e848913f0 Merge branch 'for-4.0' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
 "Two main issues:

   - We found that turning on pNFS by default (when it's configured at
     build time) was too aggressive, so we want to switch the default
     before the 4.0 release.

   - Recent client changes to increase open parallelism uncovered a
     serious bug lurking in the server's open code.

  Also fix a krb5/selinux regression.

  The rest is mainly smaller pNFS fixes"

* 'for-4.0' of git://linux-nfs.org/~bfields/linux:
  sunrpc: make debugfs file creation failure non-fatal
  nfsd: require an explicit option to enable pNFS
  NFSD: Fix bad update of layout in nfsd4_return_file_layout
  NFSD: Take care the return value from nfsd4_encode_stateid
  NFSD: Printk blocklayout length and offset as format 0x%llx
  nfsd: return correct lockowner when there is a race on hash insert
  nfsd: return correct openowner when there is a race to put one in the hash
  NFSD: Put exports after nfsd4_layout_verify fail
  NFSD: Error out when register_shrinker() fail
  NFSD: Take care the return value from nfsd4_decode_stateid
  NFSD: Check layout type when returning client layouts
  NFSD: restore trace event lost in mismerge
2015-04-01 09:45:47 -07:00
David Howells
44ba06987c RxRPC: Handle VERSION Rx protocol packets
Handle VERSION Rx protocol packets.  We should respond to a VERSION packet
with a string indicating the Rx version.  This is a maximum of 64 characters
and is padded out to 65 chars with NUL bytes.

Note that other AFS clients use the version request as a NAT keepalive so we
need to handle it rather than returning an abort.

The standard formulation seems to be:

	<project> <version> built <yyyy>-<mm>-<dd>

for example:

	" OpenAFS 1.6.2 built  2013-05-07 "

(note the three extra spaces) as obtained with:

	rxdebug grand.mit.edu -version

from the openafs package.

Signed-off-by: David Howells <dhowells@redhat.com>
2015-04-01 16:31:26 +01:00
David Howells
382d7974de RxRPC: Use iov_iter_count() in rxrpc_send_data() instead of the len argument
Use iov_iter_count() in rxrpc_send_data() to get the remaining data length
instead of using the len argument as the len argument is now redundant.

Signed-off-by: David Howells <dhowells@redhat.com>
2015-04-01 15:49:26 +01:00
David Howells
aab94830a7 RxRPC: Don't call skb_add_data() if there's no data to copy
Don't call skb_add_data() in rxrpc_send_data() if there's no data to copy and
also skip the calculations associated with it in such a case.

Signed-off-by: David Howells <dhowells@redhat.com>
2015-04-01 15:48:00 +01:00
David Howells
3af6878eca RxRPC: Fix the conversion to iov_iter
This commit:

	commit af2b040e47
	Author: Al Viro <viro@zeniv.linux.org.uk>
	Date:   Thu Nov 27 21:44:24 2014 -0500
	Subject: rxrpc: switch rxrpc_send_data() to iov_iter primitives

incorrectly changes a do-while loop into a while loop in rxrpc_send_data().

Unfortunately, at least one pass through the loop is required - even if
there is no data - so that the packet the closes the send phase can be
sent if MSG_MORE is not set.

Signed-off-by: David Howells <dhowells@redhat.com>
2015-04-01 14:06:00 +01:00
Johannes Berg
788211d81b mac80211: fix RX A-MPDU session reorder timer deletion
There's an issue with the way the RX A-MPDU reorder timer is
deleted that can cause a kernel crash like this:

 * tid_rx is removed - call_rcu(ieee80211_free_tid_rx)
 * station is destroyed
 * reorder timer fires before ieee80211_free_tid_rx() runs,
   accessing the station, thus potentially crashing due to
   the use-after-free

The station deletion is protected by synchronize_net(), but
that isn't enough -- ieee80211_free_tid_rx() need not have
run when that returns (it deletes the timer.) We could use
rcu_barrier() instead of synchronize_net(), but that's much
more expensive.

Instead, to fix this, add a field tracking that the session
is being deleted. In this case, the only re-arming of the
timer happens with the reorder spinlock held, so make that
code not rearm it if the session is being deleted and also
delete the timer after setting that field. This ensures the
timer cannot fire after ___ieee80211_stop_rx_ba_session()
returns, which fixes the problem.

Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-04-01 14:35:01 +02:00
Pablo Neira Ayuso
c5035c77f8 netfilter: nft_meta: fix cgroup matching
We have to stop iterating on the rule expressions if the cgroup
mismatches. Moreover, make sure a non-full socket from the input path
leads us to a crash.

Fixes: ce67417 ("netfilter: nft_meta: add cgroup support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-01 11:33:00 +02:00
Oliver Hartkopp
a5581ef4c2 can: introduce new raw socket option to join the given CAN filters
The CAN_RAW socket can set multiple CAN identifier specific filters that lead
to multiple filters in the af_can.c filter processing. These filters are
indenpendent from each other which leads to logical OR'ed filters when applied.

This socket option joines the given CAN filters in the way that only CAN frames
are passed to user space that matched *all* given CAN filters. The semantic for
the applied filters is therefore changed to a logical AND.

This is useful especially when the filterset is a combination of filters where
the CAN_INV_FILTER flag is set in order to notch single CAN IDs or CAN ID
ranges from the incoming traffic.

As the raw_rcv() function is executed from NET_RX softirq the introduced
variables are implemented as per-CPU variables to avoid extensive locking at
CAN frame reception time.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2015-04-01 11:28:22 +02:00