Commit Graph

36409 Commits

Author SHA1 Message Date
Eliad Peller
954a86ef45 cfg80211: add operating classes 128-130
Operating classes 128-130 are defined in the 11ac
spec for the 5GHz band.

Update ieee80211_operating_class_to_band() to support them.

Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-03-03 15:56:07 +01:00
Alexander Bondar
2ecc3905e6 mac80211: Update beacon's timing and DTIM count on every beacon
Beacon's timestamp, device system time associated with this beacon and
DTIM count parameters are not updated in the associated vif context
if the latest beacon's content is identical to the previously received.
It make sense to update these changing parameters on every beacon so the
driver can get most updated values. This may be necessary, for example,
to avoid either beacons' drift effect or device time stamp overrun.
IMPORTANT: Three sync_* parameters - sync_ts, sync_device_ts and
sync_dtim_count would possibly be out of sync by the time the driver will
use them. The synchronized view is currently guaranteed only in certain
callbacks.

Signed-off-by: Alexander Bondar <alexander.bondar@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-03-03 15:56:06 +01:00
Ahmad Kholaif
6c09e791b2 cfg80211: Allow NL80211_ATTR_IFINDEX to be added to vendor events
This modifies cfg80211_vendor_event_alloc() with an additional argument
struct wireless_dev *wdev. __cfg80211_alloc_event_skb() is modified to
take in *wdev argument, if wdev != NULL, both the NL80211_ATTR_IFINDEX
and wdev identifier are added to the vendor event.

These changes make it easier for drivers to add ifindex indication in
vendor events cleanly.

This also updates all existing users of cfg80211_vendor_event_alloc()
and __cfg80211_alloc_event_skb() in the kernel tree.

Signed-off-by: Ahmad Kholaif <akholaif@qca.qualcomm.com>
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-03-03 15:56:05 +01:00
Janusz.Dziedzic@tieto.com
ffc1199122 cfg80211: add VHT support for IBSS
Add NL80211_EXT_FEATURE_VHT_IBSS flag and VHT
support for IBSS.

Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-03-03 15:56:04 +01:00
Dedy Lansky
6eb1813764 cfg80211: add bss_type and privacy arguments in cfg80211_get_bss()
802.11ad adds new a network type (PBSS) and changes the capability
field interpretation for the DMG (60G) band.
The same 2 bits that were interpreted as "ESS" and "IBSS" before are
re-used as a 2-bit field with 3 valid values (and 1 reserved). Valid
values are: "IBSS", "PBSS" (new) and "AP".

In order to get the BSS struct for the new PBSS networks, change the
cfg80211_get_bss() function to take a new enum ieee80211_bss_type
argument with the valid network types, as "capa_mask" and "capa_val"
no longer work correctly (the search must be band-aware now.)

The remaining bits in "capa_mask" and "capa_val" are used only for
privacy matching so replace those two with a privacy enum as well.

Signed-off-by: Dedy Lansky <dlansky@codeaurora.org>
[rewrite commit log, tiny fixes]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-03-03 15:56:01 +01:00
James Minor
76a70e9c4b cfg80211-wext: return -E2BIG when buffer can't hold full BSS entry
When using the wext compatibility code in cfg80211, part of the IEs
can be truncated if the passed user buffer is large enough for part
of the BSS but not large enough for all of the IEs.  This can cause
an EAP network to show up as a PSK network.

Always return -E2BIG in this case to avoid truncating data.

Since this changes the control flow, use an on-stack variable for
a small buffer instead of allocating it.

Signed-off-by: James Minor <james.minor@ni.com>
[rework patch to error out immediately, use _check wrappers]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-02-28 21:31:12 +01:00
Johannes Berg
abfbc3af57 mac80211: remove TX latency measurement code
Revert commit ad38bfc916 ("mac80211: Tx frame latency statistics")
(along with some follow-up fixes).

This code turned out not to be as useful in the current form as we
thought, and we've internally hacked it up more, but that's not
very suitable for upstream (for now), and we might just do that
with tracing instead.

Therefore, for now at least, remove this code. We might also need
to use the skb->tstamp field for the TCP performance issue, which
is more important than the debugging.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-02-28 21:31:11 +01:00
Masashi Honma
31f909a2c0 nl/mac80211: allow zero plink timeout to disable STA expiration
Both wpa_supplicant and mac80211 have and inactivity timer. By default
wpa_supplicant will be timed out in 5 minutes and mac80211's it is 30
minutes. If wpa_supplicant uses a longer timer than mac80211, it will
get unexpected disconnection by mac80211.

Using 0xffffffff instead as the configured value could solve this w/o
changing the code, but due to integer overflow in the expression used
this doesn't work. The expression is:

(current jiffies) > (frame Rx jiffies + NL80211_MESHCONF_PLINK_TIMEOUT * 250)

On 32bit system, the right side would overflow and be a very small
value if NL80211_MESHCONF_PLINK_TIMEOUT is sufficiently large,
causing unexpectedly early disconnections.

Instead allow disabling the inactivity timer to avoid this situation,
by passing the (previously invalid and useless) value 0.

Signed-off-by: Masashi Honma <masashi.honma@gmail.com>
[reword/rewrap commit log]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-02-28 21:31:10 +01:00
Johannes Berg
2afe38d15c cfg80211-wext: export symbols only when needed
When a fully converted cfg80211 driver needs cfg80211-wext for
userspace API purposes, the symbols need not be exported. When
other drivers (orinoco/hermes or ipw2200) are enabled, they do
need the symbols exported as they use them directly.

Make those drivers select a new CFG80211_WEXT_EXPORT Kconfig
symbol (instead of just CFG80211_WEXT) and export the functions
only if requested - this saves about 1/2k due to the size of
EXPORT_SYMBOL() itself.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-02-28 21:31:09 +01:00
Johannes Berg
7d9bb2f065 mac80211: iterate using station list in AP SMPS
When changing AP SMPS, we need to look up all the stations
for this interface, so there's no reason to iterate over
hash chains rather than doing the simpler iteration over
the station list.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-02-28 21:31:09 +01:00
Johannes Berg
9d6b106b54 mac80211: don't look up stations for multicast addresses
Since multicast addresses don't exist as stations, don't attempt
to look them up in the hashtable on TX.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-02-28 21:28:44 +01:00
Roopa Prabhu
b7853d73e3 bridge: add vlan info to bridge setlink and dellink notification messages
vlan add/deletes are not notified to userspace today. This patch adds
vlan info to bridge newlink/dellink notifications generated from the
bridge driver. Notifications use the RTEXT_FILTER_BRVLAN_COMPRESSED
flag to compress vlans into ranges whereever applicable.

The size calculations does not take ranges into account for
simplicity.  This has the potential for allocating a larger skb than
required.

There is an existing inconsistency with bridge NEWLINK and DELLINK
change notifications. Both generate NEWLINK notifications.  Since its
always a NEWLINK notification, this patch includes all vlans the port
belongs to in the notification. The NEWLINK and DELLINK request
messages however only include the vlans to be added and deleted.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-22 23:00:45 -05:00
Ameen Ali
e099b2d9df net: __aligned(size) is preferred over __attribute__((aligned(size)))
Signed-off-by: Ameen Ali <AmeenAli023@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-22 17:01:22 -05:00
Joe Perches
92b8391750 batman-adv: Fix use of seq_has_overflowed()
net-next commit 6d91147d18 ("batman-adv: Remove uses of return value
of seq_printf") incorrectly changed the overflow occurred return from
-1 to 1.  Change it back so that the test of batadv_write_buffer_text's
return value in batadv_gw_client_seq_print_text works properly.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-22 17:00:08 -05:00
Bojan Prtvar
059a2440fd net: Remove state argument from skb_find_text()
Although it is clear that textsearch state is intentionally passed to
skb_find_text() as uninitialized argument, it was never used by the
callers. Therefore, we can simplify skb_find_text() by making it
local variable.

Signed-off-by: Bojan Prtvar <prtvar.b@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-22 15:59:54 -05:00
Dan Carpenter
d340c862e7 ethtool: use "ops" name consistenty in ethtool_set_rxfh()
"dev->ethtool_ops" and "ops" are the same, but we should use "ops"
everywhere to be consistent.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-21 22:05:53 -05:00
Alex W Slater
29778bec12 ipv6: Replace "#include <asm/uaccess>" with "#include <linux/uaccess>"
Fix checkpatch.pl warning "Use #include <linux/uaccess.h> instead of <asm/uaccess.h>"

Signed-off-by: Alex W Slater <alex.slater.dev@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-21 22:05:53 -05:00
Eric Dumazet
959d10f6bb igmp: add __ip_mc_{join|leave}_group()
There is a need to perform igmp join/leave operations while RTNL is
held.

Make ip_mc_{join|leave}_group() wrappers around
__ip_mc_{join|leave}_group() to avoid the proliferation of work queues.

For example, vxlan_igmp_join() could possibly be removed.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 15:24:04 -05:00
Joe Perches
6d91147d18 batman-adv: Remove uses of return value of seq_printf
This function is soon going to return void so remove the
return value use.

Convert the return value to test seq_has_overflowed() instead.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 15:18:26 -05:00
stephen hemminger
db2855ae24 tcp: silence registration message
This message isn't really needed it justs waits time/space.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-20 15:04:03 -05:00
Linus Torvalds
f5af19d10d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking updates from David Miller:

 1) Missing netlink attribute validation in nft_lookup, from Patrick
    McHardy.

 2) Restrict ipv6 partial checksum handling to UDP, since that's the
    only case it works for.  From Vlad Yasevich.

 3) Clear out silly device table sentinal macros used by SSB and BCMA
    drivers.  From Joe Perches.

 4) Make sure the remote checksum code never creates a situation where
    the remote checksum is applied yet the tunneling metadata describing
    the remote checksum transformation is still present.  Otherwise an
    external entity might see this and apply the checksum again.  From
    Tom Herbert.

 5) Use msecs_to_jiffies() where applicable, from Nicholas Mc Guire.

 6) Don't explicitly initialize timer struct fields, use setup_timer()
    and mod_timer() instead.  From Vaishali Thakkar.

 7) Don't invoke tg3_halt() without the tp->lock held, from Jun'ichi
    Nomura.

 8) Missing __percpu annotation in ipvlan driver, from Eric Dumazet.

 9) Don't potentially perform skb_get() on shared skbs, also from Eric
    Dumazet.

10) Fix COW'ing of metrics for non-DST_HOST routes in ipv6, from Martin
    KaFai Lau.

11) Fix merge resolution error between the iov_iter changes in vhost and
    some bug fixes that occurred at the same time.  From Jason Wang.

12) If rtnl_configure_link() fails we have to perform a call to
    ->dellink() before unregistering the device.  From WANG Cong.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (39 commits)
  net: dsa: Set valid phy interface type
  rtnetlink: call ->dellink on failure when ->newlink exists
  com20020-pci: add support for eae single card
  vhost_net: fix wrong iter offset when setting number of buffers
  net: spelling fixes
  net/core: Fix warning while make xmldocs caused by dev.c
  net: phy: micrel: disable NAND-tree for KSZ8021, KSZ8031, KSZ8051, KSZ8081
  ipv6: fix ipv6_cow_metrics for non DST_HOST case
  openvswitch: Fix key serialization.
  r8152: restore hw settings
  hso: fix rx parsing logic when skb allocation fails
  tcp: make sure skb is not shared before using skb_get()
  bridge: netfilter: Move sysctl-specific error code inside #ifdef
  ipv6: fix possible deadlock in ip6_fl_purge / ip6_fl_gc
  ipvlan: add a missing __percpu pcpu_stats
  tg3: Hold tp->lock before calling tg3_halt() from tg3_init_one()
  bgmac: fix device initialization on Northstar SoCs (condition typo)
  qlcnic: Delete existing multicast MAC list before adding new
  net/mlx5_core: Fix configuration of log_uar_page_sz
  sunvnet: don't change gso data on clones
  ...
2015-02-17 17:41:19 -08:00
Guenter Roeck
19334920ea net: dsa: Set valid phy interface type
If the phy interface mode is not found in devicetree, or if devicetree
is not configured, of_get_phy_mode returns -ENODEV. The current code
sets the phy interface mode to the return value from of_get_phy_mode
without checking if it is valid.

This invalid phy interface mode is passed as parameter to of_phy_connect
or to phy_connect_direct. This sets the phy interface mode to the invalid
value, which in turn causes problems for any code using phydev->interface.

Fixes: b31f65fb43 ("net: dsa: slave: Fix autoneg for phys on switch MDIO bus")
Fixes: 0d8bcdd383 ("net: dsa: allow for more complex PHY setups")
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-17 10:37:39 -08:00
WANG Cong
7afb8886a0 rtnetlink: call ->dellink on failure when ->newlink exists
Ignacy reported that when eth0 is down and add a vlan device
on top of it like:

  ip link add link eth0 name eth0.1 up type vlan id 1

We will get a refcount leak:

  unregister_netdevice: waiting for eth0.1 to become free. Usage count = 2

The problem is when rtnl_configure_link() fails in rtnl_newlink(),
we simply call unregister_device(), but for stacked device like vlan,
we almost do nothing when we unregister the upper device, more work
is done when we unregister the lower device, so call its ->dellink().

Reported-by: Ignacy Gawedzki <ignacy.gawedzki@green-communications.fr>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-15 08:30:10 -08:00
Stephen Hemminger
ca9f1fd263 net: spelling fixes
Spelling errors caught by codespell.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-14 20:36:08 -08:00
Masanari Iida
4a26e453d9 net/core: Fix warning while make xmldocs caused by dev.c
This patch fix following warning wile make xmldocs.

  Warning(.//net/core/dev.c:5345): No description found
  for parameter 'bonding_info'
  Warning(.//net/core/dev.c:5345): Excess function parameter
  'netdev_bonding_info' description in 'netdev_bonding_info_change'

This warning starts to appear after following patch was added
into Linus's tree during merger period.

commit 61bd3857ff
net/core: Add event for a change in slave state

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-14 20:34:49 -08:00
Martin KaFai Lau
3b4711757d ipv6: fix ipv6_cow_metrics for non DST_HOST case
ipv6_cow_metrics() currently assumes only DST_HOST routes require
dynamic metrics allocation from inetpeer.  The assumption breaks
when ndisc discovered router with RTAX_MTU and RTAX_HOPLIMIT metric.
Refer to ndisc_router_discovery() in ndisc.c and note that dst_metric_set()
is called after the route is created.

This patch creates the metrics array (by calling dst_cow_metrics_generic) in
ipv6_cow_metrics().

Test:
radvd.conf:
interface qemubr0
{
	AdvLinkMTU 1300;
	AdvCurHopLimit 30;

	prefix fd00:face:face:face::/64
	{
		AdvOnLink on;
		AdvAutonomous on;
		AdvRouterAddr off;
	};
};

Before:
[root@qemu1 ~]# ip -6 r show | egrep -v unreachable
fd00:face:face:face::/64 dev eth0  proto kernel  metric 256  expires 27sec
fe80::/64 dev eth0  proto kernel  metric 256
default via fe80::74df:d0ff:fe23:8ef2 dev eth0  proto ra  metric 1024  expires 27sec

After:
[root@qemu1 ~]# ip -6 r show | egrep -v unreachable
fd00:face:face:face::/64 dev eth0  proto kernel  metric 256  expires 27sec mtu 1300
fe80::/64 dev eth0  proto kernel  metric 256  mtu 1300
default via fe80::74df:d0ff:fe23:8ef2 dev eth0  proto ra  metric 1024  expires 27sec mtu 1300 hoplimit 30

Fixes: 8e2ec63917 (ipv6: don't use inetpeer to store metrics for routes.)
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-14 20:26:16 -08:00
Pravin B Shelar
26ad0b8358 openvswitch: Fix key serialization.
Fix typo where mask is used rather than key.

Fixes: 74ed7ab9264("openvswitch: Add support for unique flow IDs.")
Reported-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-14 20:20:40 -08:00
Tejun Heo
f09068276c net: use %*pb[l] to print bitmaps including cpumasks and nodemasks
printk and friends can now format bitmaps using '%*pb[l]'.  cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-13 21:21:38 -08:00
Eric Dumazet
ba34e6d9d3 tcp: make sure skb is not shared before using skb_get()
IPv6 can keep a copy of SYN message using skb_get() in
tcp_v6_conn_request() so that caller wont free the skb when calling
kfree_skb() later.

Therefore TCP fast open has to clone the skb it is queuing in
child->sk_receive_queue, as all skbs consumed from receive_queue are
freed using __kfree_skb() (ie assuming skb->users == 1)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Fixes: 5b7ed0892f ("tcp: move fastopen functions to tcp_fastopen.c")
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-13 07:11:40 -08:00
Vladimir Davydov
f48b80a5e2 memcg: cleanup static keys decrement
Move memcg_socket_limit_enabled decrement to tcp_destroy_cgroup (called
from memcg_destroy_kmem -> mem_cgroup_sockets_destroy) and zap a bunch of
wrapper functions.

Although this patch moves static keys decrement from __mem_cgroup_free to
mem_cgroup_css_free, it does not introduce any functional changes, because
the keys are incremented on setting the limit (tcp or kmem), which can
only happen after successful mem_cgroup_css_online.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtisu.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:10 -08:00
Linus Torvalds
61845143fe Merge branch 'for-3.20' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
 "The main change is the pNFS block server support from Christoph, which
  allows an NFS client connected to shared disk to do block IO to the
  shared disk in place of NFS reads and writes.  This also requires xfs
  patches, which should arrive soon through the xfs tree, barring
  unexpected problems.  Support for other filesystems is also possible
  if there's interest.

  Thanks also to Chuck Lever for continuing work to get NFS/RDMA into
  shape"

* 'for-3.20' of git://linux-nfs.org/~bfields/linux: (32 commits)
  nfsd: default NFSv4.2 to on
  nfsd: pNFS block layout driver
  exportfs: add methods for block layout exports
  nfsd: add trace events
  nfsd: update documentation for pNFS support
  nfsd: implement pNFS layout recalls
  nfsd: implement pNFS operations
  nfsd: make find_any_file available outside nfs4state.c
  nfsd: make find/get/put file available outside nfs4state.c
  nfsd: make lookup/alloc/unhash_stid available outside nfs4state.c
  nfsd: add fh_fsid_match helper
  nfsd: move nfsd_fh_match to nfsfh.h
  fs: add FL_LAYOUT lease type
  fs: track fl_owner for leases
  nfs: add LAYOUT_TYPE_MAX enum value
  nfsd: factor out a helper to decode nfstime4 values
  sunrpc/lockd: fix references to the BKL
  nfsd: fix year-2038 nfs4 state problem
  svcrdma: Handle additional inline content
  svcrdma: Move read list XDR round-up logic
  ...
2015-02-12 10:39:41 -08:00
Geert Uytterhoeven
9672723973 bridge: netfilter: Move sysctl-specific error code inside #ifdef
If CONFIG_SYSCTL=n:

    net/bridge/br_netfilter.c: In function ‘br_netfilter_init’:
    net/bridge/br_netfilter.c:996: warning: label ‘err1’ defined but not used

Move the label and the code after it inside the existing #ifdef to get
rid of the warning.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-12 08:44:46 -08:00
Jan Stancek
4762fb9804 ipv6: fix possible deadlock in ip6_fl_purge / ip6_fl_gc
Use spin_lock_bh in ip6_fl_purge() to prevent following potentially
deadlock scenario between ip6_fl_purge() and ip6_fl_gc() timer.

  =================================
  [ INFO: inconsistent lock state ]
  3.19.0 #1 Not tainted
  ---------------------------------
  inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
  swapper/5/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
   (ip6_fl_lock){+.?...}, at: [<ffffffff8171155d>] ip6_fl_gc+0x2d/0x180
  {SOFTIRQ-ON-W} state was registered at:
    [<ffffffff810ee9a0>] __lock_acquire+0x4a0/0x10b0
    [<ffffffff810efd54>] lock_acquire+0xc4/0x2b0
    [<ffffffff81751d2d>] _raw_spin_lock+0x3d/0x80
    [<ffffffff81711798>] ip6_flowlabel_net_exit+0x28/0x110
    [<ffffffff815f9759>] ops_exit_list.isra.1+0x39/0x60
    [<ffffffff815fa320>] cleanup_net+0x100/0x1e0
    [<ffffffff810ad80a>] process_one_work+0x20a/0x830
    [<ffffffff810adf4b>] worker_thread+0x11b/0x460
    [<ffffffff810b42f4>] kthread+0x104/0x120
    [<ffffffff81752bfc>] ret_from_fork+0x7c/0xb0
  irq event stamp: 84640
  hardirqs last  enabled at (84640): [<ffffffff81752080>] _raw_spin_unlock_irq+0x30/0x50
  hardirqs last disabled at (84639): [<ffffffff81751eff>] _raw_spin_lock_irq+0x1f/0x80
  softirqs last  enabled at (84628): [<ffffffff81091ad1>] _local_bh_enable+0x21/0x50
  softirqs last disabled at (84629): [<ffffffff81093b7d>] irq_exit+0x12d/0x150

  other info that might help us debug this:
   Possible unsafe locking scenario:

         CPU0
         ----
    lock(ip6_fl_lock);
    <Interrupt>
      lock(ip6_fl_lock);

   *** DEADLOCK ***

Signed-off-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-12 07:13:03 -08:00
Linus Torvalds
8cc748aa76 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security layer updates from James Morris:
 "Highlights:

   - Smack adds secmark support for Netfilter
   - /proc/keys is now mandatory if CONFIG_KEYS=y
   - TPM gets its own device class
   - Added TPM 2.0 support
   - Smack file hook rework (all Smack users should review this!)"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (64 commits)
  cipso: don't use IPCB() to locate the CIPSO IP option
  SELinux: fix error code in policydb_init()
  selinux: add security in-core xattr support for pstore and debugfs
  selinux: quiet the filesystem labeling behavior message
  selinux: Remove unused function avc_sidcmp()
  ima: /proc/keys is now mandatory
  Smack: Repair netfilter dependency
  X.509: silence asn1 compiler debug output
  X.509: shut up about included cert for silent build
  KEYS: Make /proc/keys unconditional if CONFIG_KEYS=y
  MAINTAINERS: email update
  tpm/tpm_tis: Add missing ifdef CONFIG_ACPI for pnp_acpi_device
  smack: fix possible use after frees in task_security() callers
  smack: Add missing logging in bidirectional UDS connect check
  Smack: secmark support for netfilter
  Smack: Rework file hooks
  tpm: fix format string error in tpm-chip.c
  char/tpm/tpm_crb: fix build error
  smack: Fix a bidirectional UDS connect check typo
  smack: introduce a special case for tmpfs in smack_d_instantiate()
  ...
2015-02-11 20:25:11 -08:00
Linus Torvalds
59d53737a8 Merge branch 'akpm' (patches from Andrew)
Merge second set of updates from Andrew Morton:
 "More of MM"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (83 commits)
  mm/nommu.c: fix arithmetic overflow in __vm_enough_memory()
  mm/mmap.c: fix arithmetic overflow in __vm_enough_memory()
  vmstat: Reduce time interval to stat update on idle cpu
  mm/page_owner.c: remove unnecessary stack_trace field
  Documentation/filesystems/proc.txt: describe /proc/<pid>/map_files
  mm: incorporate read-only pages into transparent huge pages
  vmstat: do not use deferrable delayed work for vmstat_update
  mm: more aggressive page stealing for UNMOVABLE allocations
  mm: always steal split buddies in fallback allocations
  mm: when stealing freepages, also take pages created by splitting buddy page
  mincore: apply page table walker on do_mincore()
  mm: /proc/pid/clear_refs: avoid split_huge_page()
  mm: pagewalk: fix misbehavior of walk_page_range for vma(VM_PFNMAP)
  mempolicy: apply page table walker on queue_pages_range()
  arch/powerpc/mm/subpage-prot.c: use walk->vma and walk_page_vma()
  memcg: cleanup preparation for page table walk
  numa_maps: remove numa_maps->vma
  numa_maps: fix typo in gather_hugetbl_stats
  pagemap: use walk->vma instead of calling find_vma()
  clear_refs: remove clear_refs_private->vma and introduce clear_refs_test_walk()
  ...
2015-02-11 18:23:28 -08:00
Linus Torvalds
6f83e5bd3e Merge tag 'nfs-for-3.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
 "Highlights incluse:

  Features:
   - Removing the forced serialisation of open()/close() calls in
     NFSv4.x (x>0) makes for a significant performance improvement in
     metadata intensive workloads.
   - Full support for the pNFS "flexible files" layout type
   - Further RPC/RDMA client improvements from Chuck

  Bugfixes:
   - Stable fix: NFSv4.1 backchannel calls blocking operations with !TASK_RUNNING
   - Stable fix: pnfs_generic_pg_init_read/write can be called with lseg == NULL
   - Stable fix: Fix an Oopsable condition when nsm_mon_unmon is called
     as part of the namespace cleanup,
   - Stable fix: Ensure we reference the inode for return-on-close in
     delegreturn
   - Use SO_REUSEPORT to ensure that NFSv3 TCP connections can rebind to
     the same source address/port combination during a disconnect/
     reconnect event.  This is a requirement imposed by most NFSv3
     server duplicate reply cache implementations.

  Optimisations:
   - Ask for no NFSv4.1 delegations on OPEN if using O_DIRECT

  Other:
   - Add Anna Schumaker as co-maintainer for the NFS client"

* tag 'nfs-for-3.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (119 commits)
  SUNRPC: Cleanup to remove xs_tcp_close()
  pnfs: delete an unintended goto
  pnfs/flexfiles: Do not dprintk after the free
  SUNRPC: Fix stupid typo in xs_sock_set_reuseport
  SUNRPC: Define xs_tcp_fin_timeout only if CONFIG_SUNRPC_DEBUG
  SUNRPC: Handle connection reset more efficiently.
  SUNRPC: Remove the redundant XPRT_CONNECTION_CLOSE flag
  SUNRPC: Make xs_tcp_close() do a socket shutdown rather than a sock_release
  SUNRPC: Ensure xs_tcp_shutdown() requests a full close of the connection
  SUNRPC: Cleanup to remove remaining uses of XPRT_CONNECTION_ABORT
  SUNRPC: Remove TCP socket linger code
  SUNRPC: Remove TCP client connection reset hack
  SUNRPC: TCP/UDP always close the old socket before reconnecting
  SUNRPC: Add helpers to prevent socket create from racing
  SUNRPC: Ensure xs_reset_transport() resets the close connection flags
  SUNRPC: Do not clear the source port in xs_reset_transport
  SUNRPC: Handle EADDRINUSE on connect
  SUNRPC: Set SO_REUSEPORT socket option for TCP connections
  NFSv4.1: Fix pnfs_put_lseg races
  NFSv4.1: pnfs_send_layoutreturn should use GFP_NOFS
  ...
2015-02-11 17:14:54 -08:00
Andrea Arcangeli
7e33912849 mm: gup: use get_user_pages_unlocked
This allows those get_user_pages calls to pass FAULT_FLAG_ALLOW_RETRY to
the page fault in order to release the mmap_sem during the I/O.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11 17:06:05 -08:00
Johannes Weiner
650c5e5654 mm: page_counter: pull "-1" handling out of page_counter_memparse()
The unified hierarchy interface for memory cgroups will no longer use "-1"
to mean maximum possible resource value.  In preparation for this, make
the string an argument and let the caller supply it.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-11 17:06:02 -08:00
Tom Herbert
fe881ef11c gue: Use checksum partial with remote checksum offload
Change remote checksum handling to set checksum partial as default
behavior. Added an iflink parameter to configure not using
checksum partial (calling csum_partial to update checksum).

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11 15:12:13 -08:00
Tom Herbert
15e2396d4e net: Infrastructure for CHECKSUM_PARTIAL with remote checsum offload
This patch adds infrastructure so that remote checksum offload can
set CHECKSUM_PARTIAL instead of calling csum_partial and writing
the modfied checksum field.

Add skb_remcsum_adjust_partial function to set an skb for using
CHECKSUM_PARTIAL with remote checksum offload.  Changed
skb_remcsum_process and skb_gro_remcsum_process to take a boolean
argument to indicate if checksum partial can be set or the
checksum needs to be modified using the normal algorithm.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11 15:12:12 -08:00
Tom Herbert
6db93ea13b udp: Set SKB_GSO_UDP_TUNNEL* in UDP GRO path
Properly set GSO types and skb->encapsulation in the UDP tunnel GRO
complete so that packets are properly represented for GSO. This sets
SKB_GSO_UDP_TUNNEL or SKB_GSO_UDP_TUNNEL_CSUM depending on whether
non-zero checksums were received, and sets SKB_GSO_TUNNEL_REMCSUM if
the remote checksum option was processed.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11 15:12:10 -08:00
Tom Herbert
26c4f7da3e net: Fix remcsum in GRO path to not change packet
Remote checksum offload processing is currently the same for both
the GRO and non-GRO path. When the remote checksum offload option
is encountered, the checksum field referred to is modified in
the packet. So in the GRO case, the packet is modified in the
GRO path and then the operation is skipped when the packet goes
through the normal path based on skb->remcsum_offload. There is
a problem in that the packet may be modified in the GRO path, but
then forwarded off host still containing the remote checksum option.
A remote host will again perform RCO but now the checksum verification
will fail since GRO RCO already modified the checksum.

To fix this, we ensure that GRO restores a packet to it's original
state before returning. In this model, when GRO processes a remote
checksum option it still changes the checksum per the algorithm
but on return from lower layer processing the checksum is restored
to its original value.

In this patch we add define gro_remcsum structure which is passed
to skb_gro_remcsum_process to save offset and delta for the checksum
being changed. After lower layer processing, skb_gro_remcsum_cleanup
is called to restore the checksum before returning from GRO.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11 15:12:09 -08:00
Geert Uytterhoeven
13101602c4 openvswitch: Add missing initialization in validate_and_copy_set_tun()
net/openvswitch/flow_netlink.c: In function ‘validate_and_copy_set_tun’:
net/openvswitch/flow_netlink.c:1749: warning: ‘err’ may be used uninitialized in this function

If ipv4_tun_from_nlattr() returns a different positive value than
OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS, err will be uninitialized, and
validate_and_copy_set_tun() may return an undefined value instead of a
zero success indicator. Initialize err to zero to fix this.

Fixes: 1dd144cf5b ("openvswitch: Support VXLAN Group Policy extension")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11 14:40:15 -08:00
Pravin B Shelar
b35725a285 openvswitch: Reset key metadata for packet execution.
Userspace packet execute command pass down flow key for given
packet. But userspace can skip some parameter with zero value.
Therefore kernel needs to initialize key metadata to zero.

Fixes: 0714812134 ("openvswitch: Eliminate memset() from flow_extract.")
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11 14:40:15 -08:00
Sowmini Varadhan
80ad0d4a7a rds: rds_cong_queue_updates needs to defer the congestion update transmission
When the RDS transport is TCP, we cannot inline the call to rds_send_xmit
from rds_cong_queue_update because
(a) we are already holding the sock_lock in the recv path, and
    will deadlock when tcp_setsockopt/tcp_sendmsg try to get the sock
    lock
(b) cong_queue_update does an irqsave on the rds_cong_lock, and this
    will trigger warnings (for a good reason) from functions called
    out of sock_lock.

This patch reverts the change introduced by
2fa57129d ("RDS: Bypass workqueue when queueing cong updates").

The patch has been verified for both RDS/TCP as well as RDS/RDMA
to ensure that there are not regressions for either transport:
 - for verification of  RDS/TCP a client-server unit-test was used,
   with the server blocked in gdb and thus unable to drain its rcvbuf,
   eventually triggering a RDS congestion update.
 - for RDS/RDMA, the standard IB regression tests were used

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11 14:35:44 -08:00
Vlad Yasevich
bf250a1fa7 ipv6: Partial checksum only UDP packets
ip6_append_data is used by other protocols and some of them can't
be partially checksummed.  Only partially checksum UDP protocol.

Fixes: 32dce968dd (ipv6: Allow for partial checksums on non-ufo packets)
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Tested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11 14:32:08 -08:00
David S. Miller
4a3046d68a Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains two small Netfilter updates for your
net-next tree, they are:

1) Add ebtables support to nft_compat, from Arturo Borrero.

2) Fix missing validation of the SET_ID attribute in the lookup
   expressions, from Patrick McHardy.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11 14:27:24 -08:00
Paul Moore
04f81f0154 cipso: don't use IPCB() to locate the CIPSO IP option
Using the IPCB() macro to get the IPv4 options is convenient, but
unfortunately NetLabel often needs to examine the CIPSO option outside
of the scope of the IP layer in the stack.  While historically IPCB()
worked above the IP layer, due to the inclusion of the inet_skb_param
struct at the head of the {tcp,udp}_skb_cb structs, recent commit
971f10ec ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
reordered the tcp_skb_cb struct and invalidated this IPCB() trick.

This patch fixes the problem by creating a new function,
cipso_v4_optptr(), which locates the CIPSO option inside the IP header
without calling IPCB().  Unfortunately, this isn't as fast as a simple
lookup so some additional tweaks were made to limit the use of this
new function.

Cc: <stable@vger.kernel.org> # 3.18
Reported-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
2015-02-11 14:46:37 -05:00
Trond Myklebust
c627d31ba0 SUNRPC: Cleanup to remove xs_tcp_close()
xs_tcp_close() is now just a call to xs_tcp_shutdown(), so remove it,
and replace the entry in xs_tcp_ops.

Suggested-by: Anna Schumaker <anna.schumaker@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-02-10 11:06:04 -05:00
Fan Du
b0f9ca53cb ipv4: Namespecify TCP PMTU mechanism
Packetization Layer Path MTU Discovery works separately beside
Path MTU Discovery at IP level, different net namespace has
various requirements on which one to chose, e.g., a virutalized
container instance would require TCP PMTU to probe an usable
effective mtu for underlying tunnel, while the host would
employ classical ICMP based PMTU to function.

Hence making TCP PMTU mechanism per net namespace to decouple
two functionality. Furthermore the probe base MSS should also
be configured separately for each namespace.

Signed-off-by: Fan Du <fan.du@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-09 18:45:00 -08:00