Commit Graph

616772 Commits

Author SHA1 Message Date
Hariprasad Shenai
7829451c69 cxgb4: Add control net_device for configuring PCIe VF
Issue:
For instance, the current APIs assume a 1-to-1 mapping of Network Ports,
Physical Functions and the SR-IOV Virtual Functions of those Physical
Functions. This is not the case with our cards where any Virtual
Function can be hooked up to any Port -- or any number of Ports the
current Linux APIs also assume only 1 Network Interface/Port can be
accessed per Virtual Function.

Another issue is that these APIs assume that the Administrative Driver
is attached to the Physical Function Associated with a Virtual Function.
This is not the case with our card where all administration is performed
by a Driver which is not attached to any of the Physical Functions which
have SR-IOV PCI Capabilities.

Another consequence of these assumptions is the inability to utilize all
of the cards SR-IOV resources. For instance, our cards have SR-IOV
Capabilities on Physical Functions 0..3 and the administrative Driver
attaches to Physical Function 4. Each of the Physical Functions 0..3 can
support up to 16 Virtual Functions. With the current Linux APIs, a
2-Port card would only be able to use the Virtual Functions on Physical
Function 0..1 and not allow the Virtual Functions on Physical Functions
2..3 to be used since there are no Ports 2..3 on a 2-Port card.

Fix:
Since the control node is always the netdevice for all VF ACL commands.
Created a dummy netdevice for each Physical Function from 0 to 3 through
which one could control their VFs. The device won't be associated with
any port, since it doesn't need to transmit/receive. Its purely used
for VF management purpose only. The device will be registered only when
VF for a particular PF is configured using PCI sysfs interface and
unregistered while pci_disable_sriov() for the PF is called.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-14 21:16:57 -07:00
Florian Westphal
4cf0b354d9 rhashtable: avoid large lock-array allocations
Sander reports following splat after netfilter nat bysrc table got
converted to rhashtable:

swapper/0: page allocation failure: order:3, mode:0x2084020(GFP_ATOMIC|__GFP_COMP)
 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc1 [..]
 [<ffffffff811633ed>] warn_alloc_failed+0xdd/0x140
 [<ffffffff811638b1>] __alloc_pages_nodemask+0x3e1/0xcf0
 [<ffffffff811a72ed>] alloc_pages_current+0x8d/0x110
 [<ffffffff8117cb7f>] kmalloc_order+0x1f/0x70
 [<ffffffff811aec19>] __kmalloc+0x129/0x140
 [<ffffffff8146d561>] bucket_table_alloc+0xc1/0x1d0
 [<ffffffff8146da1d>] rhashtable_insert_rehash+0x5d/0xe0
 [<ffffffff819fcfff>] nf_nat_setup_info+0x2ef/0x400

The failure happens when allocating the spinlock array.
Even with GFP_KERNEL its unlikely for such a large allocation
to succeed.

Thomas Graf pointed me at inet_ehash_locks_alloc(), so in addition
to adding NOWARN for atomic allocations this also makes the bucket-array
sizing more conservative.

In commit 095dc8e0c3 ("tcp: fix/cleanup inet_ehash_locks_alloc()"),
Eric Dumazet says: "Budget 2 cache lines per cpu worth of 'spinlocks'".
IOW, consider size needed by a single spinlock when determining
number of locks per cpu.  So with 64 byte per cacheline and 4 byte per
spinlock this gives 32 locks per cpu.

Resulting size of the lock-array (sizeof(spinlock) == 4):

cpus:    1   2   4   8   16   32   64
old:    1k  1k  4k  8k  16k  16k  16k
new:   128 256 512  1k   2k   4k   8k

8k allocation should have decent chance of success even
with GFP_ATOMIC, and should not fail with GFP_KERNEL.

With 72-byte spinlock (LOCKDEP):
cpus :   1   2
old:    9k 18k
new:   ~2k ~4k

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Suggested-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-14 21:12:57 -07:00
David S. Miller
a878c02017 Merge branch 'proc-per-ns'
Dmitry Torokhov says:

====================
Make /proc per net namespace objects belong to container

Currently [almost] all /proc objects belong to the global root, even if
data belongs to a given namespace within a container and (at least for
sysctls) we work around permssions checks to allow container's root to
access the data.

This series changes ownership of net namespace /proc objects
(/proc/net/self/* and /proc/sys/net/*) to be container's root and not
global root when there exists mapping for container's root in user
namespace.

This helps when running Android CTS in a container, but I think it makes
sense regardless.

Changes from V1:

- added fix for crash when !CONFIG_NET_NS (new patch #1)
- addressed Eric'c comments for error handling style in patch #3 and
  added his Ack
- adjusted patch #2 to use the same style of erro handling
- sent out as series instead of separate patches
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-14 21:09:11 -07:00
Dmitry Torokhov
e79c6a4fc9 net: make net namespace sysctls belong to container's owner
If net namespace is attached to a user namespace let's make container's
root owner of sysctls affecting said network namespace instead of global
root.

This also allows us to clean up net_ctl_permissions() because we do not
need to fudge permissions anymore for the container's owner since it now
owns the objects in question.

Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-14 21:08:58 -07:00
Dmitry Torokhov
c110486f6c proc: make proc entries inherit ownership from parent
There are certain parameters that belong to net namespace and that are
exported in /proc. They should be controllable by the container's owner,
but are currently owned by global root and thus not available.

Let's change proc code to inherit ownership of parent entry, and when
create per-ns "net" proc entry set it up as owned by container's owner.

Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-14 21:07:20 -07:00
Dmitry Torokhov
f8c46cb390 netns: do not call pernet ops for not yet set up init_net namespace
When CONFIG_NET_NS is disabled, registering pernet operations causes
init() to be called immediately with init_net as an argument. Unfortunately
this leads to some pernet ops, such as proc_net_ns_init() to be called too
early, when init_net namespace has not been fully initialized. This causes
issues when we want to change pernet ops to use more data from the net
namespace in question, for example reference user namespace that owns our
network namespace.

To fix this we could either play game of musical chairs and rearrange init
order, or we could do the same as when CONFIG_NET_NS is enabled, and
postpone calling pernet ops->init() until namespace is set up properly.

Note that we can not simply undo commit ed160e839d ("[NET]: Cleanup
pernet operation without CONFIG_NET_NS") and use the same implementations
for __register_pernet_operations() and __unregister_pernet_operations(),
because many pernet ops are marked as __net_initdata and will be discarded,
which wreaks havoc on our ops lists. Here we rely on the fact that we only
use lists until init_net is fully initialized, which happens much earlier
than discarding __net_initdata sections.

Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-14 21:07:20 -07:00
Linus Torvalds
694d0d0bb2 Linux 4.8-rc2 2016-08-14 19:11:36 -07:00
Michael S. Tsirkin
6be3ffaa0e tools/virtio: add dma stubs
Fixes build after recent IOMMU-related changes,
mustly by adding more stubs.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-15 05:05:51 +03:00
Michael S. Tsirkin
446374d7c7 vhost/test: fix after swiotlb changes
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-15 05:05:51 +03:00
Gerard Garcia
21bc54fc0c vhost/vsock: drop space available check for TX vq
Remove unnecessary use of enable/disable callback notifications
and the incorrect more space available check.

The virtio_transport_tx_work handles when the TX virtqueue
has more buffers available.

Signed-off-by: Gerard Garcia <ggarcia@deic.uab.cat>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-15 05:05:21 +03:00
Linus Torvalds
0043ee40f9 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal updates from Zhang Rui:

 - Fix a race condition when updating cooling device, which may lead to
   a situation where a thermal governor never updates the cooling
   device.  From Michele Di Giorgio.

 - Fix a zero division error when disabling the forced idle injection
   from the intel powerclamp.  From Petr Mladek.

 - Add suspend/resume callback for intel_pch_thermal thermal driver.
   From Srinivas Pandruvada.

 - Another two fixes for clocking cooling driver and hwmon sysfs I/F.
   From Michele Di Giorgio and Kuninori Morimoto.

[ Hmm.  That suspend/resume callback for intel_pch_thermal doesn't look
  like a fix, but I'm letting it slide..  - Linus ]

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
  thermal: clock_cooling: Fix missing mutex_init()
  thermal: hwmon: EXPORT_SYMBOL_GPL for thermal hwmon sysfs
  thermal: fix race condition when updating cooling device
  thermal/powerclamp: Prevent division by zero when counting interval
  thermal: intel_pch_thermal: Add suspend/resume callback
2016-08-14 19:01:31 -07:00
Michael S. Tsirkin
52012619e5 ringtest: test build fix
Recent changes to ptr_ring broke the ringtest
which lacks a likely() stub. Fix it up.

Fixes: 982fb490c2
	("ptr_ring: support zero length ring")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-15 05:01:23 +03:00
Linus Torvalds
4ef870e373 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
Pull m68knommu fix from Greg Ungerer:
 "This contains only a single fix for a register corruption problem on
  certain types of m68k flat format binaries"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
  m68knommu: fix user a5 register being overwritten
2016-08-14 18:54:37 -07:00
Linus Torvalds
118253a593 h8300 and unicore32 architecture fixes
Two patches to fix h8300 and unicore32 builds.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXr0NpAAoJEMsfJm/On5mBgqoP/0Lw0h8Rywg/qQyC48i3moSQ
 RhHQc33dXELOZDcymahSrj69loUrsrFzZEWT8LJsHpUElYfDyiYc+FC3BEffySG3
 alst81N9D1hQb7uP6Ce8qw0V9wdnTlnbxU72DcAoPLTBTvj/uWE9IOrQlSwERdkp
 h6+K260PUiPj0+rjJrRAfHOwplHGYxaq1Ze8AYCKhgOThKMxeTYCiX4wUlb2pNrd
 0sr6SfCRREnSD+7jnaezD3PK1INYK/LAywyb4+1O2iaDuac3N6qN3c3uJYVpzSRi
 tAUawo2jlBxQYvwDOPwjNG3v7TKz8hXwjFN7X9Nyi9YZGSPjEO6g71FFi4uF1g6Z
 kyS7p+jUexjhPMmSkIVENbH3U72y6HhiPA+gygaVKIwnFWdow9gIiT9qjl51SgRS
 Fx8+wNNv8A2Jhhc/u57E8zr0IDyHEqcdlAOaD5bM2KE5oK33Ggg+BxeM5VdNb+T1
 dXjUuT+8Hpazo3VMCpI6l0v5tFf6IjUXAZoWlguoFBaA2W2aui1IB8QuDqvK7Afa
 TeydLucwu+Shj2Q98Fzu/e12m193A8F6KnnleNmkaH3M0GFl4gzhGAiyWGHxp3/x
 gcE725VsSlqawqmAVYDIpDIEV5hJVCVxRyIgh3wunwGxTQWS524O1HC38UwIJCfn
 gf1NHW3oCd6ZuX3qBVsS
 =PaA7
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull h8300 and unicore32 architecture fixes from Guenter Roeck:
 "Two patches to fix h8300 and unicore32 builds.

  unicore32 builds have been broken since v4.6.  The fix has been
  available in -next since March of this year.

  h8300 builds have been broken since the last commit window.  The fix
  has been available in -next since June of this year"

* tag 'fixes-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  h8300: Add missing include file to asm/io.h
  unicore32: mm: Add missing parameter to arch_vma_access_permitted
2016-08-13 19:39:38 -07:00
Linus Torvalds
120c54751b arm64 fixes:
- Support for nr_cpus= command line argument (maxcpus was previously
   changed to allow secondary CPUs to be hot-plugged)
 
 - ARM PMU interrupt handling fix
 
 - Fix potential TLB conflict in the hibernate code
 
 - Improved handling of EL1 instruction aborts (better error reporting)
 
 - Removal of useless jprobes code for stack saving/restoring
 
 - defconfig updates
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXr5vuAAoJEGvWsS0AyF7x2SMQAID5viyDMPAIA25v08SKWFSc
 aZnuLZfVFr60mw+NkFp2FycFJyrUiL8LByPjEpTKhsMx4BmkrJoOlUBs4lD9GZu+
 43N70tgSOFkKBwod0x1H7yyMURmVlMuy7/k3rJMRe71kFDL7qbH86bgxvPvx4nuX
 YdfhNdLIAQsyk3ngV4ym69tzfVWaFY2xLRoXh3rSKkDHPODsoEJk9+u72dnbrkoG
 gqr/ul0ChqOv7IPLRLCdyGLmVoyPqAv5P9VYDU/lOEXZ/qc+RgOxs7KHbS6nLQqQ
 +OkqEH0xFiJ80rtCuW3YBjUY6z8Gap3tHhZjI1waET/m7TyvqxesGCmp/40/EhB6
 XfqXNXhFM2Yjmdze5MfY4qwNpS0ivovstMTsFG+AtnDV1rODVEXgXK2mpO3u6l2r
 MJ6uYL15Q0KmXdtSd+VZyQGfiBKQ854eRBkA9ueQRpVQeU9Fwe1koQilk2RmVa1p
 ezHEZ+jPOUKNr+89ZJKm2xUou1t3KUDljLQt9rja6zbnsro/YUPloEk6CJLeeMRj
 EFovXhxsD0j8eOktzHVXYlT631Rzzqz9Cx16jpJd5NlNqH+xUntmXMHeAkihbfD1
 lCeihNY30gPkl4EGnu73wsNQqsZyOKNuwhQtqPpDsPtkUmW+uW3cb6XWKM/p+z8B
 raa2UN6bmIjBw3LiDN4V
 =XOhe
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - support for nr_cpus= command line argument (maxcpus was previously
   changed to allow secondary CPUs to be hot-plugged)

 - ARM PMU interrupt handling fix

 - fix potential TLB conflict in the hibernate code

 - improved handling of EL1 instruction aborts (better error reporting)

 - removal of useless jprobes code for stack saving/restoring

 - defconfig updates

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: defconfig: enable CONFIG_LOCALVERSION_AUTO
  arm64: defconfig: add options for virtualization and containers
  arm64: hibernate: handle allocation failures
  arm64: hibernate: avoid potential TLB conflict
  arm64: Handle el1 synchronous instruction aborts cleanly
  arm64: Remove stack duplicating code from jprobes
  drivers/perf: arm-pmu: Fix handling of SPI lacking "interrupt-affinity" property
  drivers/perf: arm-pmu: convert arm_pmu_mutex to spinlock
  arm64: Support hard limit of cpu count by nr_cpus
2016-08-13 19:29:46 -07:00
Colin Ian King
d16d9d2ad7 net: phy: initialize rc to zero to avoid returning garbage value
In the case where phydev->interrupts is not PHY_INTERRUPT_ENABLED
function vsc85xx_ack_interrupt is returning an uninitialized
garbage value.  Fix this by initializing rc to zero.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 15:51:49 -07:00
Sabrina Dubroca
952fcfd08c net: remove type_check from dev_get_nest_level()
The idea for type_check in dev_get_nest_level() was to count the number
of nested devices of the same type (currently, only macvlan or vlan
devices).
This prevented the false positive lockdep warning on configurations such
as:

eth0 <--- macvlan0 <--- vlan0 <--- macvlan1

However, this doesn't prevent a warning on a configuration such as:

eth0 <--- macvlan0 <--- vlan0
eth1 <--- vlan1 <--- macvlan1

In this case, all the locks end up with a nesting subclass of 1, so
lockdep thinks that there is still a deadlock:

- in the first case we have (macvlan_netdev_addr_lock_key, 1) and then
  take (vlan_netdev_xmit_lock_key, 1)
- in the second case, we have (vlan_netdev_xmit_lock_key, 1) and then
  take (macvlan_netdev_addr_lock_key, 1)

By removing the linktype check in dev_get_nest_level() and always
incrementing the nesting depth, lockdep considers this configuration
valid.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 15:15:54 -07:00
Sabrina Dubroca
e200387245 macsec: fix lockdep splats when nesting devices
Currently, trying to setup a vlan over a macsec device, or other
combinations of devices, triggers a lockdep warning.

Use netdev_lockdep_set_classes and ndo_get_lock_subclass, similar to
what macvlan does.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 15:15:54 -07:00
LABBE Corentin
1e10f3fbfb net: bfin_mac: Fix a few spelling fixes
This patch respell some word badly spelled.
- Invidate instead of Invalidate
- proble instead of probe

Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 15:14:56 -07:00
Mike Manning
bc561632dd net: ipv6: Do not keep IPv6 addresses when IPv6 is disabled
If IPv6 is disabled when the option is set to keep IPv6
addresses on link down, userspace is unaware of this as
there is no such indication via netlink. The solution is to
remove the IPv6 addresses in this case, which results in
netlink messages indicating removal of addresses in the
usual manner. This fix also makes the behavior consistent
with the case of having IPv6 disabled first, which stops
IPv6 addresses from being added.

Fixes: f1705ec197 ("net: ipv6: Make address flushing on ifdown optional")
Signed-off-by: Mike Manning <mmanning@brocade.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 15:14:00 -07:00
David S. Miller
5f3b9d3654 Not much for -next so far, but here it goes:
* send more nl80211 events for interfaces
  * remove useless network/transport offset mangling code
  * validate beacon intervals identically for all interface types
  * use driver rate estimates for mesh
  * fix a compiler type/signedness warning
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCgAGBQJXrZSMAAoJEGt7eEactAAd3kAP/1zUHvAZ3qDaz9zz08cIL35R
 CXXDtaFip6k4q78ZU1OLKBbR6VXlxZ8mplQ15xtU5Cx7KIzJMG1pWMDpmuvOSttB
 4pyg+43CDYSWP9gdxS8jYZVx+KL8+SzrIs4ygvfZv8B5ATSLcTbOGUBlxSylY9FP
 CK35I9bL68girl6INb12NtkUwDiA6iC/n9i/62ao/2ywOXEljkwx+JNnk3aiT7Uj
 PBRPHY1lK9AynpNyJjpB9Mwip0HNnbS0Ay8WLpPsqQ8NlWH60tEcpEf4L0pBB8cU
 l5vs9y6TX/9V9OiVPXHgHjPJ2B3XzCY6DbqE1X5jlzFO1hc9o6y1vh8lqE4PXY/J
 brN6nwXObPnx3PDiiqxU33oiXvoQvo8JjBllA3isD2tbseSNZBLewbtdpLOdXtsS
 eaY8WyPaLvEfZjI8KJaQn2GvzPzXWLKHdxpYZtxavK0ss7OYioexS13qYxu+i2Tu
 8QHOqymI0rfEuyTqOKWxpbUmu4xRyH3Gn6WxOFa/tY4BSKNLfUH7EHvnbh+aX1Hp
 IGDTc3erD43agPcn2wYpQtFJ6k9oRIyIakKicP66l4kQrIItfUm6jPIHhYA2D9F4
 SrToRZRoVFFOYQEzPoQ0ZEXBYXXa1x8DxoghX50hzX89cqRlFT0jvZHJxlSCLg+h
 OzfMU4jtWCwRSK0N2EZ5
 =C5wa
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-davem-2016-08-12' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
Not much for -next so far, but here it goes:
 * send more nl80211 events for interfaces
 * remove useless network/transport offset mangling code
 * validate beacon intervals identically for all interface types
 * use driver rate estimates for mesh
 * fix a compiler type/signedness warning
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 15:11:05 -07:00
Vegard Nossum
54236ab09e net/sctp: always initialise sctp_ht_iter::start_fail
sctp_transport_seq_start() does not currently clear iter->start_fail on
success, but relies on it being zero when it is allocated (by
seq_open_net()).

This can be a problem in the following sequence:

    open() // allocates iter (and implicitly sets iter->start_fail = 0)
    read()
     - iter->start() // fails and sets iter->start_fail = 1
     - iter->stop() // doesn't call sctp_transport_walk_stop() (correct)
    read() again
     - iter->start() // succeeds, but doesn't change iter->start_fail
     - iter->stop() // doesn't call sctp_transport_walk_stop() (wrong)

We should initialize sctp_ht_iter::start_fail to zero if ->start()
succeeds, otherwise it's possible that we leave an old value of 1 there,
which will cause ->stop() to not call sctp_transport_walk_stop(), which
causes all sorts of problems like not calling rcu_read_unlock() (and
preempt_enable()), eventually leading to more warnings like this:

    BUG: sleeping function called from invalid context at mm/slab.h:388
    in_atomic(): 0, irqs_disabled(): 0, pid: 16551, name: trinity-c2
    Preemption disabled at:[<ffffffff819bceb6>] rhashtable_walk_start+0x46/0x150

     [<ffffffff81149abb>] preempt_count_add+0x1fb/0x280
     [<ffffffff83295892>] _raw_spin_lock+0x12/0x40
     [<ffffffff819bceb6>] rhashtable_walk_start+0x46/0x150
     [<ffffffff82ec665f>] sctp_transport_walk_start+0x2f/0x60
     [<ffffffff82edda1d>] sctp_transport_seq_start+0x4d/0x150
     [<ffffffff81439e50>] traverse+0x170/0x850
     [<ffffffff8143aeec>] seq_read+0x7cc/0x1180
     [<ffffffff814f996c>] proc_reg_read+0xbc/0x180
     [<ffffffff813d0384>] do_loop_readv_writev+0x134/0x210
     [<ffffffff813d2a95>] do_readv_writev+0x565/0x660
     [<ffffffff813d6857>] vfs_readv+0x67/0xa0
     [<ffffffff813d6c16>] do_preadv+0x126/0x170
     [<ffffffff813d710c>] SyS_preadv+0xc/0x10
     [<ffffffff8100334c>] do_syscall_64+0x19c/0x410
     [<ffffffff83296225>] return_from_SYSCALL_64+0x0/0x6a
     [<ffffffffffffffff>] 0xffffffffffffffff

Notice that this is a subtly different stacktrace from the one in commit
5fc382d875 ("net/sctp: terminate rhashtable walk correctly").

Cc: Xin Long <lucien.xin@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-By: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 15:10:16 -07:00
Vegard Nossum
5ba092efc7 net/irda: handle iriap_register_lsap() allocation failure
If iriap_register_lsap() fails to allocate memory, self->lsap is
set to NULL. However, none of the callers handle the failure and
irlmp_connect_request() will happily dereference it:

    iriap_register_lsap: Unable to allocated LSAP!
    ================================================================================
    UBSAN: Undefined behaviour in net/irda/irlmp.c:378:2
    member access within null pointer of type 'struct lsap_cb'
    CPU: 1 PID: 15403 Comm: trinity-c0 Not tainted 4.8.0-rc1+ #81
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org
    04/01/2014
     0000000000000000 ffff88010c7e78a8 ffffffff82344f40 0000000041b58ab3
     ffffffff84f98000 ffffffff82344e94 ffff88010c7e78d0 ffff88010c7e7880
     ffff88010630ad00 ffffffff84a5fae0 ffffffff84d3f5c0 000000000000017a
    Call Trace:
     [<ffffffff82344f40>] dump_stack+0xac/0xfc
     [<ffffffff8242f5a8>] ubsan_epilogue+0xd/0x8a
     [<ffffffff824302bf>] __ubsan_handle_type_mismatch+0x157/0x411
     [<ffffffff83b7bdbc>] irlmp_connect_request+0x7ac/0x970
     [<ffffffff83b77cc0>] iriap_connect_request+0xa0/0x160
     [<ffffffff83b77f48>] state_s_disconnect+0x88/0xd0
     [<ffffffff83b78904>] iriap_do_client_event+0x94/0x120
     [<ffffffff83b77710>] iriap_getvaluebyclass_request+0x3e0/0x6d0
     [<ffffffff83ba6ebb>] irda_find_lsap_sel+0x1eb/0x630
     [<ffffffff83ba90c8>] irda_connect+0x828/0x12d0
     [<ffffffff833c0dfb>] SYSC_connect+0x22b/0x340
     [<ffffffff833c7e09>] SyS_connect+0x9/0x10
     [<ffffffff81007bd3>] do_syscall_64+0x1b3/0x4b0
     [<ffffffff845f946a>] entry_SYSCALL64_slow_path+0x25/0x25
    ================================================================================

The bug seems to have been around since forever.

There's more problems with missing error checks in iriap_init() (and
indeed all of irda_init()), but that's a bigger problem that needs
very careful review and testing. This patch will fix the most serious
bug (as it's easily reached from unprivileged userspace).

I have tested my patch with a reproducer.

Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 15:09:07 -07:00
Johannes Berg
c15c0ab12f ipv6: suppress sparse warnings in IP6_ECN_set_ce()
Pass the correct type __wsum to csum_sub() and csum_add(). This doesn't
really change anything since __wsum really *is* __be32, but removes the
address space warnings from sparse.

Cc: Eric Dumazet <edumazet@google.com>
Fixes: 34ae6a1aa0 ("ipv6: update skb->csum when CE mark is propagated")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 15:08:00 -07:00
Daniel Borkmann
0ed661d5a4 bpf: fix write helpers with regards to non-linear parts
Fix the bpf_try_make_writable() helper and all call sites we have in BPF,
it's currently defect with regards to skbs when the write_len spans into
non-linear parts, no matter if cloned or not.

There are multiple issues at once. First, using skb_store_bits() is not
correct since even if we have a cloned skb, page frags can still be shared.
To really make them private, we need to pull them in via __pskb_pull_tail()
first, which also gets us a private head via pskb_expand_head() implicitly.

This is for helpers like bpf_skb_store_bytes(), bpf_l3_csum_replace(),
bpf_l4_csum_replace(). Really, the only thing reasonable and working here
is to call skb_ensure_writable() before any write operation. Meaning, via
pskb_may_pull() it makes sure that parts we want to access are pulled in and
if not does so plus unclones the skb implicitly. If our write_len still fits
the headlen and we're cloned and our header of the clone is not writable,
then we need to make a private copy via pskb_expand_head(). skb_store_bits()
is a bit misleading and only safe to store into non-linear data in different
contexts such as 357b40a18b ("[IPV6]: IPV6_CHECKSUM socket option can
corrupt kernel memory").

For above BPF helper functions, it means after fixed bpf_try_make_writable(),
we've pulled in enough, so that we operate always based on skb->data. Thus,
the call to skb_header_pointer() and skb_store_bits() becomes superfluous.
In bpf_skb_store_bytes(), the len check is unnecessary too since it can
only pass in maximum of BPF stack size, so adding offset is guaranteed to
never overflow. Also bpf_l3/4_csum_replace() helpers must test for proper
offset alignment since they use __sum16 pointer for writing resulting csum.

The remaining helpers that change skb data not discussed here yet are
bpf_skb_vlan_push(), bpf_skb_vlan_pop() and bpf_skb_change_proto(). The
vlan helpers internally call either skb_ensure_writable() (pop case) and
skb_cow_head() (push case, for head expansion), respectively. Similarly,
bpf_skb_proto_xlat() takes care to not mangle page frags.

Fixes: 608cd71a9c ("tc: bpf: generalize pedit action")
Fixes: 91bc4822c3 ("tc: bpf: add checksum helpers")
Fixes: 3697649ff2 ("bpf: try harder on clones when writing into skb")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 15:01:02 -07:00
sean.wang@mediatek.com
e8c2993a4c net: ethernet: mediatek: add the missing of_node_put() after node is used done
This patch adds the missing of_node_put() after finishing the usage
of of_parse_phandle() or of_node_get() used by fixed_phy.

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:58:38 -07:00
sean.wang@mediatek.com
d7005652cd net: ethernet: mediatek: fixed that initializing u64_stats_sync is missing
To fix runtime warning with lockdep is enabled due that u64_stats_sync
is not initialized well, so add it.

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:58:38 -07:00
Colin Ian King
b4c0e0c61f calipso: fix resource leak on calipso_genopt failure
Currently, if calipso_genopt fails then the error exit path
does not free the ipv6_opt_hdr new causing a memory leak. Fix
this by kfree'ing new on the error exit path.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:56:17 -07:00
David S. Miller
70567833ad Merge branch 'net-urb-alloc-failure'
Wolfram Sang says:

====================
net: don't print error when allocating urb fails

This per-subsystem series is part of a tree wide cleanup. usb_alloc_urb() uses
kmalloc which already prints enough information on failure. So, let's simply
remove those "allocation failed" messages from drivers like we did already for
other -ENOMEM cases. gkh acked this approach when we talked about it at LCJ in
Tokyo a few weeks ago.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:53:41 -07:00
Wolfram Sang
eb36333896 net: wireless: realtek: rtlwifi: usb: don't print error when allocating urb fails
kmalloc will print enough information in case of failure.

Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:53:41 -07:00
Wolfram Sang
dbea99d6d9 net: wireless: marvell: mwifiex: usb: don't print error when allocating urb fails
kmalloc will print enough information in case of failure.

Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:53:41 -07:00
Wolfram Sang
da8794ce8e net: wireless: marvell: libertas_tf: if_usb: don't print error when allocating urb fails
kmalloc will print enough information in case of failure.

Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:53:41 -07:00
Wolfram Sang
3de6e8852f net: wireless: intersil: orinoco: orinoco_usb: don't print error when allocating urb fails
kmalloc will print enough information in case of failure.

Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:53:41 -07:00
Wolfram Sang
938f89e50a net: wireless: broadcom: brcm80211: brcmfmac: usb: don't print error when allocating urb fails
kmalloc will print enough information in case of failure.

Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:53:40 -07:00
Wolfram Sang
71c4c616ec net: wireless: ath: ar5523: ar5523: don't print error when allocating urb fails
kmalloc will print enough information in case of failure.

Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:53:40 -07:00
Wolfram Sang
aa38b885e5 net: wimax: i2400m: usb-notif: don't print error when allocating urb fails
kmalloc will print enough information in case of failure.

Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:53:40 -07:00
Wolfram Sang
ef6cd1301e net: usb: usbnet: don't print error when allocating urb fails
kmalloc will print enough information in case of failure.

Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:53:40 -07:00
Wolfram Sang
d7c4e84e34 net: usb: lan78xx: don't print error when allocating urb fails
kmalloc will print enough information in case of failure.

Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Acked-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:53:40 -07:00
Wolfram Sang
12800ea95a net: usb: hso: don't print error when allocating urb fails
kmalloc will print enough information in case of failure.

Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:53:39 -07:00
Wolfram Sang
7fe7cfa43a net: can: usb: usb_8dev: don't print error when allocating urb fails
kmalloc will print enough information in case of failure.

Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:53:39 -07:00
Wolfram Sang
2479204b25 net: can: usb: peak_usb: pcan_usb_core: don't print error when allocating urb fails
kmalloc will print enough information in case of failure.

Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:53:39 -07:00
Wolfram Sang
bc1af47df2 net: can: usb: kvaser_usb: don't print error when allocating urb fails
kmalloc will print enough information in case of failure.

Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:53:39 -07:00
Wolfram Sang
2a4a1e4013 net: can: usb: gs_usb: don't print error when allocating urb fails
kmalloc will print enough information in case of failure.

Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:53:39 -07:00
Wolfram Sang
912f85e104 net: can: usb: esd_usb2: don't print error when allocating urb fails
kmalloc will print enough information in case of failure.

Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:53:39 -07:00
Wolfram Sang
87ced2f958 net: can: usb: ems_usb: don't print error when allocating urb fails
kmalloc will print enough information in case of failure.

Signed-off-by: Wolfram Sang <wsa-dev@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:53:38 -07:00
Jiri Kosina
6176e89c57 net: fix up a few missing hashtable.h conflict resolutions
There are a couple of leftover symbol conflicts caused by hashtable.h
being included by netdevice.h; those were not caught as build failure
(they're "only" a warning, but in fact real bugs). Fix those up.

Fixes: e87a8f24c ("net: resolve symbol conflicts with generic hashtable.h")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 14:51:02 -07:00
David S. Miller
a31eb63a0a Merge branch 'thunderx-next'
Sunil Goutham says:

====================
net: thunderx: Support for newer chips and miscellaneous patches

This patch series adds support for VNIC on 81xx and 83xx SOCs.
81xx/83xx is different from 88xx in terms of capabilities and new type
of interfaces supported (eg: QSGMII, RGMII) and have DLMs instead of
QLMs which allows single BGX to have interfaces of different LMAC types.

Also included some patches which are common for all 88xx/81xx/83xx
SOCs like using netdev's name while registering irqs, reset receive
queue stats and some changes to use standard API for split buffer Rx
packets, generating RSS key e.t.c

PS: Most of the patches were submitted earlier under different series but
for some reason were not picked up by patchwork. Since new patches have been
added in the meantime, resubmitting all as a new patchset.

Changes from v1:
- Incorporated Yuval Mintz's suggestion to use generic API to set minimum
  queue count i.e by using netif_get_num_default_rss_queues().
- Resolved a compilation issue reported by test robot while compiling
  patch 'Add support for 16 LMACs of 83xx'
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 12:00:49 -07:00
Sunil Goutham
93db2cf8ca net: thunderx: Don't set RX_PACKET_DIS while initializing
Setting BGXX_SPUX_MISC_CONTROL::RX_PACKET_DIS is not needed as
packet reception is anyway disabled by BGXX_CMRX_CONFIG::DATA_PKT_RX_EN.
Also setting RX_PACKET_DIS causes a bogus remote fault condition
which delays link detection.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 11:59:33 -07:00
Sunil Goutham
0052c92f8f net: thunderx: Use netdev_rss_key_fill() helper
Use standard API to generate a random RSS hash key
on every boot.

Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 11:59:33 -07:00
Zyta Szpak
e22e86ea98 net: thunderx: Configure tunnelling protocol parsing
This patch enables parsing of inner layers for tunnelled packets.

Signed-off-by: Zyta Szpak <zr@semihalf.com>
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13 11:59:33 -07:00