Commit Graph

495734 Commits

Author SHA1 Message Date
Palik, Imre
07ff890dae xen-netback: fixing the propagation of the transmit shaper timeout
Since e9ce7cb6b1 ("xen-netback: Factor queue-specific data into queue struct"),
the transimt shaper timeout is always set to 0.  The value the user sets via
xenbus is never propagated to the transmit shaper.

This patch fixes the issue.

Cc: Anthony Liguori <aliguori@amazon.com>
Signed-off-by: Imre Palik <imrep@amazon.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-06 14:17:37 -05:00
Shrikrishna Khare
53831aa125 Driver: Vmxnet3: Make Rx ring 2 size configurable
Rx ring 2 size can be configured by adjusting rx-jumbo parameter
of ethtool -G.

Signed-off-by: Ramya Bolla <bollar@vmware.com>
Signed-off-by: Shreyas Bhatewara <sbhatewara@vmware.com>
Signed-off-by: Shrikrishna Khare <skhare@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-06 14:15:16 -05:00
David S. Miller
15ecf7a063 Here's just a single fix - a revert of a patch that broke the
p54 and cw2100 drivers (arguably due to bad assumptions there.)
 Since this affects kernels since 3.17, I decided to revert for
 now and we'll revisit this optimisation properly for -next.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJUq9EvAAoJEDBSmw7B7bqr7mMQAJRbdepVVqK5IFH1BH0NC7vi
 LkBE2lp8ZAz1Crg+OAQNUdlZUHtGyfYoXSfzezmrMG51i5xjHyOYQQikW7aJ2SQ0
 XsjJJ5TcqKe83NwoakXUrMpE7KmCt/LnbjKNXDsZIvLlUkqa7ksXaS7btK195aXy
 WlVrmUE+BqT9a16VjFLZ6wRjI43+3bGxhtFL+g1eXw6nZ4a2o4EbIXdc9SN+/bT4
 tAhWJfdAQqQc34jhesWGbMIvkXWhzy2R6Js+9gMIBNsmlAiYbFa4QZ/9tI3nBI/O
 yHSiDc7JnPNjkkC+3wTJxMl7mEd6fEKnAS1ryZ5L4XhPrQpV39iZuWSPvPGw6LLW
 kB6+wXkIyQdCSoyrQZxY75ibqOUKYYxhhkSYfMePXRKTYY6MlHYqiH8wPWFpPoqO
 iumLqx8/CtRW1q1t2EBAG6rZLRF8HqmfqtB+ptT0DWcAP8E81q8BImPoPFr+P9S2
 XfuuSw97xKCcilOcYJ0uYSBe4XNNhy1dtC/zJ8cA9nV4WNkofALga5Z/t8ARhDsM
 wvP1D2uIX3U9My17bXq+Xn/fSSS7yhpLZjEHj/JNRvpDCWGf/tQl6A3ydMy//Oqe
 lRSKfmiAGysqhXnmK12+YhfO+4ioTz8dA88tHs1AO8qasfQwx45eRsUPemWeExiL
 9Lntb0U6MhYvgiTdWqt6
 =CnbJ
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-davem-2015-01-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Here's just a single fix - a revert of a patch that broke the
p54 and cw2100 drivers (arguably due to bad assumptions there.)
Since this affects kernels since 3.17, I decided to revert for
now and we'll revisit this optimisation properly for -next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-06 13:29:27 -05:00
hayeswang
a5e31255e0 r8152: support ndo_features_check
Support ndo_features_check to avoid:
 - the transport offset is more than the hw limitation when using hw checksum.
 - the skb->len of a GSO packet is more than the limitation.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-06 13:29:13 -05:00
Richard Cochran
7c8f1e7861 arm_arch_timer: include clocksource.h directly
This driver makes use of the clocksource code. Previously it had only
included the proper header indirectly, but that chain was inadvertently
broken by 74d23cc "time: move the timecounter/cyclecounter code into its
own file."

This patch fixes the issue by including clocksource.h directly.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-06 13:18:50 -05:00
Hariprasad Shenai
678109ea09 cxgb4: Add PCI device ID for new T5 adapter
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-06 13:15:15 -05:00
Oded Gabbay
76baee6c73 drm/amdkfd: rewrite kfd_ioctl() according to drm_ioctl()
This patch changes kfd_ioctl() to be very similar to drm_ioctl().

The patch defines an array of amdkfd_ioctls, which maps IOCTL definition to the
ioctl function.

The kfd_ioctl() uses that mapping to call the appropriate ioctl function,
through a function pointer.

This patch also declares a new typedef for the ioctl function pointer.

v2: Renamed KFD_COMMAND_(START|END) to AMDKFD_...

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-01-06 19:44:36 +02:00
Oded Gabbay
b81c55db10 drm/amdkfd: reformat IOCTL definitions to drm-style
This patch reformats the ioctl definitions in kfd_ioctl.h to be similar to the
drm ioctls definition style.

v2: Renamed KFD_COMMAND_(START|END) to AMDKFD_...

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-01-06 19:44:36 +02:00
Oded Gabbay
524a640444 drm/amdkfd: Do copy_to/from_user in general kfd_ioctl()
This patch moves the copy_to_user() and copy_from_user() calls from the
different ioctl functions in amdkfd to the general kfd_ioctl() function, as
this is a common code for all ioctls.

This was done according to example taken from drm_ioctl.c

Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-01-06 19:44:26 +02:00
Amitkumar Karwar
1c6098eb2b bluetooth: btmrvl: increase the priority of firmware download message
When driver is loaded, it is important to know if FW was already
active or it is freshly downloaded. This patch increases the
priority of the message.

Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-01-06 16:19:36 +01:00
Amitkumar Karwar
7b4b8740c6 Bluetooth: btmrvl: add surprise_removed flag
This flag will be set in unload path to make sure that we skip
sending further commands, ignore interrupts and stop main thread
when unload starts.

Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-01-06 16:19:35 +01:00
Amitkumar Karwar
9b89fdfee4 Bluetooth: btmrvl: error path handling in setup handler
If module init command fails, FW might not be in good state.
We will return from setup handler and skip downloading further
commands.

Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-01-06 16:19:35 +01:00
Amitkumar Karwar
8b324fa691 Bluetooth: btmrvl: fix race issue while stopping main thread
btmrvl_remove_card() calls kthread_stop() to stop the main thread,
but kthread_should_stop() is checked when all the activities are done
in the main thread before sleeping.
We will have kthread_should_stop() check as soon as main thread is
woken up. This fixes a crash issue caused due to an invalid memory
access while unnecessarily processing interrupts after card removal.

Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-01-06 16:19:35 +01:00
Michael S. Tsirkin
a1eb03f546 virtio_pci: document why we defer kfree
The reason we defer kfree until release function is because it's a
general rule for kobjects: kfree of the reference counter itself is only
legal in the release function.

Previous patch didn't make this clear, document this in code.

Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-01-06 16:35:36 +02:00
Sasha Levin
63bd62a08c virtio_pci: defer kfree until release callback
A struct device which has just been unregistered can live on past the
point at which a driver decides to drop it's initial reference to the
kobject gained on allocation.

This implies that when releasing a virtio device, we can't free a struct
virtio_device until the underlying struct device has been released,
which might not happen immediately on device_unregister().

Unfortunately, this is exactly what virtio pci does:
it has an empty release callback, and frees memory immediately
after unregistering the device.

This causes an easy to reproduce crash if CONFIG_DEBUG_KOBJECT_RELEASE
it enabled.

To fix, free the memory only once we know the device is gone in the release
callback.

Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-01-06 16:35:36 +02:00
Michael S. Tsirkin
945399a8c7 virtio_pci: device-specific release callback
It turns out we need to add device-specific code
in release callback. Move it to virtio_pci_legacy.c.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-01-06 16:35:36 +02:00
Michael S. Tsirkin
80e9541f79 virtio: make del_vqs idempotent
Our code calls del_vqs multiple times, assuming
it's idempotent.

commit 3ec7a77bb3
    virtio_pci: free up vq->priv
broke this assumption, by adding kfree there,
so multiple calls cause double free.

Fix it up.

Fixes: 3ec7a77bb3
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-01-06 16:35:35 +02:00
Arik Nemtsov
fa44b988d2 mac80211: skip disabled channels in VHT check
The patch "40a11ca mac80211: check if channels allow 80 MHz for VHT
probe requests" considered disabled channels as VHT enabled, and
mistakenly sent out probe-requests with the VHT IE.

Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-06 14:59:07 +01:00
Johannes Berg
71b836eca7 nl80211: define multicast group names in header
Put the group names into the userspace API header file so that
userspace clients can use symbolic names from there instead of
hardcoding the actual names. This doesn't really change much,
but seems somewhat cleaner.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-06 12:10:26 +01:00
Gautam Kumar Shukla
d75bb06b61 cfg80211: add extensible feature flag attribute
With the wiphy::features flag being used up this patch adds a
new field wiphy::ext_features. Considering extensibility this
new field is declared as a byte array. This extensible flag is
exposed to user-space by NL80211_ATTR_EXT_FEATURES.

Cc: Avinash Patil <patila@marvell.com>
Signed-off-by: Gautam (Gautam Kumar) Shukla <gautams@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-01-06 12:10:24 +01:00
Linus Lüssing
9d31b3ce81 batman-adv: fix potential TT client + orig-node memory leak
This patch fixes a potential memory leak which can occur once an
originator times out. On timeout the according global translation table
entry might not get purged correctly. Furthermore, the non purged TT
entry will cause its orig-node to leak, too. Which additionally can lead
to the new multicast optimization feature not kicking in because of a
therefore bogus counter.

In detail: The batadv_tt_global_entry->orig_list holds the reference to
the orig-node. Usually this reference is released after
BATADV_PURGE_TIMEOUT through: _batadv_purge_orig()->
batadv_purge_orig_node()->batadv_update_route()->_batadv_update_route()->
batadv_tt_global_del_orig() which purges this global tt entry and
releases the reference to the orig-node.

However, if between two batadv_purge_orig_node() calls the orig-node
timeout grew to 2*BATADV_PURGE_TIMEOUT then this call path isn't
reached. Instead the according orig-node is removed from the
originator hash in _batadv_purge_orig(), the batadv_update_route()
part is skipped and won't be reached anymore.

Fixing the issue by moving batadv_tt_global_del_orig() out of the rcu
callback.

Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Acked-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-01-06 11:07:01 +01:00
Linus Lüssing
a5164886b0 batman-adv: fix multicast counter when purging originators
When purging an orig_node we should only decrease counter tracking the
number of nodes without multicast optimizations support if it was
increased through this orig_node before.

A not yet quite initialized orig_node (meaning it did not have its turn
in the mcast-tvlv handler so far) which gets purged would not adhere to
this and will lead to a counter imbalance.

Fixing this by adding a check whether the orig_node is mcast-initalized
before decreasing the counter in the mcast-orig_node-purging routine.

Introduced by 60432d756c
("batman-adv: Announce new capability via multicast TVLV")

Reported-by: Tobias Hachmer <tobias@hachmer.de>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-01-06 11:06:04 +01:00
Linus Lüssing
e8829f007e batman-adv: fix counter for multicast supporting nodes
A miscounting of nodes having multicast optimizations enabled can lead
to multicast packet loss in the following scenario:

If the first OGM a node receives from another one has no multicast
optimizations support (no multicast tvlv) then we are missing to
increase the counter. This potentially leads to the wrong assumption
that we could safely use multicast optimizations.

Fixings this by increasing the counter if the initial OGM has the
multicast TVLV unset, too.

Introduced by 60432d756c
("batman-adv: Announce new capability via multicast TVLV")

Reported-by: Tobias Hachmer <tobias@hachmer.de>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-01-06 11:05:42 +01:00
Martin Hundebøll
f44d54077a batman-adv: fix lock class for decoding hash in network-coding.c
batadv_has_set_lock_class() is called with the wrong hash table as first
argument (probably due to a copy-paste error), which leads to false
positives when running with lockdep.

Introduced-by: 612d2b4fe0
("batman-adv: network coding - save overheard and tx packets for decoding")

Signed-off-by: Martin Hundebøll <martin@hundeboll.net>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-01-06 11:05:12 +01:00
Linus Lüssing
2c667a339c batman-adv: fix delayed foreign originator recognition
Currently it can happen that the reception of an OGM from a new
originator is not being accepted. More precisely it can happen that
an originator struct gets allocated and initialized
(batadv_orig_node_new()), even the TQ gets calculated and set correctly
(batadv_iv_ogm_calc_tq()) but still the periodic orig_node purging
thread will decide to delete it if it has a chance to jump between
these two function calls.

This is because batadv_orig_node_new() initializes the last_seen value
to zero and its caller (batadv_iv_ogm_orig_get()) makes it visible to
other threads by adding it to the hash table already.
batadv_iv_ogm_calc_tq() will set the last_seen variable to the correct,
current time a few lines later but if the purging thread jumps in between
that it will think that the orig_node timed out and will wrongly
schedule it for deletion already.

If the purging interval is the same as the originator interval (which is
the default: 1 second), then this game can continue for several rounds
until the random OGM jitter added enough difference between these
two (in tests, two to about four rounds seemed common).

Fixing this by initializing the last_seen variable of an orig_node
to the current time before adding it to the hash table.

Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-01-06 11:05:09 +01:00
Simon Wunderlich
329887ad13 batman-adv: fix and simplify condition when bonding should be used
The current condition actually does NOT consider bonding when the
interface the packet came in from is the soft interface, which is the
opposite of what it should do (and the comment describes). Fix that and
slightly simplify the condition.

Reported-by: Ray Gibson <booray@gmail.com>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2015-01-06 11:05:07 +01:00
David S. Miller
a918eb9fac Merge branch 'rt_cong_ctrl'
Daniel Borkmann says:

====================
net: allow setting congctl via routing table

This is the second part of our work and allows for setting the congestion
control algorithm via routing table. For details, please see individual
patches.

Since patch 1 is a bug fix, we suggest applying patch 1 to net, and then
merging net into net-next, for example, and following up with the remaining
feature patches wrt dependencies.

Joint work with Florian Westphal, suggested by Hannes Frederic Sowa.

Patch for iproute2 is available under [1], but will be reposted with along
with the man-page update when this set hits net-next.

  [1] http://patchwork.ozlabs.org/patch/418149/

Thanks!

v2 -> v3:
 - Added module auto-loading as suggested by David Miller, thanks!
  - Added patch 2 for handling possible sleeps in fib6
  - While working on this, we discovered a bug, hence fix in patch 1
  - Added auto-loading to patch 4
 - Rebased, retested, rest the same.
v1 -> v2:
 - Very sorry, I noticed I had decnet disabled during testing.
   Added missing header include in decnet, rest as is.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:55:28 -05:00
Daniel Borkmann
81164413ad net: tcp: add per route congestion control
This work adds the possibility to define a per route/destination
congestion control algorithm. Generally, this opens up the possibility
for a machine with different links to enforce specific congestion
control algorithms with optimal strategies for each of them based
on their network characteristics, even transparently for a single
application listening on all links.

For our specific use case, this additionally facilitates deployment
of DCTCP, for example, applications can easily serve internal
traffic/dsts in DCTCP and external one with CUBIC. Other scenarios
would also allow for utilizing e.g. long living, low priority
background flows for certain destinations/routes while still being
able for normal traffic to utilize the default congestion control
algorithm. We also thought about a per netns setting (where different
defaults are possible), but given its actually a link specific
property, we argue that a per route/destination setting is the most
natural and flexible.

The administrator can utilize this through ip-route(8) by appending
"congctl [lock] <name>", where <name> denotes the name of a
congestion control algorithm and the optional lock parameter allows
to enforce the given algorithm so that applications in user space
would not be allowed to overwrite that algorithm for that destination.

The dst metric lookups are being done when a dst entry is already
available in order to avoid a costly lookup and still before the
algorithms are being initialized, thus overhead is very low when the
feature is not being used. While the client side would need to drop
the current reference on the module, on server side this can actually
even be avoided as we just got a flat-copied socket clone.

Joint work with Florian Westphal.

Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:55:24 -05:00
Daniel Borkmann
ea69763999 net: tcp: add RTAX_CC_ALGO fib handling
This patch adds the minimum necessary for the RTAX_CC_ALGO congestion
control metric to be set up and dumped back to user space.

While the internal representation of RTAX_CC_ALGO is handled as a u32
key, we avoided to expose this implementation detail to user space, thus
instead, we chose the netlink attribute that is being exchanged between
user space to be the actual congestion control algorithm name, similarly
as in the setsockopt(2) API in order to allow for maximum flexibility,
even for 3rd party modules.

It is a bit unfortunate that RTAX_QUICKACK used up a whole RTAX slot as
it should have been stored in RTAX_FEATURES instead, we first thought
about reusing it for the congestion control key, but it brings more
complications and/or confusion than worth it.

Joint work with Florian Westphal.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:55:24 -05:00
Daniel Borkmann
c5c6a8ab45 net: tcp: add key management to congestion control
This patch adds necessary infrastructure to the congestion control
framework for later per route congestion control support.

For a per route congestion control possibility, our aim is to store
a unique u32 key identifier into dst metrics, which can then be
mapped into a tcp_congestion_ops struct. We argue that having a
RTAX key entry is the most simple, generic and easy way to manage,
and also keeps the memory footprint of dst entries lower on 64 bit
than with storing a pointer directly, for example. Having a unique
key id also allows for decoupling actual TCP congestion control
module management from the FIB layer, i.e. we don't have to care
about expensive module refcounting inside the FIB at this point.

We first thought of using an IDR store for the realization, which
takes over dynamic assignment of unused key space and also performs
the key to pointer mapping in RCU. While doing so, we stumbled upon
the issue that due to the nature of dynamic key distribution, it
just so happens, arguably in very rare occasions, that excessive
module loads and unloads can lead to a possible reuse of previously
used key space. Thus, previously stale keys in the dst metric are
now being reassigned to a different congestion control algorithm,
which might lead to unexpected behaviour. One way to resolve this
would have been to walk FIBs on the actually rare occasion of a
module unload and reset the metric keys for each FIB in each netns,
but that's just very costly.

Therefore, we argue a better solution is to reuse the unique
congestion control algorithm name member and map that into u32 key
space through jhash. For that, we split the flags attribute (as it
currently uses 2 bits only anyway) into two u32 attributes, flags
and key, so that we can keep the cacheline boundary of 2 cachelines
on x86_64 and cache the precalculated key at registration time for
the fast path. On average we might expect 2 - 4 modules being loaded
worst case perhaps 15, so a key collision possibility is extremely
low, and guaranteed collision-free on LE/BE for all in-tree modules.
Overall this results in much simpler code, and all without the
overhead of an IDR. Due to the deterministic nature, modules can
now be unloaded, the congestion control algorithm for a specific
but unloaded key will fall back to the default one, and on module
reload time it will switch back to the expected algorithm
transparently.

Joint work with Florian Westphal.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:55:24 -05:00
Daniel Borkmann
29ba4fffd3 net: tcp: refactor reinitialization of congestion control
We can just move this to an extra function and make the code
a bit more readable, no functional change.

Joint work with Florian Westphal.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:55:24 -05:00
Florian Westphal
e715b6d3a5 net: fib6: convert cfg metric to u32 outside of table write lock
Do the nla validation earlier, outside the write lock.

This is needed by followup patch which needs to be able to call
request_module (which can sleep) if needed.

Joint work with Daniel Borkmann.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:55:24 -05:00
Daniel Borkmann
0409c9a5a7 net: fib6: fib6_commit_metrics: fix potential NULL pointer dereference
When IPv6 host routes with metrics attached are being added, we fetch
the metrics store from the dst via COW through dst_metrics_write_ptr(),
added through commit e5fd387ad5.

One remaining problem here is that we actually call into inet_getpeer()
and may end up allocating/creating a new peer from the kmemcache, which
may fail.

Example trace from perf probe (inet_getpeer:41) where create is 1:

ip 6877 [002] 4221.391591: probe:inet_getpeer: (ffffffff8165e293)
  85e294 inet_getpeer.part.7 (<- kmem_cache_alloc())
  85e578 inet_getpeer
  8eb333 ipv6_cow_metrics
  8f10ff fib6_commit_metrics

Therefore, a check for NULL on the return of dst_metrics_write_ptr()
is necessary here.

Joint work with Florian Westphal.

Fixes: e5fd387ad5 ("ipv6: do not overwrite inetpeer metrics prematurely")
Cc: Michal Kubeček <mkubecek@suse.cz>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:55:24 -05:00
Hubert Sokolowski
6cb69742da net: Do not call ndo_dflt_fdb_dump if ndo_fdb_dump is defined
Add checking whether the call to ndo_dflt_fdb_dump is needed.
It is not expected to call ndo_dflt_fdb_dump unconditionally
by some drivers (i.e. qlcnic or macvlan) that defines
own ndo_fdb_dump. Other drivers define own ndo_fdb_dump
and don't want ndo_dflt_fdb_dump to be called at all.
At the same time it is desirable to call the default dump
function on a bridge device.
Fix attributes that are passed to dev->netdev_ops->ndo_fdb_dump.
Add extra checking in br_fdb_dump to avoid duplicate entries
as now filter_dev can be NULL.

Following tests for filtering have been performed before
the change and after the patch was applied to make sure
they are the same and it doesn't break the filtering algorithm.

[root@localhost ~]# cd /root/iproute2-3.18.0/bridge
[root@localhost bridge]# modprobe dummy
[root@localhost bridge]# ./bridge fdb add f1:f2:f3:f4:f5:f6 dev dummy0
[root@localhost bridge]# brctl addbr br0
[root@localhost bridge]# brctl addif  br0 dummy0
[root@localhost bridge]# ip link set dev br0 address 02:00:00:12:01:04
[root@localhost bridge]# # show all
[root@localhost bridge]# ./bridge fdb show
33:33:00:00:00:01 dev p2p1 self permanent
01:00:5e:00:00:01 dev p2p1 self permanent
33:33:ff:ac:ce:32 dev p2p1 self permanent
33:33:00:00:02:02 dev p2p1 self permanent
01:00:5e:00:00:fb dev p2p1 self permanent
33:33:00:00:00:01 dev p7p1 self permanent
01:00:5e:00:00:01 dev p7p1 self permanent
33:33:ff:79:50:53 dev p7p1 self permanent
33:33:00:00:02:02 dev p7p1 self permanent
01:00:5e:00:00:fb dev p7p1 self permanent
f2:46:50:85:6d:d9 dev dummy0 master br0 permanent
f2:46:50:85:6d:d9 dev dummy0 vlan 1 master br0 permanent
33:33:00:00:00:01 dev dummy0 self permanent
f1:f2:f3:f4:f5:f6 dev dummy0 self permanent
33:33:00:00:00:01 dev br0 self permanent
02:00:00:12:01:04 dev br0 vlan 1 master br0 permanent
02:00:00:12:01:04 dev br0 master br0 permanent
[root@localhost bridge]# # filter by bridge
[root@localhost bridge]# ./bridge fdb show br br0
f2:46:50:85:6d:d9 dev dummy0 master br0 permanent
f2:46:50:85:6d:d9 dev dummy0 vlan 1 master br0 permanent
33:33:00:00:00:01 dev dummy0 self permanent
f1:f2:f3:f4:f5:f6 dev dummy0 self permanent
33:33:00:00:00:01 dev br0 self permanent
02:00:00:12:01:04 dev br0 vlan 1 master br0 permanent
02:00:00:12:01:04 dev br0 master br0 permanent
[root@localhost bridge]# # filter by port
[root@localhost bridge]# ./bridge fdb show brport dummy0
f2:46:50:85:6d:d9 master br0 permanent
f2:46:50:85:6d:d9 vlan 1 master br0 permanent
33:33:00:00:00:01 self permanent
f1:f2:f3:f4:f5:f6 self permanent
[root@localhost bridge]# # filter by port + bridge
[root@localhost bridge]# ./bridge fdb show br br0 brport dummy0
f2:46:50:85:6d:d9 master br0 permanent
f2:46:50:85:6d:d9 vlan 1 master br0 permanent
33:33:00:00:00:01 self permanent
f1:f2:f3:f4:f5:f6 self permanent
[root@localhost bridge]#

Signed-off-by: Hubert Sokolowski <hubert.sokolowski@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:52:06 -05:00
David S. Miller
d4253c6258 Merge branch 'ip_cmsg_csum'
Tom Herbert says:

====================
ip: Support checksum returned in csmg

This patch set allows the packet checksum for a datagram socket
to be returned in csum data in recvmsg. This allows userspace
to implement its own checksum over the data, for instance if an
IP tunnel was be implemented in user space, the inner checksum
could be validated.

Changes in this patch set:
  - Move checksum conversion to inet_sock from udp_sock. This
    generalizes checksum conversion for use with other protocols.
  - Move IP cmsg constants to a header file and make processing
    of the flags more efficient in ip_cmsg_recv
  - Return checksum value in cmsg. This is specifically the unfolded
    32 bit checksum of the full packet starting from the first byte
    returned in recvmsg

Tested: Wrote a little server to get checksums in cmsg for UDP and
        verfied correct checksum is returned.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:44:56 -05:00
Tom Herbert
ad6f939ab1 ip: Add offset parameter to ip_cmsg_recv
Add ip_cmsg_recv_offset function which takes an offset argument
that indicates the starting offset in skb where data is being received
from. This will be useful in the case of UDP and provided checksum
to user space.

ip_cmsg_recv is an inline call to ip_cmsg_recv_offset with offset of
zero.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:44:46 -05:00
Tom Herbert
5961de9f19 ip: Add offset parameter to ip_cmsg_recv
Add ip_cmsg_recv_offset function which takes an offset argument
that indicates the starting offset in skb where data is being received
from. This will be useful in the case of UDP and provided checksum
to user space.

ip_cmsg_recv is an inline call to ip_cmsg_recv_offset with offset of
zero.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:44:46 -05:00
Tom Herbert
c44d13d6f3 ip: IP cmsg cleanup
Move the IP_CMSG_* constants from ip_sockglue.c to inet_sock.h so that
they can be referenced in other source files.

Restructure ip_cmsg_recv to not go through flags using shift, check
for flags by 'and'. This eliminates both the shift and a conditional
per flag check.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:44:46 -05:00
Tom Herbert
224d019c4f ip: Move checksum convert defines to inet
Move convert_csum from udp_sock to inet_sock. This allows the
possibility that we can use convert checksum for different types
of sockets and also allows convert checksum to be enabled from
inet layer (what we'll want to do when enabling IP_CHECKSUM cmsg).

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:44:46 -05:00
Thomas Graf
149118d893 netlink: Warn on unordered or illegal nla_nest_cancel() or nlmsg_cancel()
Calling nla_nest_cancel() in a different order as the nesting was
built up can lead to negative offsets being calculated which
results in skb_trim() being called with an underflowed unsigned
int. Warn if mark < skb->data as it's definitely a bug.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 22:38:22 -05:00
Linus Torvalds
b1940cd21c Linux 3.19-rc3 2015-01-05 17:05:20 -08:00
Zhang Rui
ad0f409051 int340x_thermal/processor_thermal_device: return failure when
there is no ACPI device object

processor_thermal_device driver needs ACPI support to work. Thus, the driver
probing should fail when there is no ACPI device object asscociated.

This fixes a NULL pointer dereference when the driver is loaded
with INT340X feature disabled in BIOS.

Reported-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Chen Yu <yu.c.chen@intel.com>
2015-01-06 08:18:21 +08:00
Zhang Rui
014d9d5d0c ACPI/int340x_thermal: enumerate INT3401 for Intel SoC DTS thermal driver
Intel SoC DTS thermal driver on Baytrail platform uses IRQ 86 for
critical overheating notification.
But this IRQ 86 is described in the _CRS control method of INT3401 device,
thus we should enumerate INT3401 to set the IRQ descriptor when
Intel SoC DTS thermal driver is built.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2015-01-06 08:17:14 +08:00
Zhang Rui
48628e4015 ACPI/int340x_thermal: enumerate INT340X devices even if they're not in _ART/_TRT
For some INT340X thermal devices, even if they are not referred in
_TRT/_ART table, they still can be used by userspace for thermal control.
Thus change the code to enumerated all the INT340X devices,
no matter if they're referred in _TRT/_ART or not.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2015-01-06 08:17:06 +08:00
Linus Torvalds
79b8cb9737 powerpc fixes for 3.19
Wire up sys_execveat(). Tested on 32 & 64 bit.
 
 Fix for kdump on LE systems with cpus hot unplugged.
 
 Revert Anton's fix for "kernel BUG at kernel/smpboot.c:134!", this broke other
 platforms, we'll do a proper fix for 3.20.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUqillAAoJEFHr6jzI4aWA+aYP/0zMNyRbQehtpF9OLKw2TGJW
 LeSdQMN+g6JWKzG2hsNIiKga5kquuCi0zgtu3dW/rxlzbXY0gxwAY7HFkjzgE7Em
 qlt/iairlg8ifE436DRCkH2SXTCDN2AnlRLO35QZ2zKoo8osJuA6YjDiQRN3Y//p
 ELvWraYoVdhbfGPg+gYJn94U/OAmFu0E0VrwEmpWie1YrfTNbbfTkS/BieqhqJTu
 UEX53tW6FXbmit3ReQGGsPzcoVWbXfIg09gRy14Bn8dtA1EipHKr1m5o1JCjGu9J
 moJVB8cMkj5o4OcQrPtJZI76FkzaVueNUVTUdFtIt1q6NCyvX8tZe16OKPYY0TEN
 j8snpJfsk+oZij5caYSKqTeMAPrh6JbIjbkl8cIwc7ClRkRf/l5sW0zaFPS+GDWl
 c9OFu1CdKO59g+apQ16BmMRw8b/pV22WkxSpqm9GD3I7Y7ABoUN+1l07d5H+aGRq
 khD7M2rUWh4CwSRvg+ealpyAP++4Sj6mn/z4pe/gem4Wn/ynxGi6CSsYYijB6PWP
 bm47t+QqyomRiH65Yo41vKGNYu/JfenvAWyJp8AnIajWqxwCRLVNSgySjtylUESs
 f02qdvLsBTd3/CorZxigWgThM1wWeKyCDO6NBbjVUC/L9O8Ck9U0WbJUVQFDgG0b
 +52Wv5FRb/xBF9XdpWsw
 =1yze
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-3.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux

Pull powerpc fixes from Michael Ellerman:

 - Wire up sys_execveat(). Tested on 32 & 64 bit.

 - Fix for kdump on LE systems with cpus hot unplugged.

 - Revert Anton's fix for "kernel BUG at kernel/smpboot.c:134!", this
   broke other platforms, we'll do a proper fix for 3.20.

* tag 'powerpc-3.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
  Revert "powerpc: Secondary CPUs must set cpu_callin_map after setting active and online"
  powerpc/kdump: Ignore failure in enabling big endian exception during crash
  powerpc: Wire up sys_execveat() syscall
2015-01-05 14:49:02 -08:00
Hanjun Guo
d02dc27db0 ACPI / processor: Rename acpi_(un)map_lsapic() to acpi_(un)map_cpu()
acpi_map_lsapic() will allocate a logical CPU number and map it to
physical CPU id (such as APIC id) for the hot-added CPU, it will also
do some mapping for NUMA node id and etc, acpi_unmap_lsapic() will
do the reverse.

We can see that the name of the function is a little bit confusing and
arch (IA64) dependent so rename them as acpi_(un)map_cpu() to make arch
agnostic and explicit.

Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-05 23:34:26 +01:00
Hanjun Guo
af8f3f514d ACPI / processor: Convert apic_id to phys_id to make it arch agnostic
apic_id in MADT table is the CPU hardware id which identify
it self in the system for x86 and ia64, OSPM will use it for
SMP init to map APIC ID to logical cpu number in the early
boot, when the DSDT/SSDT (ACPI namespace) is scanned later, the
ACPI processor driver is probed and the driver will use acpi_id
in DSDT to get the apic_id, then map to the logical cpu number
which is needed by the processor driver.

Before ACPI 5.0, only x86 and ia64 were supported in ACPI spec,
so apic_id is used both in arch code and ACPI core which is
pretty fine. Since ACPI 5.0, ARM is supported by ACPI and
APIC is not available on ARM, this will confuse people when
apic_id is both used by x86 and ARM in one function.

So convert apic_id to phys_id (which is the original meaning)
in ACPI processor dirver to make it arch agnostic, but leave the
arch dependent code unchanged, no functional change.

Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-05 23:32:42 +01:00
Linus Torvalds
f40bde8588 Add execveat syscall
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUqwAbAAoJEKurIx+X31iBQCkP/3W3+bo2uabm8sjz6ivBi3N8
 FW+xiBeyKoCsJIbsdr+txpWoSeIJSQ2Wgd3QwB5L4ztrsykZMSXDg5NzMnnsWh/A
 30EfjraillriN9PHt+jcBBEr43VIMoxHA4E8HOuG08tiEaEQMXyGE0PirbNIB2lu
 Dh5dq5u+fPA9KiD6lHLLfYSAq2PAOl425tsRRRWqaivWjyu0N5pvZJ7zsNN1bFv1
 zVlA++IK7XyiM8QuGIMsQ3hLbVs3tdWVeqlS3CcqHRNi2rHQaRI5oX4Rdfj573Tn
 M7LfYINdhEzzggOpAK9o3olzF9+Z7E9Dj+hSO70GercgD0m3uaRPIgTig9SGP8nq
 IctyFjhC0MfTPMql83fhmcqEyXJelzKXpELT1FG3MpPUlSMXKyOHWxwfqyfUIj1z
 5JV8hurPME0p5FlfLWVaA0o3tncx26SYZ3y7Tdd5fHoOkaZj+Wqcdgs48h5o+1UL
 VjtCh9JUTx5jwn1+hviPxi4l0gfLwcBLVMqV1F3UCBLMeFD9lRuUs9HiX6tQKXAv
 E6RR5CdGZPgzqgcmoNQU73JjleZWlMqNDgn/8QpZ1NWEOq8wkPJqKkbPL1wfO7yC
 Fso4tsRwJn7ykpz4xbEoXYaoTycuIPgqoddxv/OqWw1bTAtlk/mo1sxalrg0XLrv
 mdp5VTjy4jf+/m6HcOid
 =0lKw
 -----END PGP SIGNATURE-----

Merge tag 'please-pull-syscall' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux

Pull ia64 fixlet from Tony Luck:
 "Add execveat syscall"

* tag 'please-pull-syscall' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
  [IA64] Enable execveat syscall for ia64
2015-01-05 14:31:20 -08:00
Rafael J. Wysocki
1b1f3e1699 ACPI / PM: Fix PM initialization for devices that are not present
If an ACPI device object whose _STA returns 0 (not present and not
functional) has _PR0 or _PS0, its power_manageable flag will be set
and acpi_bus_init_power() will return 0 for it.  Consequently, if
such a device object is passed to the ACPI device PM functions, they
will attempt to carry out the requested operation on the device,
although they should not do that for devices that are not present.

To fix that problem make acpi_bus_init_power() return an error code
for devices that are not present which will cause power_manageable to
be cleared for them as appropriate in acpi_bus_get_power_flags().
However, the lists of power resources should not be freed for the
device in that case, so modify acpi_bus_get_power_flags() to keep
those lists even if acpi_bus_init_power() returns an error.
Accordingly, when deciding whether or not the lists of power
resources need to be freed, acpi_free_power_resources_lists()
should check the power.flags.power_resources flag instead of
flags.power_manageable, so make that change too.

Furthermore, if acpi_bus_attach() sees that flags.initialized is
unset for the given device, it should reset the power management
settings of the device and re-initialize them from scratch instead
of relying on the previous settings (the device may have appeared
after being not present previously, for example), so make it use
the 'valid' flag of the D0 power state as the initial value of
flags.power_manageable for it and call acpi_bus_init_power() to
discover its current power state.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: 3.10+ <stable@vger.kernel.org> # 3.10+
2015-01-05 22:49:52 +01:00
David S. Miller
a515abd777 Merge branch 'cxgb4-next'
Hariprasad Shenai says:

====================
RDMA/cxgb4/cxgb4vf/csiostor: Cleanup register defines

This series continues to cleanup all the macros/register defines related to
SGE, PCIE, MC, MA, TCAM, MAC, etc that are defined in t4_regs.h and the
affected files.

Will post another 1 or 2 series so that we can cover all the macros so that
they all follow the same style to be consistent.

The patches series is created against 'net-next' tree.
And includes patches on cxgb4, cxgb4vf, iw_cxgb4 and csiostor driver.

We have included all the maintainers of respective drivers. Kindly review the
change and let us know in case of any review comments.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-05 16:34:53 -05:00