Commit Graph

636319 Commits

Author SHA1 Message Date
Andrew Lunn
7952347391 net: dsa: mv88e6xxx: Add mv88e6390 stats snapshot operation
The MV88E6390 has a control register for what the histogram statistics
actually contain. This means the stat_snapshot method should not set
this information. So implement the 6390 stats_snapshot function without
these bits.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-22 09:55:30 -05:00
Andrew Lunn
4b325d8c84 net: dsa: mv88e6xxx: Add comment about family a device belongs to
Knowing the family of device belongs to helps with picking the ops
implementation which is appropriate to the device. So add a comment to
each structure of ops.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-22 09:55:30 -05:00
Andrew Lunn
a605a0fe71 net: dsa: mv88e6xxx: Abstract stats_snapshot into ops structure
Taking a stats snapshot differs between same families. Abstract this
into an ops member. At the same time, move the code into global1.[ch],
since the registers are in the global1 range.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-22 09:55:30 -05:00
Andrew Lunn
1a3b39ecfe net: dsa: mv88e6xxx: Add the mv88e6390 family
With the devices added to the tables, the probe will recognize the
switch. This however is not sufficient to make it work properly, other
changes are needed because of incompatibilities.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-22 09:55:30 -05:00
Andrew Lunn
096eea0ff8 net: dsa: mv88e6xxx: Fix unused variable warning by using variable
_mv88e6xxx_stats_wait() did not check the return value from
 mv88e6xxx_g1_read(), so the compiler complained about set but unused
 err.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-22 09:55:30 -05:00
Andrew Lunn
b4308f046a net: dsa: mv88e6xxx: Take switch out of reset before probe
The switch needs to be taken out of reset before we can read its ID
register on the MDIO bus.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-22 09:55:30 -05:00
Russell King
7a79279e71 drm/arm: hdlcd: fix plane base address update
While testing HDMI with Xorg on the Juno board, I find that when Xorg
starts up or shuts down, the display is shifted significantly to the
right and wrapped in the active region.  (No sync bars are visible.)
The timings are correct, it behaves as if the start address has been
shifted many pixels _into_ the framebuffer.

This occurs whenever the display mode size is changed - using xrandr
in Xorg shows that changing the resolution triggers the problem
almost every time, but changing the refresh rate does not.

Using devmem2 to disable and re-enable the HDLCD resolves the issue,
and repeated disable/enable cycles do not make the issue re-appear.
Further debugging shows that we try to update the controller
configuration while enabled.

Alwys ensure that the HDLCD is disabled prior to updating the
controller timings, and use drm_crtc_vblank_off()/drm_crtc_vblank_on()
so that DRM knows whether it can expect vblank interrupts.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
2016-11-22 14:09:06 +00:00
Peter Zijlstra
033ac60c7f perf/x86/intel/uncore: Allow only a single PMU/box within an events group
Group validation expects all events to be of the same PMU; however
is_uncore_pmu() is too wide, it matches _all_ uncore events, even
across PMUs.

This triggers failure when we group different events from different
uncore PMUs, like:

  perf stat -vv -e '{uncore_cbox_0/config=0x0334/,uncore_qpi_0/event=1/}' -a sleep 1

Fix is_uncore_pmu() by only matching events to the box at hand.

Note that generic code; ran after this step; will disallow this
mixture of PMU events.

Reported-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vince@deater.net>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20161118125354.GQ3117@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:36:59 +01:00
Peter Zijlstra
b8000586c9 perf/x86/intel: Cure bogus unwind from PEBS entries
Vince Weaver reported that perf_fuzzer + KASAN detects that PEBS event
unwinds sometimes do 'weird' things. In particular, we seemed to be
ending up unwinding from random places on the NMI stack.

While it was somewhat expected that the event record BP,SP would not
match the interrupt BP,SP in that the interrupt is strictly later than
the record event, it was overlooked that it could be on an already
overwritten stack.

Therefore, don't copy the recorded BP,SP over the interrupted BP,SP
when we need stack unwinds.

Note that its still possible the unwind doesn't full match the actual
event, as its entirely possible to have done an (I)RET between record
and interrupt, but on average it should still point in the general
direction of where the event came from. Also, it's the best we can do,
considering.

The particular scenario that triggered the bogus NMI stack unwind was
a PEBS event with very short period, upon enabling the event at the
tail of the PMI handler (FREEZE_ON_PMI is not used), it instantly
triggers a record (while still on the NMI stack) which in turn
triggers the next PMI. This then causes back-to-back NMIs and we'll
try and unwind the stack-frame from the last NMI, which obviously is
now overwritten by our own.

Analyzed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davej@codemonkey.org.uk <davej@codemonkey.org.uk>
Cc: dvyukov@google.com <dvyukov@google.com>
Cc: stable@vger.kernel.org
Fixes: ca037701a0 ("perf, x86: Add PEBS infrastructure")
Link: http://lkml.kernel.org/r/20161117171731.GV3157@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:36:58 +01:00
Johannes Weiner
ae31fe51a3 perf/x86: Restore TASK_SIZE check on frame pointer
The following commit:

  75925e1ad7 ("perf/x86: Optimize stack walk user accesses")

... switched from copy_from_user_nmi() to __copy_from_user_nmi() with a manual
access_ok() check.

Unfortunately, copy_from_user_nmi() does an explicit check against TASK_SIZE,
whereas the access_ok() uses whatever the current address limit of the task is.

We are getting NMIs when __probe_kernel_read() has switched to KERNEL_DS, and
then see vmalloc faults when we access what looks like pointers into vmalloc
space:

  [] WARNING: CPU: 3 PID: 3685731 at arch/x86/mm/fault.c:435 vmalloc_fault+0x289/0x290
  [] CPU: 3 PID: 3685731 Comm: sh Tainted: G        W       4.6.0-5_fbk1_223_gdbf0f40 #1
  [] Call Trace:
  []  <NMI>  [<ffffffff814717d1>] dump_stack+0x4d/0x6c
  []  [<ffffffff81076e43>] __warn+0xd3/0xf0
  []  [<ffffffff81076f2d>] warn_slowpath_null+0x1d/0x20
  []  [<ffffffff8104a899>] vmalloc_fault+0x289/0x290
  []  [<ffffffff8104b5a0>] __do_page_fault+0x330/0x490
  []  [<ffffffff8104b70c>] do_page_fault+0xc/0x10
  []  [<ffffffff81794e82>] page_fault+0x22/0x30
  []  [<ffffffff81006280>] ? perf_callchain_user+0x100/0x2a0
  []  [<ffffffff8115124f>] get_perf_callchain+0x17f/0x190
  []  [<ffffffff811512c7>] perf_callchain+0x67/0x80
  []  [<ffffffff8114e750>] perf_prepare_sample+0x2a0/0x370
  []  [<ffffffff8114e840>] perf_event_output+0x20/0x60
  []  [<ffffffff8114aee7>] ? perf_event_update_userpage+0xc7/0x130
  []  [<ffffffff8114ea01>] __perf_event_overflow+0x181/0x1d0
  []  [<ffffffff8114f484>] perf_event_overflow+0x14/0x20
  []  [<ffffffff8100a6e3>] intel_pmu_handle_irq+0x1d3/0x490
  []  [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
  []  [<ffffffff81197191>] ? vunmap_page_range+0x1a1/0x2f0
  []  [<ffffffff811972f1>] ? unmap_kernel_range_noflush+0x11/0x20
  []  [<ffffffff814f2056>] ? ghes_copy_tofrom_phys+0x116/0x1f0
  []  [<ffffffff81040d1d>] ? x2apic_send_IPI_self+0x1d/0x20
  []  [<ffffffff8100411d>] perf_event_nmi_handler+0x2d/0x50
  []  [<ffffffff8101ea31>] nmi_handle+0x61/0x110
  []  [<ffffffff8101ef94>] default_do_nmi+0x44/0x110
  []  [<ffffffff8101f13b>] do_nmi+0xdb/0x150
  []  [<ffffffff81795187>] end_repeat_nmi+0x1a/0x1e
  []  [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
  []  [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
  []  [<ffffffff8147daf7>] ? copy_user_enhanced_fast_string+0x7/0x10
  []  <<EOE>>  <IRQ>  [<ffffffff8115d05e>] ? __probe_kernel_read+0x3e/0xa0

Fix this by moving the valid_user_frame() check to before the uaccess
that loads the return address and the pointer to the next frame.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Fixes: 75925e1ad7 ("perf/x86: Optimize stack walk user accesses")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:36:58 +01:00
Oleg Nesterov
8e5bfa8c1f sched/autogroup: Do not use autogroup->tg in zombie threads
Exactly because for_each_thread() in autogroup_move_group() can't see it
and update its ->sched_task_group before _put() and possibly free().

So the exiting task needs another sched_move_task() before exit_notify()
and we need to re-introduce the PF_EXITING (or similar) check removed by
the previous change for another reason.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hartsjc@redhat.com
Cc: vbendel@redhat.com
Cc: vlovejoy@redhat.com
Link: http://lkml.kernel.org/r/20161114184612.GA15968@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:33:43 +01:00
Oleg Nesterov
18f649ef34 sched/autogroup: Fix autogroup_move_group() to never skip sched_move_task()
The PF_EXITING check in task_wants_autogroup() is no longer needed. Remove
it, but see the next patch.

However the comment is correct in that autogroup_move_group() must always
change task_group() for every thread so the sysctl_ check is very wrong;
we can race with cgroups and even sys_setsid() is not safe because a task
running with task_group() == ag->tg must participate in refcounting:

	int main(void)
	{
		int sctl = open("/proc/sys/kernel/sched_autogroup_enabled", O_WRONLY);

		assert(sctl > 0);
		if (fork()) {
			wait(NULL); // destroy the child's ag/tg
			pause();
		}

		assert(pwrite(sctl, "1\n", 2, 0) == 2);
		assert(setsid() > 0);
		if (fork())
			pause();

		kill(getppid(), SIGKILL);
		sleep(1);

		// The child has gone, the grandchild runs with kref == 1
		assert(pwrite(sctl, "0\n", 2, 0) == 2);
		assert(setsid() > 0);

		// runs with the freed ag/tg
		for (;;)
			sleep(1);

		return 0;
	}

crashes the kernel. It doesn't really need sleep(1), it doesn't matter if
autogroup_move_group() actually frees the task_group or this happens later.

Reported-by: Vern Lovejoy <vlovejoy@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: hartsjc@redhat.com
Cc: vbendel@redhat.com
Link: http://lkml.kernel.org/r/20161114184609.GA15965@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:33:42 +01:00
Nicholas Piggin
9e5f688422 powerpc: Fix missing CRCs, add more asm-prototypes.h declarations
After patch 4efca4ed0 ("kbuild: modversions for EXPORT_SYMBOL() for asm"),
asm exports can get modversions CRCs generated if they have C definitions
in asm-prototypes.h. This patch adds missing definitions for 32 and 64 bit
allmodconfig builds.

Fixes: 9445aa1a30 ("ppc: move exports to definitions")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-22 22:25:41 +11:00
Herbert Xu
c8467f7a36 crypto: scatterwalk - Remove unnecessary aliasing check in map_and_copy
The aliasing check in map_and_copy is no longer necessary because
the IPsec ESP code no longer provides an IV that points into the
actual request data.  As this check is now triggering BUG checks
due to the vmalloced stack code, I'm removing it.

Reported-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-11-22 15:02:25 +08:00
Herbert Xu
8acf7a1063 crypto: algif_hash - Fix result clobbering in recvmsg
Recently an init call was added to hash_recvmsg so as to reset
the hash state in case a sendmsg call was never made.

Unfortunately this ended up clobbering the result if the previous
sendmsg was done with a MSG_MORE flag.  This patch fixes it by
excluding that case when we make the init call.

Fixes: a8348bca29 ("algif_hash - Fix NULL hash crash with shash")
Reported-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-11-22 15:02:24 +08:00
Benjamin Herrenschmidt
7a43906f5c powerpc: Set missing wakeup bit in LPCR on POWER9
There is a new bit, LPCR_PECE_HVEE (Hypervisor Virtualization Exit
Enable), which controls wakeup from STOP states on Hypervisor
Virtualization Interrupts (which happen to also be all external
interrupts in host or bare metal mode).

It needs to be set or we will miss wakeups.

Fixes: 9baaef0a22 ("powerpc/irq: Add support for HV virtualization interrupts")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Rename it to HVEE to match the name in the ISA]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-22 14:53:27 +11:00
Linus Torvalds
3b404a5198 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull apparmor bugfix from James Morris:
 "This has a fix for a policy replacement bug that is fairly serious for
  apache mod_apparmor users, as it results in the wrong policy being
  applied on an network facing service"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  apparmor: fix change_hat not finding hat after policy replacement
2016-11-21 15:27:41 -08:00
Linus Torvalds
8d1a2408ef Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:

 1) With modern networking cards we can run out of 32-bit DMA space, so
    support 64-bit DMA addressing when possible on sparc64. From Dave
    Tushar.

 2) Some signal frame validation checks are inverted on sparc32, fix
    from Andreas Larsson.

 3) Lockdep tables can get too large in some circumstances on sparc64,
    add a way to adjust the size a bit. From Babu Moger.

 4) Fix NUMA node probing on some sun4v systems, from Thomas Tai.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: drop duplicate header scatterlist.h
  lockdep: Limit static allocations if PROVE_LOCKING_SMALL is defined
  config: Adding the new config parameter CONFIG_PROVE_LOCKING_SMALL for sparc
  sunbmac: Fix compiler warning
  sunqe: Fix compiler warnings
  sparc64: Enable 64-bit DMA
  sparc64: Enable sun4v dma ops to use IOMMU v2 APIs
  sparc64: Bind PCIe devices to use IOMMU v2 service
  sparc64: Initialize iommu_map_table and iommu_pool
  sparc64: Add ATU (new IOMMU) support
  sparc64: Add FORCE_MAX_ZONEORDER and default to 13
  sparc64: fix compile warning section mismatch in find_node()
  sparc32: Fix inverted invalid_frame_pointer checks on sigreturns
  sparc64: Fix find_node warning if numa node cannot be found
2016-11-21 13:56:17 -08:00
Mika Westerberg
effb46b40f watchdog: wdat_wdt: Select WATCHDOG_CORE
The WDAT watchdog driver uses functionality provided by the watchdog timer
core but it did not select it explicitly. This results following linker
error when only WDAT_WDT is enabled in Kconfig:

  drivers/built-in.o: In function `wdat_wdt_probe':
  drivers/watchdog/wdat_wdt.c:444: undefined reference to `devm_watchdog_register_device'

Fix this by explicitly selecting WATCHDOG_CORE when WDAT watchdog driver is
enabled.

Fixes: 058dfc7670 (ACPI / watchdog: Add support for WDAT hardware watchdog)
Reported-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-21 22:49:17 +01:00
Linus Torvalds
27e7ab99db Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Clear congestion control state when changing algorithms on an
    existing socket, from Florian Westphal.

 2) Fix register bit values in altr_tse_pcs portion of stmmac driver,
    from Jia Jie Ho.

 3) Fix PTP handling in stammc driver for GMAC4, from Giuseppe
    CAVALLARO.

 4) Fix udplite multicast delivery handling, it ignores the udp_table
    parameter passed into the lookups, from Pablo Neira Ayuso.

 5) Synchronize the space estimated by rtnl_vfinfo_size and the space
    actually used by rtnl_fill_vfinfo. From Sabrina Dubroca.

 6) Fix memory leak in fib_info when splitting nodes, from Alexander
    Duyck.

 7) If a driver does a napi_hash_del() explicitily and not via
    netif_napi_del(), it must perform RCU synchronization as needed. Fix
    this in virtio-net and bnxt drivers, from Eric Dumazet.

 8) Likewise, it is not necessary to invoke napi_hash_del() is we are
    also doing neif_napi_del() in the same code path. Remove such calls
    from be2net and cxgb4 drivers, also from Eric Dumazet.

 9) Don't allocate an ID in peernet2id_alloc() if the netns is dead,
    from WANG Cong.

10) Fix OF node and device struct leaks in of_mdio, from Johan Hovold.

11) We cannot cache routes in ip6_tunnel when using inherited traffic
    classes, from Paolo Abeni.

12) Fix several crashes and leaks in cpsw driver, from Johan Hovold.

13) Splice operations cannot use freezable blocking calls in AF_UNIX,
    from WANG Cong.

14) Link dump filtering by master device and kind support added an error
    in loop index updates during the dump if we actually do filter, fix
    from Zhang Shengju.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (59 commits)
  tcp: zero ca_priv area when switching cc algorithms
  net: l2tp: Treat NET_XMIT_CN as success in l2tp_eth_dev_xmit
  ethernet: stmmac: make DWMAC_STM32 depend on it's associated SoC
  tipc: eliminate obsolete socket locking policy description
  rtnl: fix the loop index update error in rtnl_dump_ifinfo()
  l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()
  net: macb: add check for dma mapping error in start_xmit()
  rtnetlink: fix FDB size computation
  netns: fix get_net_ns_by_fd(int pid) typo
  af_unix: conditionally use freezable blocking calls in read
  net: ethernet: ti: cpsw: fix fixed-link phy probe deferral
  net: ethernet: ti: cpsw: add missing sanity check
  net: ethernet: ti: cpsw: fix secondary-emac probe error path
  net: ethernet: ti: cpsw: fix of_node and phydev leaks
  net: ethernet: ti: cpsw: fix deferred probe
  net: ethernet: ti: cpsw: fix mdio device reference leak
  net: ethernet: ti: cpsw: fix bad register access in probe error path
  net: sky2: Fix shutdown crash
  cfg80211: limit scan results cache size
  net sched filters: pass netlink message flags in event notification
  ...
2016-11-21 13:26:28 -08:00
Bhumika Goyal
e796f49d82 net: ieee802154: constify ieee802154_ops structures
Declare the structure ieee802154_ops as const as it is only passed as an
argument to the function  ieee802154_alloc_hw. This argument is of type
const struct ieee802154_ops *, so ieee80254_ops structures having this
property can be declared as const.
Done using Coccinelle:

@r1 disable optional_qualifier @
identifier i;
position p;
@@
static struct ieee802154_ops i@p = {...};

@ok1@
identifier r1.i;
position p;
expression e1;
@@
ieee802154_alloc_hw(e1,&i@p)

@bad@
position p!={r1.p,ok1.p};
identifier r1.i;
@@
i@p

@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
static
+const
struct ieee802154_ops  i={...};

@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
+const
struct ieee802154_ops  i;

The before and after size details of the affected files are:

   text	   data	    bss	    dec	    hex	filename
   8669	   1176	     16	   9861	   2685	drivers/net/ieee802154/adf7242.o
   8805	   1048	     16	   9869	   268d	drivers/net/ieee802154/adf7242.o

   text	   data	    bss	    dec	    hex	filename
   7211	   2296	     32	   9539	   2543	drivers/net/ieee802154/atusb.o
   7339	   2160	     32	   9531	   253b	drivers/net/ieee802154/atusb.o

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Acked-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 15:46:12 -05:00
David S. Miller
acbec8567c Merge branch 'geneve-lwt-efficiency'
Pravin B Shelar says:

====================
geneve: Use LWT more effectively.

Following patch series make use of geneve LWT code path for
geneve netdev type of device.
This allows us to simplify geneve module without changing any
functionality.

v2-v3:
Rebase against latest net-next.

v1-v2:
Fix warning reported by kbuild test robot.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 14:05:50 -05:00
pravin shelar
2e0b26e103 geneve: Optimize geneve device lookup.
Rather than comparing 64-bit tunnel-id, compare tunnel vni
which is 24-bit id. This also save conversion from vni
to tunnel id on each tunnel packet receive.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 14:05:50 -05:00
pravin shelar
bcceeec3cc geneve: Remove redundant socket checks.
Geneve already has check for device socket in route
lookup function. So no need to check it in xmit
function.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 14:05:49 -05:00
pravin shelar
c3ef5aa5e5 geneve: Merge ipv4 and ipv6 geneve_build_skb()
There are minimal difference in building Geneve header
between ipv4 and ipv6 geneve tunnels. Following patch
refactors code to unify it.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 14:05:49 -05:00
pravin shelar
9b4437a5b8 geneve: Unify LWT and netdev handling.
Current geneve implementation has two separate cases to handle.
1. netdev xmit
2. LWT xmit.

In case of netdev, geneve configuration is stored in various
struct geneve_dev members. For example geneve_addr, ttl, tos,
label, flags, dst_cache, etc. For LWT ip_tunnel_info is passed
to the device in ip_tunnel_info.

Following patch uses ip_tunnel_info struct to store almost all
of configuration of a geneve netdevice. This allows us to unify
most of geneve driver code around ip_tunnel_info struct.
This dramatically simplify geneve code, since it does not
need to handle two different configuration cases. Removes
duplicate code, single code path can handle either type
of geneve devices.

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 14:05:49 -05:00
David S. Miller
9e36ced633 Merge branch 'tcp-cong-undo_cwnd-mandatory'
Florian Westphal says:

====================
tcp: make undo_cwnd mandatory for congestion modules

highspeed, illinois, scalable, veno and yeah congestion control algorithms
don't provide a 'cwnd_undo' function.  This makes the stack default to a
'reno undo' which doubles cwnd.  However, the ssthresh implementation of
these algorithms do not halve the slowstart threshold. This causes similar
issue as the one fixed for dctcp in ce6dd23329 ("dctcp: avoid bogus
doubling of cwnd after loss").

In light of this it seems better to remove the fallback and make undo_cwnd
mandatory.

First patch fixes those spots where reno undo seems incorrect by providing
.cwnd_undo functions, second patch removes the fallback.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 13:20:17 -05:00
Florian Westphal
e97991832a tcp: make undo_cwnd mandatory for congestion modules
The undo_cwnd fallback in the stack doubles cwnd based on ssthresh,
which un-does reno halving behaviour.

It seems more appropriate to let congctl algorithms pair .ssthresh
and .undo_cwnd properly. Add a 'tcp_reno_undo_cwnd' function and wire it
up for all congestion algorithms that used to rely on the fallback.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 13:20:17 -05:00
Florian Westphal
85f7e7508a tcp: add cwnd_undo functions to various tcp cc algorithms
congestion control algorithms that do not halve cwnd in their .ssthresh
should provide a .cwnd_undo rather than rely on current fallback which
assumes reno halving (and thus doubles the cwnd).

All of these do 'something else' in their .ssthresh implementation, thus
store the cwnd on loss and provide .undo_cwnd to restore it again.

A followup patch will remove the fallback and all algorithms will
need to provide a .cwnd_undo function.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 13:20:17 -05:00
David S. Miller
2fcb58ab30 Merge branch 'bridge-igmpv3-mldv2-support'
Nikolay Aleksandrov says:

====================
bridge: add support for IGMPv3 and MLDv2 querier

This patch-set adds support for IGMPv3 and MLDv2 querier in the bridge.
Two new options which can be toggled via netlink and sysfs are added that
control the version per-bridge:
 multicast_igmp_version - default 2, can be set to 3
 multicast_mld_version - default 1, can be set to 2 (this option is
                         disabled if CONFIG_IPV6=n)

Note that the names do not include "querier", I think that these options
can be re-used later as more IGMPv3 support is added to the bridge so we
can avoid adding more options to switch between v2 and v3 behaviour.

The set uses the already existing br_ip{4,6}_multicast_alloc_query
functions and adds the appropriate header based on the chosen version.

For the initial support I have removed the compatibility implementation
(RFC3376 sec 7.3.1, 7.3.2; RFC3810 sec 8.3.1, 8.3.2), because there are
some details that we need to sort out.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 13:16:59 -05:00
Nikolay Aleksandrov
aa2ae3e71c bridge: mcast: add MLDv2 querier support
This patch adds basic support for MLDv2 queries, the default is MLDv1
as before. A new multicast option - multicast_mld_version, adds the
ability to change it between 1 and 2 via netlink and sysfs.
The MLD option is disabled if CONFIG_IPV6 is disabled.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 13:16:58 -05:00
Nikolay Aleksandrov
5e9235853d bridge: mcast: add IGMPv3 query support
This patch adds basic support for IGMPv3 queries, the default is IGMPv2
as before. A new multicast option - multicast_igmp_version, adds the
ability to change it between 2 and 3 via netlink and sysfs. The option
struct member is in a 4 byte hole in net_bridge.

There also a few minor style adjustments in br_multicast_new_group and
br_multicast_add_group.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 13:16:58 -05:00
Florian Westphal
7082c5c3f2 tcp: zero ca_priv area when switching cc algorithms
We need to zero out the private data area when application switches
connection to different algorithm (TCP_CONGESTION setsockopt).

When congestion ops get assigned at connect time everything is already
zeroed because sk_alloc uses GFP_ZERO flag.  But in the setsockopt case
this contains whatever previous cc placed there.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 13:13:56 -05:00
Gao Feng
7c6ae610a1 net: l2tp: Treat NET_XMIT_CN as success in l2tp_eth_dev_xmit
The tc could return NET_XMIT_CN as one congestion notification, but
it does not mean the packe is lost. Other modules like ipvlan,
macvlan, and others treat NET_XMIT_CN as success too.
So l2tp_eth_dev_xmit should add the NET_XMIT_CN check.

Signed-off-by: Gao Feng <gfree.wind@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 13:10:29 -05:00
Gao Feng
fc51f2b7e5 driver: macvlan: Remove duplicated IFF_UP condition check in macvlan_forward_source
The function macvlan_forward_source_one has already checked the flag
IFF_UP, so needn't check it outside in macvlan_forward_source too.

Signed-off-by: Gao Feng <gfree.wind@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 11:58:49 -05:00
Benjamin Coddington
d75a6a0e39 NFSv4.1: Keep a reference on lock states while checking
While walking the list of lock_states, keep a reference on each
nfs4_lock_state to be checked, otherwise the lock state could be removed
while the check performs TEST_STATEID and possible FREE_STATEID.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-11-21 11:58:39 -05:00
Peter Robinson
6bc5445c01 ethernet: stmmac: make DWMAC_STM32 depend on it's associated SoC
There's not much point, except compile test, enabling the stmmac
platform drivers unless the STM32 SoC is enabled. It's not
useful without it.

Signed-off-by: Peter Robinson <pbrobinson@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 11:34:24 -05:00
Eric Dumazet
dad42c3038 mlx4: avoid unnecessary dirtying of critical fields
While stressing a 40Gbit mlx4 NIC with busy polling, I found false
sharing in mlx4 driver that can be easily avoided.

This patch brings an additional 7 % performance improvement in UDP_RR
workload.

1) If we received no frame during one mlx4_en_process_rx_cq()
   invocation, no need to call mlx4_cq_set_ci() and/or dirty ring->cons

2) Do not refill rx buffers if we have plenty of them.
   This avoids false sharing and allows some bulk/batch optimizations.
   Page allocator and its locks will thank us.

Finally, mlx4_en_poll_rx_cq() should not return 0 if it determined
cpu handling NIC IRQ should be changed. We should return budget-1
instead, to not fool net_rx_action() and its netdev_budget.

v2: keep AVG_PERF_COUNTER(... polled) even if polled is 0

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 11:33:31 -05:00
Eric Dumazet
b668534c1d bnx2: use READ_ONCE() instead of barrier()
barrier() is a big hammer compared to READ_ONCE(),
and requires comments explaining what is protected.

READ_ONCE() is more precise and compiler should generate
better overall code.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 11:32:34 -05:00
Eric Dumazet
d21dbdfe0a udp: avoid one cache line miss in recvmsg()
UDP_SKB_CB(skb)->partial_cov is located at offset 66 in skb,
requesting a cold cache line being read in cpu cache.

We can avoid this cache line miss for UDP sockets,
as partial_cov has a meaning only for UDPLite.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 11:26:59 -05:00
David S. Miller
ee9d5461c0 Merge branch 'mlx5-bpf-refcnt-fixes'
Daniel Borkmann says:

====================
Couple of BPF refcount fixes for mlx5

Various mlx5 bugs on eBPF refcount handling found during review.
Last patch in series adds a __must_check to BPF helpers to make
sure we won't run into it again w/o compiler complaining first.

v2 -> v3:

 - Just reworked patch 2/4 so we don't need bpf_prog_sub().
 - Rebased, rest as is.

v1 -> v2:

 - After discussion with Alexei, we agreed upon rebasing the
   patches against net-next.
 - Since net-next, I've also added the __must_check to enforce
   future users to check for errors.
 - Fixed up commit message #2.
 - Simplify assignment from patch #1 based on Saeed's feedback
   on previous set.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 11:25:59 -05:00
Daniel Borkmann
6d67942dd0 bpf: add __must_check attributes to refcount manipulating helpers
Helpers like bpf_prog_add(), bpf_prog_inc(), bpf_map_inc() can fail
with an error, so make sure the caller properly checks their return
value and not just ignores it, which could worst-case lead to use
after free.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 11:25:58 -05:00
Daniel Borkmann
a055c19be9 bpf, mlx5: drop priv->xdp_prog reference on netdev cleanup
mlx5e_xdp_set() is currently the only place where we drop reference on the
prog sitting in priv->xdp_prog when it's exchanged by a new one. We also
need to make sure that we eventually release that reference, for example,
in case the netdev is dismantled, otherwise we leak the program.

Fixes: 86994156c7 ("net/mlx5e: XDP fast RX drop bpf programs support")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 11:25:58 -05:00
Daniel Borkmann
c54c062904 bpf, mlx5: fix various refcount issues in mlx5e_xdp_set
There are multiple issues in mlx5e_xdp_set():

1) The batched bpf_prog_add() is currently not checked for errors. When
   doing so, it should be done at an earlier point in time to makes sure
   that we cannot fail anymore at the time we want to set the program for
   each channel. The batched refs short-cut can only be performed when we
   don't need to perform a reset for changing the rq type and the device
   was in opened state. In case the device was not in opened state, then
   the next mlx5e_open_locked() will aquire the refs from the control prog
   via mlx5e_create_rq(), same when we need to perform a reset.

2) When swapping the priv->xdp_prog, then no extra reference count must be
   taken since we got that from call path via dev_change_xdp_fd() already.
   Otherwise, we'd never be able to release the program. Also, bpf_prog_add()
   without checking the return code could fail.

Fixes: 86994156c7 ("net/mlx5e: XDP fast RX drop bpf programs support")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 11:25:58 -05:00
Daniel Borkmann
97bc402db7 bpf, mlx5: fix mlx5e_create_rq taking reference on prog
In mlx5e_create_rq(), when creating a new queue, we call bpf_prog_add() but
without checking the return value. bpf_prog_add() can fail since 92117d8443
("bpf: fix refcnt overflow"), so we really must check it. Take the reference
right when we assign it to the rq from priv->xdp_prog, and just drop the
reference on error path. Destruction in mlx5e_destroy_rq() looks good, though.

Fixes: 86994156c7 ("net/mlx5e: XDP fast RX drop bpf programs support")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21 11:25:57 -05:00
Rafael J. Wysocki
9713adc2a1 Revert "ACPI: Execute _PTS before system reboot"
Revert commit 2c85025c75 (ACPI: Execute _PTS before system reboot)
as it is reported to cause poweroff and reboot to hang on Dell
Latitude E7250.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=187061
Reported-by:  Gianpaolo <gianpaoloc@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-21 14:25:49 +01:00
Jacob Pan
ec638db8cb thermal/powerclamp: add back module device table
Commit 3105f234e0 replaced module
cpu id table with a cpu feature check, which is logically correct.
But we need the module device table to allow module auto loading.

Cc: stable@vger.kernel.org # 4.8
Fixes:3105f234 thermal/powerclamp: correct cpu support check
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-11-21 20:54:40 +08:00
Alexander Shishkin
e96271f3ed perf/core: Fix address filter parser
The token table passed into match_token() must be null-terminated, which
it currently is not in the perf's address filter string parser, as caught
by Vince's perf_fuzzer and KASAN.

It doesn't blow up otherwise because of the alignment padding of the table
to the next element in the .rodata, which is luck.

Fixing by adding a null-terminator to the token table.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dvyukov@google.com
Cc: stable@vger.kernel.org # v4.7+
Fixes: 375637bc52 ("perf/core: Introduce address range filtering")
Link: http://lkml.kernel.org/r/877f81f264.fsf@ashishki-desk.ger.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-21 11:28:36 +01:00
Jaehoon Chung
647f80a1f2 mmc: dw_mmc: fix the error handling for dma operation
When dma->start is failed,then it has to fall back to PIO mode
for current transfer.

But Host controller was already set to bits relevant to DMA operation.
If needs to use the PIO mode, Host controller has to stop the DMA
operation. (It's more stable than now.)

When it occurred error, it's not running any request.

Fixes: 3fc7eaef44 ("mmc: dw_mmc: Add external dma interface support")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Cc: <stable@vger.kernel.org> # v4.3+
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2016-11-21 11:08:28 +01:00
Andy Shevchenko
e5dce28688 x86/platform/intel-mid: Rename platform_wdt to platform_mrfld_wdt
Rename the watchdog platform library file to explicitly show that is used only
on Intel Merrifield platforms.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161118172723.179761-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-21 11:07:11 +01:00