Commit Graph

413305 Commits

Author SHA1 Message Date
Michael S. Tsirkin
f121159d72 virtio_net: make all RX paths handle erors consistently
receive mergeable now handles errors internally.
Do same for big and small packet paths, otherwise
the logic is too hard to follow.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-01 20:27:16 -05:00
Michael S. Tsirkin
8fc3b9e9a2 virtio_net: fix error handling for mergeable buffers
Eric Dumazet noticed that if we encounter an error
when processing a mergeable buffer, we don't
dequeue all of the buffers from this packet,
the result is almost sure to be loss of networking.

Jason Wang noticed that we also leak a page and that we don't decrement
the rq buf count, so we won't repost buffers (a resource leak).

Fix both issues.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael Dalton <mwdalton@google.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-01 20:27:16 -05:00
Linus Torvalds
e84a2a49b9 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML fixes from Richard Weinberger:
 "Fixes two regressions which got introduced this merge window"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: Build always with -mcmodel=large on 64bit
  um: Rename print_stack_trace to do_stack_trace
2013-12-01 15:33:53 -08:00
Linus Torvalds
1d07489aac Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "Some ARM fixes, the biggest of which is the fix for the signal return
  codes; this came up due to an interaction between the V7M nommu
  changes and the BE8 changes.  Dave Martin spotted that the kexec
  trampoline wasn't being correctly copied (in a way which allows
  Thumb-2 to work).

  I've also fixed a number of breakages on footbridge platforms as I've
  upgraded one of my machines to v3.12...  one which had a 1200 day
  uptime"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 7907/1: lib: delay-loop: Add align directive to fix BogoMIPS calculation
  ARM: 7897/1: kexec: Use the right ISA for relocate_new_kernel
  ARM: 7895/1: signal: fix armv7-m build issue in sigreturn_codes.S
  ARM: footbridge: fix EBSA285 LEDs
  ARM: footbridge: fix VGA initialisation
  ARM: fix booting low-vectors machines
  ARM: dma-mapping: check DMA mask against available memory
2013-12-01 15:32:19 -08:00
Richard Weinberger
fff6540cbc um: Build always with -mcmodel=large on 64bit
On UML SUBARCH can be x86, x86_64 and i386 and if it is x86
we use uname -m to select a defconfig.
Therefore we can no longer use -mcmodel=large only if SUBARCH
is x86_64.

Reported-and-tested-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2013-12-01 13:06:51 +01:00
Richard Weinberger
8ed12fcc19 um: Rename print_stack_trace to do_stack_trace
We cannot use print_stack_trace because the name conflicts
with linux/stacktrace.h.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
2013-12-01 13:06:51 +01:00
Stephen M. Cameron
88bf6d62db [SCSI] hpsa: return 0 from driver probe function on success, not 1
A return value of 1 is interpreted as an error.  See pci_driver.
in local_pci_probe().  If you're wondering how this ever could
have worked, it's because it used to be the case that only return
values less than zero were interpreted as failure.  But even in
the current kernel if the driver registers its various entry
points with the kernel, and then returns a value which is
interpreted as failure, those registrations aren't undone, so
the driver still mostly works.  However, the driver's remove
function wouldn't be called on rmmod, and pci power management
functions wouldn't work.  In the case of Smart Array, since it
has a battery backed cache (or else no cache) even if the driver
is not shut down properly as long as there is no outstanding
i/o, nothing too bad happens, which is why it took so long to
notice.

Requesting backport to stable because the change to pci-driver.c
which requires driver probe functions to return 0 occurred between
2.6.35 and 2.6.36 (the pci power management breakage) and again
between 3.7 and 3.8 (pci_dev->driver getting set to NULL in
local_pci_probe() preventing driver remove function from being
called on rmmod.)

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-11-30 19:28:00 -05:00
Fabio Estevam
11d4bb1bd0 ARM: 7907/1: lib: delay-loop: Add align directive to fix BogoMIPS calculation
Currently mx53 (CortexA8) running at 1GHz reports:
Calibrating delay loop... 663.55 BogoMIPS (lpj=3317760)

Tom Evans verified that alignments of 0x0 and 0x8 run the two instructions of __loop_delay in one clock cycle (1 clock/loop), while alignments of 0x4 and 0xc take 3 clocks to run the loop twice. (1.5 clock/loop)

The original object code looks like this:

00000010 <__loop_const_udelay>:
  10:	e3e01000 	mvn	r1, #0
  14:	e51f201c 	ldr	r2, [pc, #-28]	; 0 <__loop_udelay-0x8>
  18:	e5922000 	ldr	r2, [r2]
  1c:	e0800921 	add	r0, r0, r1, lsr #18
  20:	e1a00720 	lsr	r0, r0, #14
  24:	e0822b21 	add	r2, r2, r1, lsr #22
  28:	e1a02522 	lsr	r2, r2, #10
  2c:	e0000092 	mul	r0, r2, r0
  30:	e0800d21 	add	r0, r0, r1, lsr #26
  34:	e1b00320 	lsrs	r0, r0, #6
  38:	01a0f00e 	moveq	pc, lr

0000003c <__loop_delay>:
  3c:	e2500001 	subs	r0, r0, #1
  40:	8afffffe 	bhi	3c <__loop_delay>
  44:	e1a0f00e 	mov	pc, lr

After adding the 'align 3' directive to __loop_delay (align to 8 bytes):

00000010 <__loop_const_udelay>:
  10:	e3e01000 	mvn	r1, #0
  14:	e51f201c 	ldr	r2, [pc, #-28]	; 0 <__loop_udelay-0x8>
  18:	e5922000 	ldr	r2, [r2]
  1c:	e0800921 	add	r0, r0, r1, lsr #18
  20:	e1a00720 	lsr	r0, r0, #14
  24:	e0822b21 	add	r2, r2, r1, lsr #22
  28:	e1a02522 	lsr	r2, r2, #10
  2c:	e0000092 	mul	r0, r2, r0
  30:	e0800d21 	add	r0, r0, r1, lsr #26
  34:	e1b00320 	lsrs	r0, r0, #6
  38:	01a0f00e 	moveq	pc, lr
  3c:	e320f000 	nop	{0}

00000040 <__loop_delay>:
  40:	e2500001 	subs	r0, r0, #1
  44:	8afffffe 	bhi	40 <__loop_delay>
  48:	e1a0f00e 	mov	pc, lr
  4c:	e320f000 	nop	{0}

, which now reports:
Calibrating delay loop... 996.14 BogoMIPS (lpj=4980736)

Some more test results:

On mx31 (ARM1136) running at 532 MHz, before the patch:
Calibrating delay loop... 351.43 BogoMIPS (lpj=1757184)

On mx31 (ARM1136) running at 532 MHz after the patch:
Calibrating delay loop... 528.79 BogoMIPS (lpj=2643968)

Also tested on mx6 (CortexA9) and on mx27 (ARM926), which shows the same
BogoMIPS value before and after this patch.

Reported-by: Tom Evans <tom_usenet@optusnet.com.au>
Suggested-by: Tom Evans <tom_usenet@optusnet.com.au>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-11-30 22:21:03 +00:00
Dave Martin
e2ccba4908 ARM: 7897/1: kexec: Use the right ISA for relocate_new_kernel
Copying a function with memcpy() and then trying to execute the
result isn't trivially portable to Thumb.

This patch modifies the kexec soft restart code to copy its
assembler trampoline relocate_new_kernel() using fncpy() instead,
so that relocate_new_kernel can be in the same ISA as the rest of
the kernel without problems.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Reported-by: Taras Kondratiuk <taras.kondratiuk@linaro.org>
Tested-by: Taras Kondratiuk <taras.kondratiuk@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-11-30 22:21:02 +00:00
Victor Kamensky
50913336ef ARM: 7895/1: signal: fix armv7-m build issue in sigreturn_codes.S
After "ARM: signal: sigreturn_codes should be endian neutral to
work in BE8" commit, thumb only platforms, like armv7m, fails to
compile sigreturn_codes.S. The reason is that for such arch
values '.arm' directive and arm opcodes are not allowed.

Fix conditionally enables arm opcodes only if no CONFIG_CPU_THUMBONLY
defined and it uses .org instructions to keep sigreturn_codes
layout.

Suggested-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-11-30 22:21:00 +00:00
Russell King
67130c5464 ARM: footbridge: fix EBSA285 LEDs
- The LEDs register is write-only: it can't be read-modify-written.
- The LEDs are write-1-for-off not 0.
- The check for the platform was inverted.

Fixes: cf6856d693 ("ARM: mach-footbridge: retire custom LED code")
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: stable@vger.kernel.org
2013-11-30 22:20:59 +00:00
Stephen M. Cameron
2e311fbabd [SCSI] hpsa: do not discard scsi status on aborted commands
We inadvertantly discarded the scsi status for aborted commands.
For some commands (e.g. reads from tape drives) these can't be retried,
and if we discarded the scsi status, the scsi mid layer couldn't notice
anything was wrong and the error was not reported.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2013-11-30 17:04:49 -05:00
Helge Deller
b8d8fccd68 parisc: remove CONFIG_MLONGCALLS=y from defconfigs
Signed-off-by: Helge Deller <deller@gmx.de>
2013-11-30 22:22:08 +01:00
Thomas Huth
99e872ae1e virtio_net: Fixed a trivial typo (fitler --> filter)
"MAC filter" sounds more reasonable than "MAC fitler".

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-30 16:14:24 -05:00
Helge Deller
161bd3bf60 parisc: fix kernel memory layout in vmlinux.ld.S
When building a 64bit kernel sometimes functions in the .init section were not
able to reach the standard kernel function. Main reason for this problem is,
that the linkage tables (.plt, .opd, .dlt) tend to become pretty huge and thus
the distance gets too big for short calls.

One option to avoid this is to use the -mlong-calls compiler option, but this
increases the binary size and introduces a performance penalty.

Instead, with this patch we just lay out the binary differently.  Init code is
stored first, followed by text, R/O and finally R/W data. This means, that init
and text code is now much closer to each other, which is sufficient to reach
each other by short calls.

Signed-off-by: Helge Deller <deller@gmx.de>
2013-11-30 22:09:21 +01:00
Helge Deller
c790b41bac parisc: use kernel_text_address() in unwind functions
Signed-off-by: Helge Deller <deller@gmx.de>
2013-11-30 22:08:54 +01:00
Chen Gang
6c3215cd5d parisc: remove empty SERIAL_PORT_DFNS in serial.h
If architectures don't support SERIAL_PORT_DFNS, they need not define it
to "nothing", the related drivers need do it by themselves (e.g.  8250
serial driver).

Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2013-11-30 21:02:18 +01:00
Helge Deller
8f96bdfd6f parisc: add some more machine names to hardware database
Sadly the correct names for machines which end with a question-mark aren't
known, so let's give it a best-guessed-name.

Signed-off-by: Helge Deller <deller@gmx.de>
2013-11-30 21:00:14 +01:00
Helge Deller
0576da2c08 parisc: fix mmap(MAP_FIXED|MAP_SHARED) to already mmapped address
locale-gen on Debian showed a strange problem on parisc:
mmap2(NULL, 536870912, PROT_NONE, MAP_SHARED, 3, 0) = 0x42a54000
mmap2(0x42a54000, 103860, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED, 3, 0) = -1 EINVAL (Invalid argument)

Basically it was just trying to re-mmap() a file at the same address
which it was given by a previous mmap() call. But this remapping failed
with EINVAL.

The problem is, that when MAP_FIXED and MAP_SHARED flags were used, we didn't
included the mapping-based offset when we verified the alignment of the given
fixed address against the offset which we calculated it in the previous call.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # 3.10+
2013-11-30 20:57:50 +01:00
stephen hemminger
eff7979f00 netem: fix gemodel loss generator
Patch from developers of the alternative loss models, downloaded from:
   http://netgroup.uniroma2.it/twiki/bin/view.cgi/Main/NetemCLG

 "in case 2, of the switch we change the direction of the inequality to
  net_random()>clg->a3, because clg->a3 is h in the GE model and when h
  is 0 all packets will be lost."

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-30 12:49:29 -05:00
stephen hemminger
ab6c27be81 netem: fix loss 4 state model
Patch from developers of the alternative loss models, downloaded from:
   http://netgroup.uniroma2.it/twiki/bin/view.cgi/Main/NetemCLG

 "In the case 1 of the switch statement in the if conditions we
   need to add clg->a4 to clg->a1, according to the model."

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-30 12:49:28 -05:00
stephen hemminger
7c2781fa92 netem: missing break in ge loss generator
There is a missing break statement in the Gilbert Elliot loss model
generator which makes state machine behave incorrectly.

Reported-by: Martin Burri <martin.burri@ch.abb.com
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-30 12:49:28 -05:00
Arvid Brodin
98bf836222 net/hsr: Support iproute print_opt ('ip -details ...')
This implements the rtnl_link_ops fill_info routine for HSR.

Signed-off-by: Arvid Brodin <arvid.brodin@alten.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-30 12:48:14 -05:00
Arvid Brodin
213e3bc723 net/hsr: Very small fix of comment style.
Signed-off-by: Arvid Brodin <arvid.brodin@alten.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-30 12:48:13 -05:00
Arvid Brodin
19990e29fe MAINTAINERS: Added net/hsr/ maintainer
Signed-off-by: Arvid Brodin <arvid.brodin@alten.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-30 12:48:13 -05:00
Hannes Frederic Sowa
7f88c6b23a ipv6: fix possible seqlock deadlock in ip6_finish_output2
IPv6 stats are 64 bits and thus are protected with a seqlock. By not
disabling bottom-half we could deadlock here if we don't disable bh and
a softirq reentrantly updates the same mib.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-30 12:48:13 -05:00
David S. Miller
696701b89d Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates

This series contains updates to igb, e1000 and ixgbe.

Akeem provides a igb fix where WOL was being reported as supported on
some ethernet devices which did not have that capability.

Yanjun provides a fix for e1000 which is similar to a previous fix
for e1000e commit bb9e44d0d0 ("e1000e: prevent oops when adapter is
being closed and reset simultaneously"), where the same issue was
observed on the older e1000 cards.

Vladimir Davydov provides 2 e1000 fixes.  The first fixes a lockdep
warning e1000_down() tries to synchronously cancel e1000 auxiliary
works (reset_task, watchdog_task, phy_info_task and fifo_stall_task)
which take adapter->mutex in their handlers.  The second patch is to
fix a possible race condition where reset_task() would be running
after adapter down.

John provides 2 fixes for ixgbe.  First turns ixgbe_fwd_ring_down
to static and the second disables NETIF_F_HW_L2FW_DOFFLOAD by default
because it allows upper layer net devices to use queues in the hardware
to directly submit and receive skbs.

Mark Rustad provides a single patch for ixgbe to make
ixgbe_identify_qsfp_module_generic static to resolve compile
warnings.

v2: Drop igb patch "igb: Update queue reinit function to call dev_close
    when init of queues fails" from Carolyn, so that the solution can
    be re-worked based on feedback from David Miller.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-30 12:42:20 -05:00
Russell King
43659222e7 ARM: footbridge: fix VGA initialisation
It's no good setting vga_base after the VGA console has been
initialised, because if we do that we get this:

Unable to handle kernel paging request at virtual address 000b8000
pgd = c0004000
[000b8000] *pgd=07ffc831, *pte=00000000, *ppte=00000000
0Internal error: Oops: 5017 [#1] ARM
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 3.12.0+ #49
task: c03e2974 ti: c03d8000 task.ti: c03d8000
PC is at vgacon_startup+0x258/0x39c
LR is at request_resource+0x10/0x1c
pc : [<c01725d0>]    lr : [<c0022b50>]    psr: 60000053
sp : c03d9f68  ip : 000b8000  fp : c03d9f8c
r10: 000055aa  r9 : 4401a103  r8 : ffffaa55
r7 : c03e357c  r6 : c051b460  r5 : 000000ff  r4 : 000c0000
r3 : 000b8000  r2 : c03e0514  r1 : 00000000  r0 : c0304971
Flags: nZCv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment kernel

which is an access to the 0xb8000 without the PCI offset required to
make it work.

Fixes: cc22b4c185 ("ARM: set vga memory base at run-time")
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: <stable@vger.kernel.org>
2013-11-30 14:45:32 +00:00
Russell King
d8aa712c30 ARM: fix booting low-vectors machines
Commit f6f91b0d9f (ARM: allow kuser helpers to be removed from the
vector page) required two pages for the vectors code.  Although the
code setting up the initial page tables was updated, the code which
allocates page tables for new processes wasn't, neither was the code
which tears down the mappings.  Fix this.

Fixes: f6f91b0d9f ("ARM: allow kuser helpers to be removed from the vector page")
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: <stable@vger.kernel.org>
2013-11-30 14:45:31 +00:00
Russell King
11a5aa3256 ARM: dma-mapping: check DMA mask against available memory
Some buses have negative offsets, which causes the DMA mask checks to
falsely fail.  Fix this by using the actual amount of memory fitted in
the system.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-11-30 14:45:29 +00:00
Mark Rustad
8821754704 ixgbe: Make ixgbe_identify_qsfp_module_generic static
Correct a namespace complaint by making the function static
and moving the prototype into the .c file.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-11-30 00:21:47 -08:00
John Fastabend
8bf1264d2f ixgbe: turn NETIF_F_HW_L2FW_DOFFLOAD off by default
NETIF_F_HW_L2FW_DOFFLOAD allows upper layer net devices such
as macvlan to use queues in the hardware to directly submit and
receive skbs.

This creates a subtle change in the datapath though. One change
being the skb may no longer use the root devices qdisc.

Because users may not expect this we can't enable the feature
by default unless the hardware can offload all the software
functionality above it. So for now disable it by default and
let users opt in.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-11-30 00:15:16 -08:00
John Fastabend
ae72c8d068 ixgbe: ixgbe_fwd_ring_down needs to be static
When compiling with -Wstrict-prototypes gcc catches a static
I missed.

./ixgbe_main.c:4254: warning: no previous prototype for 'ixgbe_fwd_ring_down'

Reported-by: Phillip Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-11-30 00:08:45 -08:00
Vladimir Davydov
74a1b1ea8a e1000: fix possible reset_task running after adapter down
On e1000_down(), we should ensure every asynchronous work is canceled
before proceeding. Since the watchdog_task can schedule other works
apart from itself, it should be stopped first, but currently it is
stopped after the reset_task. This can result in the following race
leading to the reset_task running after the module unload:

e1000_down_and_stop():			e1000_watchdog():
----------------------			-----------------

cancel_work_sync(reset_task)
					schedule_work(reset_task)
cancel_delayed_work_sync(watchdog_task)

The patch moves cancel_delayed_work_sync(watchdog_task) at the beginning
of e1000_down_and_stop() thus ensuring the race is impossible.

Cc: Tushar Dave <tushar.n.dave@intel.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-11-30 00:02:12 -08:00
Vladimir Davydov
b2f963bfae e1000: fix lockdep warning in e1000_reset_task
The patch fixes the following lockdep warning, which is 100%
reproducible on network restart:

======================================================
[ INFO: possible circular locking dependency detected ]
3.12.0+ #47 Tainted: GF
-------------------------------------------------------
kworker/1:1/27 is trying to acquire lock:
 ((&(&adapter->watchdog_task)->work)){+.+...}, at: [<ffffffff8108a5b0>] flush_work+0x0/0x70

but task is already holding lock:
 (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&adapter->mutex){+.+...}:
       [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120
       [<ffffffff816b8cbc>] mutex_lock_nested+0x4c/0x390
       [<ffffffffa017233d>] e1000_watchdog+0x7d/0x5b0 [e1000]
       [<ffffffff8108b972>] process_one_work+0x1d2/0x510
       [<ffffffff8108ca80>] worker_thread+0x120/0x3a0
       [<ffffffff81092c1e>] kthread+0xee/0x110
       [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0

-> #0 ((&(&adapter->watchdog_task)->work)){+.+...}:
       [<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810
       [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120
       [<ffffffff8108a5eb>] flush_work+0x3b/0x70
       [<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140
       [<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20
       [<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000]
       [<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000]
       [<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000]
       [<ffffffff8108b972>] process_one_work+0x1d2/0x510
       [<ffffffff8108ca80>] worker_thread+0x120/0x3a0
       [<ffffffff81092c1e>] kthread+0xee/0x110
       [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&adapter->mutex);
                               lock((&(&adapter->watchdog_task)->work));
                               lock(&adapter->mutex);
  lock((&(&adapter->watchdog_task)->work));

 *** DEADLOCK ***

3 locks held by kworker/1:1/27:
 #0:  (events){.+.+.+}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510
 #1:  ((&adapter->reset_task)){+.+...}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510
 #2:  (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000]

stack backtrace:
CPU: 1 PID: 27 Comm: kworker/1:1 Tainted: GF            3.12.0+ #47
Hardware name: System manufacturer System Product Name/P5B-VM SE, BIOS 0501    05/31/2007
Workqueue: events e1000_reset_task [e1000]
 ffffffff820f6000 ffff88007b9dba98 ffffffff816b54a2 0000000000000002
 ffffffff820f5e50 ffff88007b9dbae8 ffffffff810ba936 ffff88007b9dbac8
 ffff88007b9dbb48 ffff88007b9d8f00 ffff88007b9d8780 ffff88007b9d8f00
Call Trace:
 [<ffffffff816b54a2>] dump_stack+0x49/0x5f
 [<ffffffff810ba936>] print_circular_bug+0x216/0x310
 [<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810
 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250
 [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120
 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250
 [<ffffffff8108a5eb>] flush_work+0x3b/0x70
 [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250
 [<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140
 [<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20
 [<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000]
 [<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000]
 [<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000]
 [<ffffffff8108b972>] process_one_work+0x1d2/0x510
 [<ffffffff8108b906>] ? process_one_work+0x166/0x510
 [<ffffffff8108ca80>] worker_thread+0x120/0x3a0
 [<ffffffff8108c960>] ? manage_workers+0x2c0/0x2c0
 [<ffffffff81092c1e>] kthread+0xee/0x110
 [<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70
 [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0
 [<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70

== The issue background ==

The problem occurs, because e1000_down(), which is called under
adapter->mutex by e1000_reset_task(), tries to synchronously cancel
e1000 auxiliary works (reset_task, watchdog_task, phy_info_task,
fifo_stall_task), which take adapter->mutex in their handlers. So the
question is what does adapter->mutex protect there?

The adapter->mutex was introduced by commit 0ef4ee ("e1000: convert to
private mutex from rtnl") as a replacement for rtnl_lock() taken in the
asynchronous handlers. It targeted on fixing a similar lockdep warning
issued when e1000_down() was called under rtnl_lock(), and it fixed it,
but unfortunately it introduced the lockdep warning described above.
Anyway, that said the source of this bug is that the asynchronous works
were made to take rtnl_lock() some time ago, so let's look deeper and
find why it was added there.

The rtnl_lock() was added to asynchronous handlers by commit 338c15
("e1000: fix occasional panic on unload") in order to prevent
asynchronous handlers from execution after the module is unloaded
(e1000_down() is called) as it follows from the comment to the commit:

> Net drivers in general have an issue where timers fired
> by mod_timer or work threads with schedule_work are running
> outside of the rtnl_lock.
>
> With no other lock protection these routines are vulnerable
> to races with driver unload or reset paths.
>
> The longer term solution to this might be a redesign with
> safer locks being taken in the driver to guarantee no
> reentrance, but for now a safe and effective fix is
> to take the rtnl_lock in these routines.

I'm not sure if this locking scheme fixed the problem or just made it
unlikely, although I incline to the latter. Anyway, this was long time
ago when e1000 auxiliary works were implemented as timers scheduling
real work handlers in their routines. The e1000_down() function only
canceled the timers, but left the real handlers running if they were
running, which could result in work execution after module unload.
Today, the e1000 driver uses sane delayed works instead of the pair
timer+work to implement its delayed asynchronous handlers, and the
e1000_down() synchronously cancels all the works so that the problem
that commit 338c15 tried to cope with disappeared, and we don't need any
locks in the handlers any more. Moreover, any locking there can
potentially result in a deadlock.

So, this patch reverts commits 0ef4ee and 338c15.

Fixes: 0ef4eedc2e ("e1000: convert to private mutex from rtnl")
Fixes: 338c15e470 ("e1000: fix occasional panic on unload")
Cc: Tushar Dave <tushar.n.dave@intel.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-11-29 23:55:40 -08:00
yzhu1
6a7d64e3e0 e1000: prevent oops when adapter is being closed and reset simultaneously
This change is based on a similar change made to e1000e support in
commit bb9e44d0d0 ("e1000e: prevent oops when adapter is being closed
and reset simultaneously").  The same issue has also been observed
on the older e1000 cards.

Here, we have increased the RESET_COUNT value to 50 because there are too
many accesses to e1000 nic on stress tests to e1000 nic, it is not enough
to set RESET_COUT 25. Experimentation has shown that it is enough to set
RESET_COUNT 50.

Signed-off-by: yzhu1 <yanjun.zhu@windriver.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-11-29 23:49:05 -08:00
Akeem G Abodunrin
42ce4126d8 igb: Fixed Wake On LAN support
This patch fixes Wake on LAN being reported as supported on some Ethernet
ports, in contrary to Hardware capability.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2013-11-29 23:42:35 -08:00
Tejun Heo
bfc5c17337 sysfs, kernfs: remove cross inclusions of internal headers
fs/kernfs/kernfs-internal.h needed to include fs/sysfs/sysfs.h because
part of kernfs core implementation was living in sysfs.

fs/sysfs/sysfs.h needed to include fs/kernfs/kernfs-internal.h because
include/linux/kernfs.h didn't expose enough interface.

The separation is complete and neither is true anymore.  Remove the
cross inclusion and make sysfs a proper user of kernfs.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-29 18:54:50 -08:00
Tejun Heo
ac9bba0310 sysfs, kernfs: implement kernfs_ns_enabled()
fs/sysfs/symlink.c::sysfs_delete_link() tests @sd->s_flags for
SYSFS_FLAG_NS.  Let's add kernfs_ns_enabled() so that sysfs doesn't
have to test sysfs_dirent flag directly.  This makes things tidier for
kernfs proper too.

This is purely cosmetic.

v2: To avoid possible NULL deref, use noop dummy implementation which
    always returns false when !CONFIG_SYSFS.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-29 18:41:28 -08:00
Tejun Heo
cf9e5a73aa sysfs, kernfs: make sysfs_dirent definition public
sysfs_dirent includes some information which should be available to
kernfs users - the type, flags, name and parent pointer.  This patch
moves sysfs_dirent definition from kernfs/kernfs-internal.h to
include/linux/kernfs.h so that kernfs users can access them.

The type part of flags is exported as enum kernfs_node_type, the flags
kernfs_node_flag, sysfs_type() and kernfs_enable_ns() are moved to
include/linux/kernfs.h and the former is updated to return the enum
type.  sysfs_dirent->s_parent and ->s_name are marked explicitly as
public.

This patch doesn't introduce any functional changes.

v2: Flags exported too and kernfs_enable_ns() definition moved.

v3: While moving kernfs_enable_ns() to include/linux/kernfs.h, v1 and
    v2 put the definition outside CONFIG_SYSFS replacing the dummy
    implementation with the actual implementation too.  Unfortunately,
    this can lead to oops when !CONFIG_SYSFS because
    kernfs_enable_ns() may be called on a NULL @sd and now tries to
    dereference @sd instead of not doing anything.  This issue was
    reported by Yuanhan Liu.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-29 18:21:01 -08:00
Tejun Heo
fa736a951e sysfs, kernfs: move mount core code to fs/kernfs/mount.c
Move core mount code to fs/kernfs/mount.c.  The respective
declarations in fs/sysfs/sysfs.h are moved to
fs/kernfs/kernfs-internal.h.

This is pure relocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-29 18:16:08 -08:00
Tejun Heo
4b93dc9b1c sysfs, kernfs: prepare mount path for kernfs
We're in the process of separating out core sysfs functionality into
kernfs which will deal with sysfs_dirents directly.  This patch
rearranges mount path so that the kernfs and sysfs parts are separate.

* As sysfs_super_info won't be visible outside kernfs proper,
  kernfs_super_ns() is added to allow kernfs users to access a
  super_block's namespace tag.

* Generic mount operation is separated out into kernfs_mount_ns().
  sysfs_mount() now just performs sysfs-specific permission check,
  acquires namespace tag, and invokes kernfs_mount_ns().

* Generic superblock release is separated out into kernfs_kill_sb()
  which can be used directly as file_system_type->kill_sb().  As sysfs
  needs to put the namespace tag, sysfs_kill_sb() wraps
  kernfs_kill_sb() with ns tag put.

* sysfs_dir_cachep init and sysfs_inode_init() are separated out into
  kernfs_init().  kernfs_init() uses only small amount of memory and
  trying to handle and propagate kernfs_init() failure doesn't make
  much sense.  Use SLAB_PANIC for sysfs_dir_cachep and make
  sysfs_inode_init() panic on failure.

  After this change, kernfs_init() should be called before
  sysfs_init(), fs/namespace.c::mnt_init() modified accordingly.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-29 18:16:08 -08:00
Tejun Heo
df394fb56c sysfs, kernfs: make super_blocks bind to different kernfs_roots
kernfs is being updated to allow multiple sysfs_dirent hierarchies so
that it can also be used by other users.  Currently, sysfs
super_blocks are always attached to one kernfs_root - sysfs_root - and
distinguished only by their namespace tags.

This patch adds sysfs_super_info->root and update
sysfs_fill/test_super() so that super_blocks are identified by the
combination of both the associated kernfs_root and namespace tag.
This allows mounting different kernfs hierarchies.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-29 18:10:48 -08:00
Tejun Heo
bc755553df sysfs, kernfs: make inode number ida per kernfs_root
kernfs is being updated to allow multiple sysfs_dirent hierarchies so
that it can also be used by other users.  Currently, inode number is
allocated using a global ida, sysfs_ino_ida; however, inos for
different hierarchies should be handled separately.

This patch makes ino allocation per kernfs_root.  sysfs_ino_ida is
replaced by kernfs_root->ino_ida and sysfs_new_dirent() is updated to
take @root and allocate ino from it.  ida_simple_get/remove() are used
instead of sysfs_ino_lock and sysfs_alloc/free_ino().

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-29 18:10:48 -08:00
Tejun Heo
ba7443bc65 sysfs, kernfs: implement kernfs_create/destroy_root()
There currently is single kernfs hierarchy in the whole system which
is used for sysfs.  kernfs needs to support multiple hierarchies to
allow other users.  This patch introduces struct kernfs_root which
serves as the root of each kernfs hierarchy and implements
kernfs_create/destroy_root().

* Each kernfs_root is associated with a root sd (sysfs_dentry).  The
  root is freed when the root sd is released and kernfs_destory_root()
  simply invokes kernfs_remove() on the root sd.  sysfs_remove_one()
  is updated to handle release of the root sd.  Note that ps_iattr
  update in sysfs_remove_one() is trivially updated for readability.

* Root sd's are now dynamically allocated using sysfs_new_dirent().
  Update sysfs_alloc_ino() so that it gives out ino from 1 so that the
  root sd still gets ino 1.

* While kernfs currently only points to the root sd, it'll soon grow
  fields which are specific to each hierarchy.  As determining a given
  sd's root will be necessary, sd->s_dir.root is added.  This backlink
  fits better as a separate field in sd; however, sd->s_dir is inside
  union with space to spare, so use it to save space and provide
  kernfs_root() accessor to determine the root sd.

* As hierarchies may be destroyed now, each mount needs to hold onto
  the hierarchy it's attached to.  Update sysfs_fill_super() and
  sysfs_kill_sb() so that they get and put the kernfs_root
  respectively.

* sysfs_root is replaced with kernfs_root which is dynamically created
  by invoking kernfs_create_root() from sysfs_init().

This patch doesn't introduce any visible behavior changes.

v2: kernfs_create_root() forgot to set @sd->priv.  Fixed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-29 18:10:48 -08:00
Roberto Sassu
af91706d5d ima: store address of template_fmt_copy in a pointer before calling strsep
This patch stores the address of the 'template_fmt_copy' variable in a new
variable, called 'template_fmt_ptr', so that the latter is passed as an
argument of strsep() instead of the former. This modification is needed
in order to correctly free the memory area referenced by
'template_fmt_copy' (strsep() modifies the pointer of the passed string).

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-11-30 13:09:53 +11:00
Tejun Heo
061447a496 sysfs, kernfs: introduce sysfs_root_sd
Currently, it's assumed that there's a single kernfs hierarchy in the
system anchored at sysfs_root which is defined as a global struct.  To
allow other users of kernfs, this will be made dynamic.  Introduce a
new global variable sysfs_root_sd which points to &sysfs_root and
convert all &sysfs_root users.

This patch doesn't introduce any behavior difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-29 18:09:27 -08:00
Tejun Heo
9e30cc9595 sysfs, kernfs: no need to kern_mount() sysfs from sysfs_init()
It has been very long since sysfs depended on vfs to keep track of
internal states and whether sysfs is mounted or not doesn't make any
difference to sysfs's internal operation.

In addition to init and filesystem type registration, sysfs_init()
invokes kern_mount() to create in-kernel mount of sysfs.  This
internal mounting doesn't server any purpose anymore.  Remove it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-29 18:09:27 -08:00
Tejun Heo
51a35e9fd0 sysfs, kernfs: make sysfs_super_info->ns const
Add const qualifier to sysfs_super_info->ns so that it's consistent
with other namespace tag usages in sysfs.  Because kobject doesn't use
const qualifier for namespace tags, this ends up requiring an explicit
cast to drop const qualifier in free_sysfs_super_info().

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-29 18:09:27 -08:00
Tejun Heo
ccc532dc12 sysfs, kernfs: drop unused params from sysfs_fill_super()
sysfs_fill_super() takes three params - @sb, @data and @silent - but
uses only @sb.  Drop the latter two.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-11-29 18:08:39 -08:00