The software stats are updated from BH, this change ensures that 32b
UP hosts use appropriate protection.
Tested:
- HW/SW stats consistent with pktgen (in particular tx=rx)
- HW/SW stats consistent when tx/rx offloads disabled
- no problem with+without lockdep (SMP 16-way)
Signed-off-by: David Decotigny <david.decotigny@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is very similar to bnx2x conversion, but bnx2 only requires 16bytes
alignement at start of the received frame to store its l2_fhdr, so goal
was not to reduce skb truesize (in fact it should not change after this
patch)
Using build_skb() reduces cache line misses in the driver, since we
use cache hot skb instead of cold ones. Number of in-flight sk_buff
structures is lower, they are more likely recycled in SLUB caches
while still hot.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Michael Chan <mchan@broadcom.com>
CC: Eilon Greenstein <eilong@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some of the sun4v code patching occurs in inline functions visible
to, and usable by, modules.
Therefore we have to patch them up during module load.
Signed-off-by: David S. Miller <davem@davemloft.net>
If IRQ was never initialized, then calling napi_disable() would hang.
Add more bookkeeping to track whether IRQ was ever initialized.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware has a restriction that the minimum ring size possible
is 128. The number of elements used is controlled by tx_pending and
the overall number of elements in the ring tx_ring_size, therefore it
is okay to limit the number of elements in use to a small value (63)
but still provide a bigger ring.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To handle the large physical addresses, just make a simple wrapper
around remap_pfn_range() like MIPS does.
Signed-off-by: David S. Miller <davem@davemloft.net>
When changing mode via bonding's sysfs, the slaves are not initialized
correctly. Forbid to change modes with slaves present to ensure that every
slave is initialized correctly via bond_enslave().
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Acked-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most machines dont use RPS/RFS, and pay a fair amount of instructions in
netif_receive_skb() / netif_rx() / get_rps_cpu() just to discover
RPS/RFS is not setup.
Add a jump_label named rps_needed.
If no device rps_map or global rps_sock_flow_table is setup,
netif_receive_skb() / netif_rx() do a single instruction instead of many
ones, including conditional jumps.
jmp +0 (if CONFIG_JUMP_LABEL=y)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The re-binding (unbind..bind) of an EMAC device fails because the
static variable "busy_phy_map" is not updated when the device is
removed.
Signed-off-by: Wolfgang Grandegger <wg@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
We pull one byte (the MAC header) from the first fragment before the
fragment is actually appended. So the socket buffer length is 1, not 0.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix build regression introduced by commit 056879d2f2
(ARM: mach-shmobile: sh7372 A3SP no_suspend_console fix) by moving
the intialization of the A3SP domain to a separate function and
providing an empty definition of it for CONFIG_PM unset.
Reported-and-tested-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Commit 4ca46ff3e0 (PM / Sleep: Mark
devices involved in wakeup signaling during suspend) introduced
the power.wakeup_path field in struct dev_pm_info to mark devices
whose children are enabled to wake up the system from sleep states,
so that power domains containing the parents that provide their
children with wakeup power and/or relay their wakeup signals are not
turned off. Unfortunately, that introduced a PM regression on SH7372
whose power consumption in the system "memory sleep" state increased
as a result of it, because it prevented the power domain containing
the I2C controller from being turned off when some children of that
controller were enabled to wake up the system, although the
controller was not necessary for them to signal wakeup.
To fix this issue use the observation that devices whose
power.ignore_children flag is set for runtime PM should be treated
analogously during system suspend. Namely, they shouldn't be
included in wakeup paths going through their children. Since the
SH7372 I2C controller's power.ignore_children flag is set, doing so
will restore the previous behavior of that SOC.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
These two syscalls were introduced during the last merge window.
Add the entries into the ARM call tables for them.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This reverts commit a15bd354f0.
It exceeded the padding on the SREGS struct, rendering the ABI
backwards-incompatible.
Conflicts:
arch/powerpc/kvm/powerpc.c
include/linux/kvm.h
Signed-off-by: Avi Kivity <avi@redhat.com>
Support guest/host-only profiling by switch perf msrs on
a guest entry if needed.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Some cpus have special support for switching PERF_GLOBAL_CTRL msr.
Add logic to detect if such support exists and works properly and extend
msr switching code to use it if available. Also extend number of generic
msr switching entries to 8.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
KVM on s390 always had a sync mmu. Any mapping change in userspace
mapping was always reflected immediately in the guest mapping.
- In older code the guest mapping was just an offset
- In newer code the last level page table is shared
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
There is a potential host deadlock in the tprot intercept handling.
We must not hold the mmap semaphore while resolving the guest
address. If userspace is remapping, then the memory detection in
the guest is broken anyway so we can safely separate the
address translation from walking the vmas.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
SIGP sense running may cause an intercept on higher level
virtualization, so handle it by checking the CPUSTAT_RUNNING flag.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
CPUSTAT_RUNNING was implemented signifying that a vcpu is not stopped.
This is not, however, what the architecture says: RUNNING should be
set when the host is acting on the behalf of the guest operating
system.
CPUSTAT_RUNNING has been changed to be set in kvm_arch_vcpu_load()
and to be unset in kvm_arch_vcpu_put().
For signifying stopped state of a vcpu, a host-controlled bit has
been used and is set/unset basically on the reverse as the old
CPUSTAT_RUNNING bit (including pushing it down into stop handling
proper in handle_stop()).
Cc: stable@kernel.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
If there is an architecture-specific random number generator we use it
to acquire randomness one "long" at a time. We should put these random
words into consecutive words in the result buffer - not just overwrite
the first word again and again.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix build warnings:
drivers/platform/x86/dell-laptop.c:592:13: warning: function declaration isn't a prototype
drivers/platform/x86/dell-laptop.c:599:13: warning: function declaration isn't a prototype
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix x86 allyesconfig builds. Builds fail due to a non-static variable
named 'debug' in drivers/staging/media/as102:
arch/x86/built-in.o:arch/x86/kernel/entry_32.S:1296: first defined here
ld: Warning: size of symbol `debug' changed from 90 in arch/x86/built-in.o to 4 in drivers/built-in.o
Thou shalt have no non-static identifiers that are named 'debug'.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Pierrick Hascoet <pierrick.hascoet@abilis.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rename dependency of EXYNOS4_TMU in Kconfig to the existing one.
Signed-off-by: Donggeun Kim <dg77.kim@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
This patch fixes a crash when non existing IPv6 route is tried to be changed.
When new destination node was inserted in middle of FIB6 tree, no relevant
sanity checks were performed. Later route insertion might have been prevented
due to invalid request, causing node with no rt info being left in tree.
When this node was accessed, a crash occurred.
Patch adds missing checks in fib6_add_1()
Signed-off-by: Matti Vaittinen <Mazziesaccount@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the pm functions to avoid the system
sleeps while a spinlock is taken.
Signed-off-by: Francesco Virlinzi <francesco.virlinzi@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes un-needed spin_lock in stmmac_ioctl while reading and
writing mdio registers. While holding spin_lock the code must be
atomic, which is not true in this case as both mdiobus_read and writes
have mutex locks.
Without this patch reading mdio registers via mii-tool results in below
BUG:
mii-tool -vvv eth0"
Using SIOCGMIIPHY=0x8947
BUG: sleeping function called from invalid context at kernel/mutex.c:287
in_atomic(): 1, irqs_disabled(): 0, pid: 614, name: mii-tool
2 locks held by mii-tool/614:
#0: (rtnl_mutex){......}, at: [<c01fd80c>] dev_ioctl+0x550/0x674
#1: (&priv->lock){......}, at: [<c01b34ec>] stmmac_ioctl+0x4c/0x78
[<c002ea14>] (unwind_backtrace+0x0/0xcc) from [<c0272c38>]
(mutex_lock_nested+0x24/0x35c)
[<c0272c38>] (mutex_lock_nested+0x24/0x35c) from [<c01b237c>]
(mdiobus_read+0x44/0x70)
[<c01b237c>] (mdiobus_read+0x44/0x70) from [<c01b0c64>]
(phy_mii_ioctl+0x4c/0x138)
[<c01b0c64>] (phy_mii_ioctl+0x4c/0x138) from [<c01b34fc>]
(stmmac_ioctl+0x5c/0x78)
[<c01b34fc>] (stmmac_ioctl+0x5c/0x78) from [<c01fcec8>]
(dev_ifsioc+0x2a4/0x2c8)
[<c01fcec8>] (dev_ifsioc+0x2a4/0x2c8) from [<c01fd81c>]
(dev_ioctl+0x560/0x674)
[<c01fd81c>] (dev_ioctl+0x560/0x674) from [<c00c36e0>]
(vfs_ioctl+0x2c/0x8c)
[<c00c36e0>] (vfs_ioctl+0x2c/0x8c) from [<c00c4130>]
(do_vfs_ioctl+0x530/0x578)
[<c00c4130>] (do_vfs_ioctl+0x530/0x578) from [<c00c41ac>]
(sys_ioctl+0x34/0x54)
[<c00c41ac>] (sys_ioctl+0x34/0x54) from [<c0028aa0>]
(ret_fast_syscall+0x0/0x2c)
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
New GMAC devices (newer than the databook 3.50a) have the
HW capability register that provides which features are actually
supported by the hardware.
On old devices many information have to be passed through the
platform, for example: enhanced descriptor structure,
TX COE etc. These are mandatory to properly configure the driver.
This remains still valid because the driver has to support old
Synopsys devices but now it's also able to override them using the
values from the HW capability register if supported.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the way to stop the 1000Base advertising
capabilties for non GMII interfaces.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch uses an mdelay to manage the timeout on
sw reset to be independant of cpu_clk.
Signed-off-by: Francesco Virlinzi <francesco.virlinzi@st.com>
Reviewed-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On PPC64, put_sigset_t converts a sigset_t to a compat_sigset_t
before copying it to userspace. There is a typo in the case that
we have 4 words to copy, meaning that we corrupt the compat_sigset_t.
It appears that _NSIG_WORDS can't be greater than 2 at the moment
so this code is probably always optimised away anyway.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The Documentation/memory-barriers.txt document requires that atomic
operations that return a value act as a memory barrier both before
and after the actual atomic operation.
Our current implementation doesn't guarantee this. More specifically,
while a load following the isync can not be issued before stwcx. has
completed, that completion doesn't architecturally means that the
result of stwcx. is visible to other processors (or any previous stores
for that matter) (typically, the other processors L1 caches can still
hold the old value).
This has caused an actual crash in RCU torture testing on Power 7
This fixes it by changing those atomic ops to use new macros instead
of RELEASE/ACQUIRE barriers, called ATOMIC_ENTRY and ATMOIC_EXIT barriers,
which are then defined respectively to lwsync and sync.
I haven't had a chance to measure the performance impact (or rather
what I measured with kernel compiles is in the noise, I yet have to
find a more precise benchmark)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Recent binutils refuses to assemble AltiVec opcodes when in e500/SPE
mode, as some of those opcodes alias the "SPE" instructions. This
triggers an ancient binutils version check even when building a kernel
with CONFIG_ALTIVEC disabled.
In theory, the check could be conditionalized on CONFIG_ALTIVEC, but in
practice it has long outlived its utility. It is virtually impossible
to find binutils older than 2.12.1 (released 2002) in the wild anymore.
Even ancient RedHat Enterprise Linux 4 has binutils-2.14.
To fix the kernel build when done natively on e500 systems with this new
binutils, the test is simply removed.
Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
With the introduction of CONFIG_PPC_ADV_DEBUG_REGS user space debug is
broken on Book-E 64-bit parts that support delayed debug events. When
switch_booke_debug_regs() sets DBCR0 we'll start getting debug events as
MSR_DE is also set and we aren't able to handle debug events from kernel
space.
We can remove the hack that always enables MSR_DE and loads up DBCR0 and
just utilize switch_booke_debug_regs() to get user space debug working
again.
We still need to handle critical/debug exception stacks & proper
save/restore of state for those exception levles to support debug events
from kernel space like we have on 32-bit.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
All of DebugException is already protected by CONFIG_PPC_ADV_DEBUG_REGS
there is no need to have another such ifdef inside the function.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We had an existing ifdef for 4xx & BOOKE processors that got changed to
CONFIG_PPC_ADV_DEBUG_REGS. The define has nothing to do with
CONFIG_PPC_ADV_DEBUG_REGS. The define really should be:
#if defined(CONFIG_4xx) || defined(CONFIG_BOOKE)
and not
#ifdef CONFIG_PPC_ADV_DEBUG_REGS
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Mark stats timer as deferrable: punctuality in waking the stats timer
callback doesn't matter much, as it is responsible only to avoid
integer wraparound.
We need at least 1 other timer to fire within 17s (fully loaded 1Gbps)
to avoid wrap-arounds. Desired period is still 10s.
Signed-off-by: David Decotigny <david.decotigny@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds code to update the stats counter for dropped RX frames.
Signed-off-by: David Decotigny <david.decotigny@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit implements the ndo_get_stats64() API for forcedeth. Since
hardware stats are being updated from different contexts (process and
timer), this commit adds synchronization. For software stats, it
relies on the u64_stats_sync.h API.
Tested:
- 16-way SMP x86_64 ->
RX bytes:7244556582 (7.2 GB) TX bytes:181904254 (181.9 MB)
- pktgen + loopback: identical rx_bytes/tx_bytes and rx_packets/tx_packets
Signed-off-by: David Decotigny <david.decotigny@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds a new module parameter "debug_tx_timeout" to silence most
debug messages in case of TX timeout. These messages don't provide a
signal/noise ratio high enough for production systems and, with ~30kB
logged each time, they tend to add to a cascade effect if the system
is already under stress (memory pressure, disk, etc.).
By default, the parameter is clear, meaning that only a single warning
will be reported.
Signed-off-by: David Decotigny <david.decotigny@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>