This will help figuring out GPU when looking at bugs log.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Transmitted skb might be attached to a socket and a destructor, for
memory accounting purposes.
Traditionally, this destructor is called at tx completion time, when skb
is freed.
When tx completion is performed by another cpu than the sender, this
forces some cache lines to change ownership. XPS was an attempt to give
tx completion to initial cpu.
David idea is to call destructor right before giving skb to device (call
to ndo_start_xmit()). Because device queues are usually small, orphaning
skb before tx completion is not a big deal. Some drivers already do
this, we could do it in upper level.
There is one known exception to this early orphaning, called tx
timestamping. It needs to keep a reference to socket until device can
give a hardware or software timestamp.
This patch adds a skb_orphan_try() helper, to centralize all exceptions
to early orphaning in one spot, and use it in dev_hard_start_xmit().
"tbench 16" results on a Nehalem machine (2 X5570 @ 2.93GHz)
before: Throughput 4428.9 MB/sec 16 procs
after: Throughput 4448.14 MB/sec 16 procs
UDP should get even better results, its destructor being more complex,
since SOCK_USE_WRITE_QUEUE is not set (four atomic ops instead of one)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- There is no point to enforce a time limit in process_backlog(), since
other napi instances dont follow same rule. We can exit after only one
packet processed...
The normal quota of 64 packets per napi instance should be the norm, and
net_rx_action() already has its own time limit.
Note : /proc/net/core/dev_weight can be used to tune this 64 default
value.
- Use DEFINE_PER_CPU_ALIGNED for softnet_data definition.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 5a0e3ad causes slab.h to be included twice in many of the
Gigaset driver's source files, first via the common include file
gigaset.h and then a second time directly. Drop the spares, and
use the opportunity to clean up a few more similar cases.
Impact: cleanup, no functional change
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
CC: Tejun Heo <tj@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel:
drm/i915: Ignore LVDS EDID when it is unavailabe or invalid
drm/i915: Add no_lvds entry for the Clientron U800
drm/i915: Rename many remaining uses of "output" to encoder or connector.
drm/i915: Rename intel_output to intel_encoder.
agp/intel: intel_845_driver is an agp driver!
drm/i915: introduce to_intel_bo helper
drm/i915: Disable FBC on 915GM and 945GM.
Use local_irq_restore in this error-handling case just like in the one just
below.
A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@r exists@
expression E1;
identifier f;
@@
f (...) { <+...
* local_irq_save (E1,...);
... when != E1
* return ...;
...+> }
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Many PCMCIA GPRS modems like: Onda Edge N100E, Novaway PC98 (OEM SPC98Z),
Rovermate Edgus Adaptmate-039 and others have same construction and
identification:
lspcmcia -vvv
Product Name: Generic Modem: MD55x 1.00 Serial number: xxxxx-xxx
Identification: manf_id: 0x015d card_id: 0x4c45
function: 2 (serial)
prod_id(1): "Generic" (0xc49e4731)
prod_id(2): "Modem: MD55x" (0x8913b110)
prod_id(3): "1.00" (0x83dbf271)
prod_id(4): "Serial number: xxxxx-xxx" (0x73ee9514)
Serial connection to GSM module based on Elan VPU16551 PCMCIA UART with
datasheet recommeded 14.7456MHz crystal oscillator.
By default serial_cs set UART clock == 1843200 Hz
For correct work need set clock 14745600 Hz.
This quirk already present in driver, only need add device in quirk list.
Signed-off-by: Timur Maximov <xcom.org@gmail.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
pccard_validate_cis() nowadays destroys the CIS cache. Therefore,
calling it after card setup should be avoided. We can't control
the deprecated PCMCIA ioctl (which is only used on ARM nowadays),
but we can avoid -- and report -- any other calls.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
This patch implements receive flow steering (RFS). RFS steers
received packets for layer 3 and 4 processing to the CPU where
the application for the corresponding flow is running. RFS is an
extension of Receive Packet Steering (RPS).
The basic idea of RFS is that when an application calls recvmsg
(or sendmsg) the application's running CPU is stored in a hash
table that is indexed by the connection's rxhash which is stored in
the socket structure. The rxhash is passed in skb's received on
the connection from netif_receive_skb. For each received packet,
the associated rxhash is used to look up the CPU in the hash table,
if a valid CPU is set then the packet is steered to that CPU using
the RPS mechanisms.
The convolution of the simple approach is that it would potentially
allow OOO packets. If threads are thrashing around CPUs or multiple
threads are trying to read from the same sockets, a quickly changing
CPU value in the hash table could cause rampant OOO packets--
we consider this a non-starter.
To avoid OOO packets, this solution implements two types of hash
tables: rps_sock_flow_table and rps_dev_flow_table.
rps_sock_table is a global hash table. Each entry is just a CPU
number and it is populated in recvmsg and sendmsg as described above.
This table contains the "desired" CPUs for flows.
rps_dev_flow_table is specific to each device queue. Each entry
contains a CPU and a tail queue counter. The CPU is the "current"
CPU for a matching flow. The tail queue counter holds the value
of a tail queue counter for the associated CPU's backlog queue at
the time of last enqueue for a flow matching the entry.
Each backlog queue has a queue head counter which is incremented
on dequeue, and so a queue tail counter is computed as queue head
count + queue length. When a packet is enqueued on a backlog queue,
the current value of the queue tail counter is saved in the hash
entry of the rps_dev_flow_table.
And now the trick: when selecting the CPU for RPS (get_rps_cpu)
the rps_sock_flow table and the rps_dev_flow table for the RX queue
are consulted. When the desired CPU for the flow (found in the
rps_sock_flow table) does not match the current CPU (found in the
rps_dev_flow table), the current CPU is changed to the desired CPU
if one of the following is true:
- The current CPU is unset (equal to RPS_NO_CPU)
- Current CPU is offline
- The current CPU's queue head counter >= queue tail counter in the
rps_dev_flow table. This checks if the queue tail has advanced
beyond the last packet that was enqueued using this table entry.
This guarantees that all packets queued using this entry have been
dequeued, thus preserving in order delivery.
Making each queue have its own rps_dev_flow table has two advantages:
1) the tail queue counters will be written on each receive, so
keeping the table local to interrupting CPU s good for locality. 2)
this allows lockless access to the table-- the CPU number and queue
tail counter need to be accessed together under mutual exclusion
from netif_receive_skb, we assume that this is only called from
device napi_poll which is non-reentrant.
This patch implements RFS for TCP and connected UDP sockets.
It should be usable for other flow oriented protocols.
There are two configuration parameters for RFS. The
"rps_flow_entries" kernel init parameter sets the number of
entries in the rps_sock_flow_table, the per rxqueue sysfs entry
"rps_flow_cnt" contains the number of entries in the rps_dev_flow
table for the rxqueue. Both are rounded to power of two.
The obvious benefit of RFS (over just RPS) is that it achieves
CPU locality between the receive processing for a flow and the
applications processing; this can result in increased performance
(higher pps, lower latency).
The benefits of RFS are dependent on cache hierarchy, application
load, and other factors. On simple benchmarks, we don't necessarily
see improvement and sometimes see degradation. However, for more
complex benchmarks and for applications where cache pressure is
much higher this technique seems to perform very well.
Below are some benchmark results which show the potential benfit of
this patch. The netperf test has 500 instances of netperf TCP_RR
test with 1 byte req. and resp. The RPC test is an request/response
test similar in structure to netperf RR test ith 100 threads on
each host, but does more work in userspace that netperf.
e1000e on 8 core Intel
No RFS or RPS 104K tps at 30% CPU
No RFS (best RPS config): 290K tps at 63% CPU
RFS 303K tps at 61% CPU
RPC test tps CPU% 50/90/99% usec latency Latency StdDev
No RFS/RPS 103K 48% 757/900/3185 4472.35
RPS only: 174K 73% 415/993/2468 491.66
RFS 223K 73% 379/651/1382 315.61
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The af_packet protocol is used by Perl to do ioctls as reported by
Stephane Riviere:
"Net::RawIP relies on SIOCGIFADDR et SIOCGIFHWADDR to get the IP and MAC
addresses of the network interface."
But in a new network namespace these ioctl fail because it is disabled for
a namespace different from the init_net_ns.
These two lines should not be there as af_inet and af_packet are
namespace aware since a long time now. I suppose we forget to remove these
lines because we sent the af_packet first, before af_inet was supported.
Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr>
Reported-by: Stephane Riviere <stephane.riviere@regis-dgac.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
tx_queue is used as a temporary queue when not allowed to queue skb
directly to the hw device driver (which may sleep). Most paths flush
it before returning, but ppp_start() currently cannot. Make sure we
don't leave skbs pointing to a non-existent device.
Thanks to Michael Barkowski for reporting this problem.
Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
For 6000 series, the 2.4G HT40 band regulatory settings address in EEPROM
was off by 2.
Before the fix, you'll see this in dmesg:
[79535.788877] ieee80211 phy8: U iwl_mod_ht40_chan_info HT40 Ch. 7 [2.4GHz]
WIDE (0x61 0dBm): Ad-Hoc not supported
[79535.788880] ieee80211 phy8: U iwl_mod_ht40_chan_info HT40 Ch. 11 [2.4GHz]
WIDE (0x61 0dBm): Ad-Hoc not supported
And after the fix:
[91132.688706] ieee80211 phy14: U iwl_mod_ht40_chan_info HT40 Ch. 7 [2.4GHz]
IBSS ACTIVE WIDE (0x6f 0dBm): Ad-Hoc supported
[91132.688709] ieee80211 phy14: U iwl_mod_ht40_chan_info HT40 Ch. 11 [2.4GHz]
IBSS ACTIVE WIDE (0x6f 0dBm): Ad-Hoc supported
Signed-off-by: Shanyu Zhao <shanyu.zhao@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
When an internal scan is started, nothing protects the
is_internal_short_scan variable which can cause crashes,
cf. https://bugzilla.kernel.org/show_bug.cgi?id=15667.
Fix this by making the short scan request use the mutex
for locking, which requires making the request go to a
work struct so that it can sleep.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
access_bit_width field is u8 in ACPICA, thus 256 value written to it
becomes 0, causing divide by zero later.
Proper fix would be to remove access_bit_width at all, just because
we already have access_byte_width, which is access_bit_width / 8.
Limit access width to 64 bit for now.
https://bugzilla.kernel.org/show_bug.cgi?id=15749
fixes regression caused by the fix for:
https://bugzilla.kernel.org/show_bug.cgi?id=14667
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
Any inode reclaim flush that returns EAGAIN will result in the inode
reclaim being attempted again later. There is no need to issue a
warning into the logs about this situation.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
Updates to the VFS layer removed an extra ->sync_fs call into the
filesystem during the sync process (from the quota code).
Unfortunately the sync code was unknowingly relying on this call to
make sure metadata buffers were flushed via a xfs_buftarg_flush()
call to move the tail of the log forward in memory before the final
transactions of the sync process were issued.
As a result, the old code would write a very recent log tail value
to the log by the end of the sync process, and so a subsequent crash
would leave nothing for log recovery to do. Hence in qa test 182,
log recovery only replayed a small handle for inode fsync
transactions in this case.
However, with the removal of the extra ->sync_fs call, the log tail
was now not moved forward with the inode fsync transactions near the
end of the sync procese the first (and only) buftarg flush occurred
after these transactions went to disk. The result is that log
recovery now sees a large number of transactions for metadata that
is already on disk.
This usually isn't a problem, but when the transactions include
inode chunk allocation, the inode create transactions and all
subsequent changes are replayed as we cannt rely on what is on disk
is valid. As a result, if the inode was written and contains
unlogged changes, the unlogged changes are lost, thereby violating
sync semantics.
The fix is to always issue a transaction after the buftarg flush
occurs is the log iѕ not idle or covered. This results in a dummy
transaction being written that contains the up-to-date log tail
value, which will be very recent. Indeed, it will be at least as
recent as the old code would have left on disk, so log recovery
will behave exactly as it used to in this situation.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
[WATCHDOG] max63xx driver depends on ioremap()
[WATCHDOG] max63xx: be careful when disabling the watchdog
[WATCHDOG] fixed book E watchdog period register mask.
[WATCHDOG] omap4: Fix WDT Kconfig
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ASoC: imx-ssi: do not call hrtimer_disable in trigger function
ALSA: hda - Add position_fix quirk for Biostar mobo
ALSA: hda - add a quirk for Clevo M570U laptop
ASoC: imx-ssi: increase minimum periods to 4
ALSA: hda - Avoid invalid "Independent HP" control for VIA codecs
ALSA: hda - Fix control element allocations in VIA codec parser
ALSA: aaci - Fix alignment faults on ARM Cortex introduced by commit 29a4f2d3
ALSA: hda - Add fix-up for Sony VAIO with ALC269
ALSA: hda - Enhance fix-up table for Realtek codecs
ALSA: usb - Fix Oops after usb-midi disconnection
ALSA: hda - Fix initial capture source connections of ALC880/260
ALSA: hda - Fix setup for ALC269vb amic and dmic models
ALSA: hda - Fix auto-parser of ALC269vb for HP pin NID 0x21
ASoC: imx-ssi: Use a hrtimer in FIQ mode
ASoC: imx-pcm-dma-mx2: restart DMA after an error
ASoC: imx-ssi: honor IMX_SSI_DMA flag
ASoC: wm2000: remove unused #include <linux/version.h>
ALSA: hda: Add support for Medion WIM2160
Correct fix for the "ioremap() causes build failure on S390" should have been
a dependancy on HAS_IOMEM. So we add this dependancy also (and leave the driver
in the ARM section for now).
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
When shutting down the watchdog timer, special care must be taken
not to overwrite other bits in the register, as it may be shared
with other peripherals.
For example, on the Arcom Vulcan, the register is shared between
the watchdog and the PCI reset line...
Signed-off-by: Marc Zyngier <maz@misterjones.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
A previous fix changed the WDTP function to use the period directly,
rather than subtracting from 63. However the mask generation was
not changed, so the mask was coming out as 0. This patch fixes it.
Signed-off-by: Luuk Paulussen <luuk.paulussen@alliedtelesis.co.nz>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
This patch allows Watchdog timer to be selected for OMAP4 by fixing
Kconfig entry
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
As Herbert Xu said: we should be able to simply replace ipfragok
with skb->local_df. commit f88037(sctp: Drop ipfargok in sctp_xmit function)
has droped ipfragok and set local_df value properly.
The patch kills the ipfragok parameter of .queue_xmit().
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit f88037(sctp: Drop ipfargok in sctp_xmit function)
has droped ipfragok and set local_df value properly.
So the change of commit 77e2f1(ipv6: Fix ip6_xmit to
send fragments if ipfragok is true) is not needed.
So the patch remove them.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Paris got following trace with a linux-next kernel
[ 14.203970] BUG: using smp_processor_id() in preemptible [00000000]
code: avahi-daemon/2093
[ 14.204025] caller is netif_rx+0xfa/0x110
[ 14.204035] Call Trace:
[ 14.204064] [<ffffffff81278fe5>] debug_smp_processor_id+0x105/0x110
[ 14.204070] [<ffffffff8142163a>] netif_rx+0xfa/0x110
[ 14.204090] [<ffffffff8145b631>] ip_dev_loopback_xmit+0x71/0xa0
[ 14.204095] [<ffffffff8145b892>] ip_mc_output+0x192/0x2c0
[ 14.204099] [<ffffffff8145d610>] ip_local_out+0x20/0x30
[ 14.204105] [<ffffffff8145d8ad>] ip_push_pending_frames+0x28d/0x3d0
[ 14.204119] [<ffffffff8147f1cc>] udp_push_pending_frames+0x14c/0x400
[ 14.204125] [<ffffffff814803fc>] udp_sendmsg+0x39c/0x790
[ 14.204137] [<ffffffff814891d5>] inet_sendmsg+0x45/0x80
[ 14.204149] [<ffffffff8140af91>] sock_sendmsg+0xf1/0x110
[ 14.204189] [<ffffffff8140dc6c>] sys_sendmsg+0x20c/0x380
[ 14.204233] [<ffffffff8100ad82>] system_call_fastpath+0x16/0x1b
While current linux-2.6 kernel doesnt emit this warning, bug is latent
and might cause unexpected failures.
ip_dev_loopback_xmit() runs in process context, preemption enabled, so
must call netif_rx_ni() instead of netif_rx(), to make sure that we
process pending software interrupt.
Same change for ip6_dev_loopback_xmit()
Reported-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: wacom - switch mode upon system resume
Revert "Input: wacom - merge out and in prox events"
Input: matrix_keypad - allow platform to disable key autorepeat
Input: ALPS - add signature for HP Pavilion dm3 laptops
Input: i8042 - spelling fix
Input: sparse-keymap - implement safer freeing of the keymap
Input: update the status of the Multitouch X driver project
Input: clarify the no-finger event in multitouch protocol
Input: bcm5974 - retract efi-broken suspend_resume
Input: sparse-keymap - free the right keymap on error
When Wacom devices wake up from a sleep, the switch mode command
(wacom_query_tablet_data) is needed before wacom_open is called.
wacom_query_tablet_data should not be executed inside wacom_open
since wacom_open is called more than once during probe.
wacom_retrieve_hid_descriptor is removed from wacom_resume due
to the fact that the required descriptors are stored properly
upon system resume.
Reported-and-tested-by: Anton Anikin <Anton@Anikin.name>
Signed-off-by: Ping Cheng <pingc@wacom.com>
Cc: stable@kernel.org
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Size needs to be calculated after manipulating with the start value.
Reported-by: Komuro <komurojun-mbn@nifty.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Doing so causes a deadlock, so just signal the timer to stop
using an atomic variable.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Liam Girdwood <lrg@slimlogic.co.uk>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Among else, this allows projects like libdc1394 to carry copies of the
ABI related header files without them or distributors having to worry
about effects on the project's overall license terms. Switch to MIT
license as suggested by Kristian. Also update the year in the
copyright statement according to source history.
Cc: Jay Fenlason <fenlason@redhat.com>
Acked-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Fix an oversight in ipmr_destroy_unres() - the net pointer is
unconditionally initialized to NULL, resulting in a NULL pointer
dereference later on.
Fix by adding a net pointer to struct mr_table and using it in
ipmr_destroy_unres().
Signed-off-by: Patrick McHardy <kaber@trash.net>
The patch to convert struct mfc_cache to list_heads (ipv4: ipmr: convert
struct mfc_cache to struct list_head) introduced a bug when adding new
cache entries that don't match any unresolved entries.
The unres queue is searched for a matching entry, which is then resolved.
When no matching entry is present, the iterator points to the head of the
list, but is treated as a matching entry. Use a seperate variable to
indicate that a matching entry was found.
Signed-off-by: Patrick McHardy <kaber@trash.net>
When dev_pick_tx() caches tx queue_index on a socket, we must check
socket dst_entry matches skb one, or risk a crash later, as reported by
Denys Fedorysychenko, if old packets are in flight during a route
change, involving devices with different number of queues.
Bug introduced by commit a4ee3ce3
(net: Use sk_tx_queue_mapping for connected sockets)
Reported-by: Denys Fedorysychenko <nuclearcat@nuclearcat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>