Commit Graph

364994 Commits

Author SHA1 Message Date
Tomas Winkler
5ceb46e25d mei: revamp mei_amthif_irq_read_message
Rename the function to mei_amthif_irq_read_msg
and change parameters order

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 11:22:55 -07:00
Eric Dumazet
12fb3dd9dc tcp: call tcp_replace_ts_recent() from tcp_ack()
commit bd090dfc63 (tcp: tcp_replace_ts_recent() should not be called
from tcp_validate_incoming()) introduced a TS ecr bug in slow path
processing.

1 A > B P. 1:10001(10000) ack 1 <nop,nop,TS val 1001 ecr 200>
2 B < A . 1:1(0) ack 1 win 257 <sack 9001:10001,TS val 300 ecr 1001>
3 A > B . 1:1001(1000) ack 1 win 227 <nop,nop,TS val 1002 ecr 200>
4 A > B . 1001:2001(1000) ack 1 win 227 <nop,nop,TS val 1002 ecr 200>

(ecr 200 should be ecr 300 in packets 3 & 4)

Problem is tcp_ack() can trigger send of new packets (retransmits),
reflecting the prior TSval, instead of the TSval contained in the
currently processed incoming packet.

Fix this by calling tcp_replace_ts_recent() from tcp_ack() after the
checks, but before the actions.

Reported-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-19 14:21:53 -04:00
H Hartley Sweeten
3d1fe3f785 staging: comedi: drivers: free_irq() in comedi_legacy_detach()
All the legacy comedi drivers now call comedi_legacy_detach()
either directly or as part of their (*detach). Move the free_irq()
into comedi_legacy_detach() so that the drivers don't have to
deal with it.

For drivers that then only call comedi_legacy_detach() in their
private (*detach), remove the private function and use the helper
directly for the (*detach).

The amplc_pc236 and ni_labpc drivers are hybrid legacy/PCI drivers.
In the detach of a PCI board free_irq() still needs to be handled
by the driver.

The pcl724 and pcl726 drivers currently have the free_irq() #ifdef'ed
out. The comedi_legacy_detach() function sanity checks that the irq
has been requested before freeing it so they are safe to convert.

For aesthetic reasons, move the #ifdef unused chunk in the pcl816
driver up to the previous #ifdef unused block.

The pcmio and pcmuio drivers request multiple irqs and handle the
freeing of them. Remove the 'dev->irq = irq[0]' in those drivers
so that comedi_legacy_detach() does not attempt to free the irq.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 11:19:54 -07:00
H Hartley Sweeten
a32c6d0084 staging: comedi: drivers: use comedi_legacy_detach()
Use comedi_legacy_detach() to release the I/O region requested
by these drivers.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 11:19:54 -07:00
H Hartley Sweeten
3fd19714e3 staging: comedi: skel: use comedi_legacy_detach()
Update the skeleton driver to use the new comedi_legacy_detach()
helper in the (*detach) to release the I/O region. Also, update
the comment about it.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 11:19:54 -07:00
H Hartley Sweeten
60cb3b02bc staging: comedi: amplc_dio200: use comedi_legacy_detach()
The I/O region used by this driver is always requested using
comedi_request_region(). The devpriv->io union is only used by
the common code shared by the legacy and PCI drivers.

Use the new comedi_legacy_detach() helper in the (*detach) to
release the I/O region requested by this driver. That function
will handle the proper sanity checking before releasing the
resource.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 11:19:54 -07:00
H Hartley Sweeten
21208519d4 staging: comedi: drivers: use comedi_legacy_detach() in simple drivers
Use the new comedi_legacy_detach() helper in the (*detach) to release
the I/O region requested by these drivers.

Since the (*detach) for these drivers only releases the region, remove
the private (*detach) functions and use comedi_legacy_detach() directly
for the (*detach).

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 11:19:53 -07:00
H Hartley Sweeten
e608796ab6 staging: comedi: das1800: use comedi_legacy_detach()
Use the new comedi_legacy_detach() helper in the (*detach) to release
the first I/O region requested by this driver.

An additional I/O region is requested for some of the boards this driver
supports. The iobase for that region is stored in the private data so
that the (*detach) knows it needs to be released. Remove the extra
cleanup in the (*attach) that releases the first region.

For aesthetics, move the release of the additional region in the
(*detach) so it follows the (*attach) order.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 11:19:53 -07:00
H Hartley Sweeten
4b3fb0ff74 staging: comedi: das16m1: check for subdev_8255_init() failure
Make sure to check if subdev_8255_init() fails and propogate the
error code.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 11:19:53 -07:00
H Hartley Sweeten
a0b4bccc40 staging: comedi: das16m1: use comedi_legacy_detach()
Use the new comedi_legacy_detach() helper in the (*detach) to release
the first I/O region requested by this driver.

An additional I/O region is requested by this driver for the 8255
device. Save the iobase for that region in the private data so that
the (*detach) knows it needs to be released.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 11:19:53 -07:00
H Hartley Sweeten
adfaa207ca staging: comedi: das16: use comedi_legacy_detach()
Use the new comedi_legacy_detach() helper in the (*detach) to release
the first I/O region requested by this driver.

An additional I/O region is requested for some of the boards this driver
supports. Save the iobase for that region in the private data so that
the (*detach) knows it needs to be released.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 11:19:53 -07:00
H Hartley Sweeten
a3fd5e517a staging: comedi: pcl812: use comedi_legacy_detach()
This driver does not follow the standard (*attach) (*detach) flow of
the other drivers in comedi. Comedi drivers do not 'cleanup' any of
the allocations made during the (*attach) if failures are encountered.
If the (*attach) fails, the comedi core will call the (*detach) to
handle any clenaup.

In this driver, the function free_resources() handles all the cleanup.
Remove the calls to this function during the (*attach). Since the
(*detach) is then the only caller, remove the function and just put
all the cleanup in the (*detach) function.

Use the new comedi_legacy_detach() helper in the (*detach) to release
the I/O region.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 11:19:53 -07:00
H Hartley Sweeten
316f97f169 staging: comedi: drivers: introduce comedi_legacy_detach()
This function is intended to be used by the comedi legacy (ISA) drivers
either directly as the (*detach) function or as a helper in the drivers
private (*detach) function.

Modify the comedi_request_region() helper so that it stores the 'len' of
the region as well as the 'start' after the region has been successfuly
allocated by request_region() in __comedi_request_region(). This region
will then be automatically released detach of the driver by the
comedi_legacy_detach() helper.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 11:19:52 -07:00
Andy Shevchenko
3a8d2ccdcf staging: rts5129: re-use kbasename()
The custom filename function mostly repeats the kernel's kbasename. This patch
simplifies it. The updated filename() will not check for the '\' in the
filenames. It seems redundant in Linux. The __FILE__ macro always defined if we
compile an existing file. Thus, NULL check is not needed there as well.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 11:17:15 -07:00
Daniel Borkmann
f33cb17d1d staging: net: remove pc300 driver
To quote the TODO from staging/net/:

  PC300:

  The driver is very broken and cannot work with the current TTY
  layer. It is inevitable to convert it to the new TTY API. If no
  one steps in to adopt the driver, it will be removed in the 3.7
  release.

Nothing has changed since more than _one_ year on this driver, thus
just remove it since we already moved past 3.7. If somebody steps
up and does a whole rework, he/she, of course, is free to resubmit
it. Since this is the only one in the net directory, we can remove
it as well.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 11:15:20 -07:00
Tomas Winkler
9b0d5efc42 mei: revamp hbm state machine
1. Rename init_clients_state to hbm_state and use
MEI_HBM_ prefix for HBM states

2. Remove recvd_msg and use hbm state for synchronizing
hbm protocol has successful start.
We can wake up the hbm event from start response handler
and remove the hack from the interrupt thread

3. mei_hbm_start_wait function encapsulate start completion
waiting

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 10:58:21 -07:00
Johan Hovold
f1175daa53 USB: ti_usb_3410_5052: kill custom closing_wait
Kill custom closing_wait implementation and let the tty-layer handle it
instead.

Note that the port drain-delay is set to three characters to keep the
20ms delay after wait_until_sent at low baudrates (1200 baud) during close.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 10:24:14 -07:00
Johan Hovold
c041902433 USB: ti_usb_3410_5052: remove redundant drain from break_ctl
Remove redundant drain, which has already been handled by the tty-layer,
from break_ctl.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 10:24:14 -07:00
Johan Hovold
2c992cd737 USB: ti_usb_3410_5052: query hardware-buffer status in chars_in_buffer
Query hardware-buffer status in chars_in_buffer should the write fifo be
empty.

This is needed to make the tty layer wait for hardware buffers to drain
on close.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 10:24:13 -07:00
Johan Hovold
b5784f7d85 USB: ti_usb_3410_5052: remove lsr from port data
The line status register is only polled so let's not keep a possibly
outdated value in the port data.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 10:24:13 -07:00
Johan Hovold
113ec31e16 USB: ti_usb_3410_5052: move write-fifo flushing to close
Move write-fifo flushing from ti_drain to close where it belongs.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 10:24:13 -07:00
Johan Hovold
a8ec374f96 USB: io_ti: remove redundant wait_until_sent
Remove redundant wait_until_sent, which has already been handled by the
tty-layer, from break_ctl.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 10:24:13 -07:00
Johan Hovold
b6fd35ee57 USB: io_ti: fix TIOCGSERIAL
Fix regression introduced by commit f40d78155 ("USB: io_ti: kill custom
closing_wait implementation") which made TIOCGSERIAL return the wrong
value for closing_wait.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Cc: stable <stable@vger.kernel.org> # 3.9
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 10:24:13 -07:00
Ming Lei
ac9e59cad7 USB: usbtmc: remove unnecessary memory allocation
Inside usbtmc_ioctl_clear_out_halt()/usbtmc_ioctl_clear_in_halt(),
usb_clear_halt() needn't any buffer to pass in, so remove the
unnecessary memory allocation.

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 10:20:41 -07:00
Wei Yongjun
c33c888b58 usbatm: fix potential NULL pointer dereference
The dereference to 'instance' in the debug code should be moved
below the NULL test.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Duncan Sands <baldrick@free.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-19 10:20:41 -07:00
Linus Torvalds
f86b11fbc7 mtdchar: remove no-longer-used vma helpers
With the conversion to vm_iomap_memory(), these vma helpers are no
longer used.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-19 10:05:39 -07:00
Linus Torvalds
0fe09a45c4 vm: convert snd_pcm_lib_mmap_iomem() to vm_iomap_memory() helper
This is my example conversion of a few existing mmap users.  The pcm
mmap case is one of the more straightforward ones.

Acked-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-19 10:01:04 -07:00
Linus Torvalds
fc9bbca8f6 vm: convert fb_mmap to vm_iomap_memory() helper
This is my example conversion of a few existing mmap users.  The
fb_mmap() case is a good example because it is a bit more complicated
than some: fb_mmap() mmaps one of two different memory areas depending
on the page offset of the mmap (but happily there is never any mixing of
the two, so the helper function still works).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-19 09:57:35 -07:00
Linus Torvalds
8558e4a26b vm: convert mtdchar mmap to vm_iomap_memory() helper
This is my example conversion of a few existing mmap users.  The mtdchar
case is actually disabled right now (and stays disabled), but I did it
because it showed up on my "git grep", and I was familiar with the code
due to fixing an overflow problem in the code in commit 9c603e53d3
("mtdchar: fix offset overflow detection").

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-19 09:53:07 -07:00
Linus Torvalds
2323036dfe vm: convert HPET mmap to vm_iomap_memory() helper
This is my example conversion of a few existing mmap users.  The HPET
case is simple, widely available, and easy to test (Clemens Ladisch sent
a trivial test-program for it).

Test-program-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-19 09:46:39 -07:00
Linus Torvalds
0f177f8739 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
 "Two more small fixups to the wacom driver"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: wacom - fix "can not retrieve extra class descriptor" for DTH2242
  Input: wacom - DTH2242 Grip Pen id was off by one bit
2013-04-19 09:15:13 -07:00
Linus Torvalds
53d945e1a2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse build fix from Miklos Szeredi:
 "This fixes android builds.  The patch appears large, but is just
  search & replace."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: fix type definitions in uapi header
2013-04-19 09:12:55 -07:00
Ping Cheng
5846115b30 Input: wacom - fix "can not retrieve extra class descriptor" for DTH2242
Same as Cintiq 24HDT, DTH2242 has two interfaces sharing one configuration.
This patch ignores the second interface.

Signed-off-by: Ping Cheng <pingc@wacom.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2013-04-19 08:52:41 -07:00
Ping Cheng
1582eea208 Input: wacom - DTH2242 Grip Pen id was off by one bit
Signed-off-by: Ping Cheng <pingc@wacom.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2013-04-19 08:52:28 -07:00
Ben Guthro
18c0025b69 xen: resolve section mismatch warnings in xen-acpi-processor
The following resolves a section mismatch warning below in xen-acpi-processor introduced by
3fac10145b [13/13] xen: Re-upload processor PM data to hypervisor after S3 resume (v2)

Warning:
WARNING: drivers/xen/built-in.o(.text+0x2056a): Section mismatch in reference from the function xen_upload_processor_pm_data() to the function .init.text:read_acpi_id()
   The function xen_upload_processor_pm_data() references
   the function __init read_acpi_id().
   This is often because xen_upload_processor_pm_data lacks a __init
   annotation or the annotation of read_acpi_id is wrong.

Reported-by: kbuild test robot <fengguang.wu@intel.com>

Signed-off-by: Ben Guthro <benjamin.guthro@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-04-19 10:44:23 -04:00
Laurent Meunier
f07512e615 pinctrl/pinconfig: add debug interface
This update adds a debugfs interface to modify a pin configuration
for a given state in the pinctrl map. This allows to modify the
configuration for a non-active state, typically sleep state.
This configuration is not applied right away, but only when the state
will be entered.

This solution is mandated for us by HW validation: in order
to test and verify several pin configurations during sleep without
recompiling the software.

Change log in this patch set;
Take into account latest feedback from Stephen Warren:
- stale comments update
- improved code efficiency and readibility
- limit size of global variable pinconf_dbg_conf
- remove req_type as it can easily be added later when
add/delete requests support is implemented

Signed-off-by: Laurent Meunier <laurent.meunier@st.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2013-04-19 15:45:05 +02:00
Waiman Long
cc189d2513 mutex: Back out architecture specific check for negative mutex count
Linus suggested that probably all the supported architectures can
allow a negative mutex count without incorrect behavior, so we can
then back out the architecture specific change and allow the
mutex count to go to any negative number. That should further
reduce contention for non-x86 architecture.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Chandramouleeswaran Aswin <aswin@hp.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Norton Scott J <scott.norton@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1366226594-5506-5-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-19 09:33:36 +02:00
Waiman Long
2bd2c92cf0 mutex: Queue mutex spinners with MCS lock to reduce cacheline contention
The current mutex spinning code (with MUTEX_SPIN_ON_OWNER option
turned on) allow multiple tasks to spin on a single mutex
concurrently. A potential problem with the current approach is
that when the mutex becomes available, all the spinning tasks
will try to acquire the mutex more or less simultaneously. As a
result, there will be a lot of cacheline bouncing especially on
systems with a large number of CPUs.

This patch tries to reduce this kind of contention by putting
the mutex spinners into a queue so that only the first one in
the queue will try to acquire the mutex. This will reduce
contention and allow all the tasks to move forward faster.

The queuing of mutex spinners is done using an MCS lock based
implementation which will further reduce contention on the mutex
cacheline than a similar ticket spinlock based implementation.
This patch will add a new field into the mutex data structure
for holding the MCS lock. This expands the mutex size by 8 bytes
for 64-bit system and 4 bytes for 32-bit system. This overhead
will be avoid if the MUTEX_SPIN_ON_OWNER option is turned off.

The following table shows the jobs per minute (JPM) scalability
data on an 8-node 80-core Westmere box with a 3.7.10 kernel. The
numactl command is used to restrict the running of the fserver
workloads to 1/2/4/8 nodes with hyperthreading off.

+-----------------+-----------+-----------+-------------+----------+
|  Configuration  | Mean JPM  | Mean JPM  |  Mean JPM   | % Change |
|                 | w/o patch | patch 1   | patches 1&2 |  1->1&2  |
+-----------------+------------------------------------------------+
|                 |              User Range 1100 - 2000            |
+-----------------+------------------------------------------------+
| 8 nodes, HT off |  227972   |  227237   |   305043    |  +34.2%  |
| 4 nodes, HT off |  393503   |  381558   |   394650    |   +3.4%  |
| 2 nodes, HT off |  334957   |  325240   |   338853    |   +4.2%  |
| 1 node , HT off |  198141   |  197972   |   198075    |   +0.1%  |
+-----------------+------------------------------------------------+
|                 |              User Range 200 - 1000             |
+-----------------+------------------------------------------------+
| 8 nodes, HT off |  282325   |  312870   |   332185    |   +6.2%  |
| 4 nodes, HT off |  390698   |  378279   |   393419    |   +4.0%  |
| 2 nodes, HT off |  336986   |  326543   |   340260    |   +4.2%  |
| 1 node , HT off |  197588   |  197622   |   197582    |    0.0%  |
+-----------------+-----------+-----------+-------------+----------+

At low user range 10-100, the JPM differences were within +/-1%.
So they are not that interesting.

The fserver workload uses mutex spinning extensively. With just
the mutex change in the first patch, there is no noticeable
change in performance.  Rather, there is a slight drop in
performance. This mutex spinning patch more than recovers the
lost performance and show a significant increase of +30% at high
user load with the full 8 nodes. Similar improvements were also
seen in a 3.8 kernel.

The table below shows the %time spent by different kernel
functions as reported by perf when running the fserver workload
at 1500 users with all 8 nodes.

+-----------------------+-----------+---------+-------------+
|        Function       |  % time   | % time  |   % time    |
|                       | w/o patch | patch 1 | patches 1&2 |
+-----------------------+-----------+---------+-------------+
| __read_lock_failed    |  34.96%   | 34.91%  |   29.14%    |
| __write_lock_failed   |  10.14%   | 10.68%  |    7.51%    |
| mutex_spin_on_owner   |   3.62%   |  3.42%  |    2.33%    |
| mspin_lock            |    N/A    |   N/A   |    9.90%    |
| __mutex_lock_slowpath |   1.46%   |  0.81%  |    0.14%    |
| _raw_spin_lock        |   2.25%   |  2.50%  |    1.10%    |
+-----------------------+-----------+---------+-------------+

The fserver workload for an 8-node system is dominated by the
contention in the read/write lock. Mutex contention also plays a
role. With the first patch only, mutex contention is down (as
shown by the __mutex_lock_slowpath figure) which help a little
bit. We saw only a few percents improvement with that.

By applying patch 2 as well, the single mutex_spin_on_owner
figure is now split out into an additional mspin_lock figure.
The time increases from 3.42% to 11.23%. It shows a great
reduction in contention among the spinners leading to a 30%
improvement. The time ratio 9.9/2.33=4.3 indicates that there
are on average 4+ spinners waiting in the spin_lock loop for
each spinner in the mutex_spin_on_owner loop. Contention in
other locking functions also go down by quite a lot.

The table below shows the performance change of both patches 1 &
2 over patch 1 alone in other AIM7 workloads (at 8 nodes,
hyperthreading off).

+--------------+---------------+----------------+-----------------+
|   Workload   | mean % change | mean % change  | mean % change   |
|              | 10-100 users  | 200-1000 users | 1100-2000 users |
+--------------+---------------+----------------+-----------------+
| alltests     |      0.0%     |     -0.8%      |     +0.6%       |
| five_sec     |     -0.3%     |     +0.8%      |     +0.8%       |
| high_systime |     +0.4%     |     +2.4%      |     +2.1%       |
| new_fserver  |     +0.1%     |    +14.1%      |    +34.2%       |
| shared       |     -0.5%     |     -0.3%      |     -0.4%       |
| short        |     -1.7%     |     -9.8%      |     -8.3%       |
+--------------+---------------+----------------+-----------------+

The short workload is the only one that shows a decline in
performance probably due to the spinner locking and queuing
overhead.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Reviewed-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Chandramouleeswaran Aswin <aswin@hp.com>
Cc: Norton Scott J <scott.norton@hp.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1366226594-5506-4-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-19 09:33:36 +02:00
Waiman Long
0dc8c730c9 mutex: Make more scalable by doing less atomic operations
In the __mutex_lock_common() function, an initial entry into
the lock slow path will cause two atomic_xchg instructions to be
issued. Together with the atomic decrement in the fast path, a
total of three atomic read-modify-write instructions will be
issued in rapid succession. This can cause a lot of cache
bouncing when many tasks are trying to acquire the mutex at the
same time.

This patch will reduce the number of atomic_xchg instructions
used by checking the counter value first before issuing the
instruction. The atomic_read() function is just a simple memory
read. The atomic_xchg() function, on the other hand, can be up
to 2 order of magnitude or even more in cost when compared with
atomic_read(). By using atomic_read() to check the value first
before calling atomic_xchg(), we can avoid a lot of unnecessary
cache coherency traffic. The only downside with this change is
that a task on the slow path will have a tiny bit less chance of
getting the mutex when competing with another task in the fast
path.

The same is true for the atomic_cmpxchg() function in the
mutex-spin-on-owner loop. So an atomic_read() is also performed
before calling atomic_cmpxchg().

The mutex locking and unlocking code for the x86 architecture
can allow any negative number to be used in the mutex count to
indicate that some tasks are waiting for the mutex. I am not so
sure if that is the case for the other architectures. So the
default is to avoid atomic_xchg() if the count has already been
set to -1. For x86, the check is modified to include all
negative numbers to cover a larger case.

The following table shows the jobs per minutes (JPM) scalability
data on an 8-node 80-core Westmere box with a 3.7.10 kernel. The
numactl command is used to restrict the running of the
high_systime workloads to 1/2/4/8 nodes with hyperthreading on
and off.

+-----------------+-----------+------------+----------+
|  Configuration  | Mean JPM  |  Mean JPM  | % Change |
|		  | w/o patch | with patch |	      |
+-----------------+-----------------------------------+
|		  |      User Range 1100 - 2000	      |
+-----------------+-----------------------------------+
| 8 nodes, HT on  |    36980   |   148590  | +301.8%  |
| 8 nodes, HT off |    42799   |   145011  | +238.8%  |
| 4 nodes, HT on  |    61318   |   118445  |  +51.1%  |
| 4 nodes, HT off |   158481   |   158592  |   +0.1%  |
| 2 nodes, HT on  |   180602   |   173967  |   -3.7%  |
| 2 nodes, HT off |   198409   |   198073  |   -0.2%  |
| 1 node , HT on  |   149042   |   147671  |   -0.9%  |
| 1 node , HT off |   126036   |   126533  |   +0.4%  |
+-----------------+-----------------------------------+
|		  |       User Range 200 - 1000	      |
+-----------------+-----------------------------------+
| 8 nodes, HT on  |   41525    |   122349  | +194.6%  |
| 8 nodes, HT off |   49866    |   124032  | +148.7%  |
| 4 nodes, HT on  |   66409    |   106984  |  +61.1%  |
| 4 nodes, HT off |  119880    |   130508  |   +8.9%  |
| 2 nodes, HT on  |  138003    |   133948  |   -2.9%  |
| 2 nodes, HT off |  132792    |   131997  |   -0.6%  |
| 1 node , HT on  |  116593    |   115859  |   -0.6%  |
| 1 node , HT off |  104499    |   104597  |   +0.1%  |
+-----------------+------------+-----------+----------+

At low user range 10-100, the JPM differences were within +/-1%.
So they are not that interesting.

AIM7 benchmark run has a pretty large run-to-run variance due to
random nature of the subtests executed. So a difference of less
than +-5% may not be really significant.

This patch improves high_systime workload performance at 4 nodes
and up by maintaining transaction rates without significant
drop-off at high node count.  The patch has practically no
impact on 1 and 2 nodes system.

The table below shows the percentage time (as reported by perf
record -a -s -g) spent on the __mutex_lock_slowpath() function
by the high_systime workload at 1500 users for 2/4/8-node
configurations with hyperthreading off.

+---------------+-----------------+------------------+---------+
| Configuration | %Time w/o patch | %Time with patch | %Change |
+---------------+-----------------+------------------+---------+
|    8 nodes    |      65.34%     |      0.69%       |  -99%   |
|    4 nodes    |       8.70%	  |      1.02%	     |  -88%   |
|    2 nodes    |       0.41%     |      0.32%       |  -22%   |
+---------------+-----------------+------------------+---------+

It is obvious that the dramatic performance improvement at 8
nodes was due to the drastic cut in the time spent within the
__mutex_lock_slowpath() function.

The table below show the improvements in other AIM7 workloads
(at 8 nodes, hyperthreading off).

+--------------+---------------+----------------+-----------------+
|   Workload   | mean % change | mean % change  | mean % change   |
|              | 10-100 users  | 200-1000 users | 1100-2000 users |
+--------------+---------------+----------------+-----------------+
| alltests     |     +0.6%     |   +104.2%      |   +185.9%       |
| five_sec     |     +1.9%     |     +0.9%      |     +0.9%       |
| fserver      |     +1.4%     |     -7.7%      |     +5.1%       |
| new_fserver  |     -0.5%     |     +3.2%      |     +3.1%       |
| shared       |    +13.1%     |   +146.1%      |   +181.5%       |
| short        |     +7.4%     |     +5.0%      |     +4.2%       |
+--------------+---------------+----------------+-----------------+

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Reviewed-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Chandramouleeswaran Aswin <aswin@hp.com>
Cc: Norton: Scott J <scott.norton@hp.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1366226594-5506-3-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-19 09:33:35 +02:00
Waiman Long
41fcb9f230 mutex: Move mutex spinning code from sched/core.c back to mutex.c
As mentioned by Ingo, the SCHED_FEAT_OWNER_SPIN scheduler
feature bit was really just an early hack to make with/without
mutex-spinning testable. So it is no longer necessary.

This patch removes the SCHED_FEAT_OWNER_SPIN feature bit and
move the mutex spinning code from kernel/sched/core.c back to
kernel/mutex.c which is where they should belong.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Chandramouleeswaran Aswin <aswin@hp.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Norton Scott J <scott.norton@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1366226594-5506-2-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-19 09:33:34 +02:00
Joe Perches
1cb6e73c55 usb: storage: Fix link error
Fix allmodconfig link error introduced by commit 75b9130e8a
("usb: storage: Add usb_stor_dbg, reduce object size")

Export the symbol usb_stor_dbg.
Add export.h

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-18 19:02:59 -07:00
Linus Torvalds
6835039d7e Merge branch 'userns-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux
Pull user-namespace fixes from Andy Lutomirski.

* 'userns-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux:
  userns: Changing any namespace id mappings should require privileges
  userns: Check uid_map's opener's fsuid, not the current fsuid
  userns: Don't let unprivileged users trick privileged users into setting the id_map
2013-04-18 18:09:12 -07:00
Linus Torvalds
a86d52667d Merge branch 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86
Pull x86 platform driver revert from Matthew Garrett:
 "It turns out that one of the hp-wmi patches this cycle breaks some
  other HP laptops.  I think we have a good idea how to work on it for
  3.10, but it's safer to just revert it for now."

* 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86:
  Revert "hp-wmi: Add support for SMBus hotkeys"
2013-04-18 15:14:34 -07:00
Florian Westphal
f83a7ea207 netfilter: xt_rpfilter: skip locally generated broadcast/multicast, too
Alex Efros reported rpfilter module doesn't match following packets:
IN=br.qemu SRC=192.168.2.1 DST=192.168.2.255 [ .. ]
(netfilter bugzilla #814).

Problem is that network stack arranges for the locally generated broadcasts
to appear on the interface they were sent out, so the IFF_LOOPBACK check
doesn't trigger.

As -m rpfilter is restricted to PREROUTING, we can check for existing
rtable instead, it catches locally-generated broad/multicast case, too.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-19 00:11:59 +02:00
Matthew Garrett
c857b7f45b Revert "hp-wmi: Add support for SMBus hotkeys"
This reverts commit fabf85e3ca which breaks
hotkey support on some other HP laptops. We'll try doing this differently
in 3.10.

Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
2013-04-18 14:53:10 -07:00
Jozsef Kadlecsik
5add189a12 netfilter: ipset: bitmap:ip,mac: fix listing with timeout
The type when timeout support was enabled, could not list all elements,
just the first ones which could fit into one netlink message: it just
did not continue listing after the first message.

Reported-by: Yoann JUET <yoann.juet@univ-nantes.fr>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Tested-by: Yoann JUET <yoann.juet@univ-nantes.fr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-04-18 23:40:41 +02:00
Eric Dumazet
4394542ca4 bonding: fix l23 and l34 load balancing in forwarding path
Since commit 6b923cb718 (bonding: support for IPv6 transmit hashing)
bonding doesn't properly hash traffic in forwarding setups.

Vitaly V. Bursov diagnosed that skb_network_header_len() returned 0 in
this case.

More generally, the transport header might not be in the skb head.

Use pskb_may_pull() & skb_header_pointer() to get it right, and use
proto_ports_offset() in bond_xmit_hash_policy_l34() to get support for
more protocols than TCP and UDP.

Reported-by: Vitaly V. Bursov <vitalyb@telenet.dn.ua>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: John Eaglesham <linux@8192.net>
Tested-by: Vitaly V. Bursov <vitalyb@telenet.dn.ua>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-18 15:07:29 -04:00
Ariel Elior
0c14e5ced2 bnx2x: Fix status blocks configuration
This fixes 2 issues regarding bnx2x's status blocks:

   1. ethtool -c caused corruption of status blocks in FW RAM.

   2. when using multi-CoS, the configuration of the timeout values of
      status blocks is incorrect, harming the coalescing of interrupts
      for such CoSs.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-18 15:03:26 -04:00
Dmitry Kravkov
d46f7c4df3 bnx2x: Prevent UNDI FW illegal host access
When loading after UNDI (e.g., Boot from SAN) the UNDI does not
gracefully yield its resources; The bnx2x driver handles that release
itself.

During the manipulation required to release those resources, it's possible
for the UNDI to try and write to memory regions which are no longer accessible,
causing the PCI bus to prevent further writes from the chip.

This would in turn cause DMAE timeouts later on in the driver, as the driver
will be unable to use the chip's DMA engines.

This patch prevents the chip from actually writing through the PCI bus
in said scenario, thus allowing the release without the unfortunate by-product.

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-18 15:03:25 -04:00
John W. Linville
5a22483e5a Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2013-04-18 15:01:30 -04:00