Commit Graph

401813 Commits

Author SHA1 Message Date
Joe Perches
29fc2bc753 printk: pr_debug_ratelimited: check state first to reduce "callbacks suppressed" messages
pr_debug_ratelimited should be coded similarly to dev_dbg_ratelimited
to reduce the "callbacks suppressed" messages.

Add #include <linux/dynamic_debug.h> to printk.h. Unfortunately, this
new #include must be after the prototype/declaration of function printk.

It may be better to split out these _ratelimited declarations into
a separate file one day.

Any use of these pr_<foo>_ratelimited functions must also have another
specific #include <ratelimited.h>.  Most users have this done indirectly
via #include <linux/kernel.h>

printk.h may not #include <linux/ratelimit.h> as it causes circular
dependencies and compilation failures.

Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Krzysztof Mazur <krzysiek@podlesie.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:50:52 -07:00
Huang Rui
9d3bd76846 usb: usbtest: support bos descriptor test for usb 3.0
In Test 9 of usbtest module, it is used for performing chapter 9 tests N
times. This patch adds to support getting bos descriptor test scenario for
USB 3.0.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:48:53 -07:00
Anton Tikhomirov
488c9cd34b USB: phy: samsung: Support multiple PHYs of same type
This patch removes limitation when only one PHY of specific type
could be used.

Signed-off-by: Anton Tikhomirov <av.tikhomirov@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:46:48 -07:00
Thomas Pugliese
f74b75e7f9 usb: wusbcore: change WA_SEGS_MAX to a legal value
change WA_SEGS_MAX to a number that is legal according to the WUSB
spec.

Signed-off-by: Thomas Pugliese <thomas.pugliese@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:44:49 -07:00
Thomas Pugliese
f07ddb9ef5 usb: wusbcore: add a quirk for Alereon HWA device isoc behavior
Add a quirk for Alereon HWA devices to concatenate the frames of isoc
transfer requests.

Signed-off-by: Thomas Pugliese <thomas.pugliese@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:44:49 -07:00
Thomas Pugliese
2101242cef usb: wusbcore: combine multiple isoc frames in a single transfer request.
Combine multiple isoc frames in a single transfer request.  This allows
the HWA to take advantage of bursting to deliver data to endpoints
whose logical service interval is less than the minimum wireless USB
service interval of 4ms.  Wireless audio quality is much improved after
this update.

Signed-off-by: Thomas Pugliese <thomas.pugliese@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:44:49 -07:00
Thomas Pugliese
7b6bc07ab5 usb: wusbcore: set the RPIPE wMaxPacketSize value correctly
For isochronous endpoints, set the RPIPE wMaxPacketSize value using
wOverTheAirPacketSize from the endpoint companion descriptor instead of
wMaxPacketSize from the normal endpoint descriptor.

Signed-off-by: Thomas Pugliese <thomas.pugliese@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:44:49 -07:00
Peter Chen
df101c531e usb: chipidea: host: more enhancement when ci->hcd is NULL
Like http://marc.info/?l=linux-usb&m=138200449428874&w=2 said:
two more things are needed to be done:

- If host_start fails, the host_stop should not be called, so we
add check that ci->hcd is not NULL.
- if the host_start fails at the beginning, we need to consider
regulator mismatch issue.

Signed-off-by: Peter Chen <peter.chen@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:43:37 -07:00
H Hartley Sweeten
e55f7cd246 usb: ohci: remove ep93xx bus glue platform driver
Convert ep93xx to use the OHCI platform driver and remove the
ohci-ep93xx bus glue driver.

Enable CONFIG_OHCI_HCD_PLATFORM in the ep93xx_defconfig so that USB
is still enabled by default on the EP93xx platform.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Ryan Mallon <rmallon@gmail.com>
Cc: Lennert Buytenhek <kernel@wantstofly.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Olof Johansson <olof@lixom.net>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:43:37 -07:00
Huang Rui
f55055b464 usb: usbtest: fix checkpatch warning as sizeof code style
Script checkpatch.pl always complains incorrect code style like below:

WARNING: sizeof *udev->bos->desc should be sizeof(*udev->bos->desc)

This patch fixes the warning for usbtest module.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:43:36 -07:00
Greg Kroah-Hartman
2fc74ac227 UWB: clean up attribute use by using ATTRIBUTE_GROUPS()
The ATTRIBUTE_GROUPS() macro can be used in the uwb code to reduce the
number of lines of code.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:43:36 -07:00
Cong Ding
75f0aef622 uio: fix memory leak
we have to call kobject_put() to clean up the kobject after function
kobject_init(), kobject_add(), or kobject_uevent() is called.

Signed-off-by: Cong Ding <dinggnu@gmail.com>
Acked-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:36:54 -07:00
Greg Kroah-Hartman
4192c74940 mdio_bus: convert bus code to use dev_groups
The dev_attrs field of struct bus_type is going away soon, dev_groups
should be used instead.  This converts the MDIO bus code to use the
correct field.

Cc: David S. Miller <davem@davemloft.net>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Nick Bowler <nbowler@elliptictech.com>
Cc: <netdev@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:25:51 -07:00
Vladimir Zapolskiy
7c65e29250 misc/at24: avoid infinite loop on write()
This change fixes a problem of infinite zero byte write() without
an error status, if there is an attempt to write a file bigger than
EEPROM size over sysfs interface.

Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Cc: Wolfram Sang <w.sang@pengutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:24:33 -07:00
Vladimir Zapolskiy
95f774c526 misc/93xx46: avoid infinite loop on write()
This change fixes a problem of infinite zero byte write() without
an error status, if there is an attempt to write a file bigger than
EEPROM size over sysfs interface.

Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:24:33 -07:00
Johan Hovold
5c6d6fd156 misc: atmel_pwm: add deferred-probing support
Two drivers (atmel-pwm-bl and leds-atmel-pwm) currently depend on the
atmel_pwm driver to have bound to any pwm-device before their devices
are probed.

Support deferred probing of such devices by making sure to return
-EPROBE_DEFER from pwm_channel_alloc when no pwm-device has yet been
bound.

Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:22:57 -07:00
Tomas Winkler
50f67a0671 mei: wd: host_init propagate error codes from called functions
Propagate error codes from called functions, they are correct.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:20:43 -07:00
Alexander Usyskin
c4e87b5259 mei: replace stray pr_debug with dev_dbg
Driver better use dev_dbg, not pr_debug.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:20:43 -07:00
Tomas Winkler
f9350129a0 mei: bus: propagate error code returned by mei_me_cl_by_id
no need to change error code value returned by
mei_me_cl_by_id, just propagate it on

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:20:43 -07:00
Tomas Winkler
df667a1a2c mei: mei_cl_link remove duplicated check for open_handle_count
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:20:43 -07:00
Alexander Usyskin
f931f4f3f0 mei: print correct device state during unexpected reset
Move the unexpected state print to the beginning of mei_reset,
thus printing right state.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:20:35 -07:00
Tomas Winkler
4bff7208f3 mei: nfc: fix memory leak in error path
The flow may reach the err label without freeing cl and cl_info

cl and cl_info weren't assigned to ndev->cl and cl_info
so they weren't freed in mei_nfc_free called on error path

Cc: <stable@vger.kernel.org> # 3.10+
Cc: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:20:28 -07:00
Kees Cook
9ae113ce5f lkdtm: add tests for additional page permissions
Testing execution and access of userspace from the kernel is needed for
validating things like Intel's SMEP and SMAP protections. Additionally,
add an explicit test for validating that RO page permissions have been
set for the RO data area.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:13:39 -07:00
Kees Cook
7d196ac303 lkdtm: adjust recursion size to avoid warnings
When CONFIG_FRAME_WARN is set low (e.g. some ARM builds), the hard-coded
stack buffer size used for kernel stack over run testing triggers build
warnings. Instead, avoid the warning by recalcuating the buffer size and
recursion count needed to trigger the test. Also uses the recursion counter
indirectly to avoid changing the parameter during the test.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:13:39 -07:00
Bjorn Helgaas
3eae136717 device: Make dev_WARN/dev_WARN_ONCE print device as well as driver name
dev_WARN() and dev_WARN_ONCE() are annoying because (1) they include
only the driver name, not the device name, and (2) they print a spurious
newline in the middle.  This results in messages like this that are less
useful than they should be:

  [   40.094995] Device pcieport
  disabling already-disabled device

This patch makes them work more like dev_printk().

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 16:06:07 -07:00
Tejun Heo
d1c1459e45 sysfs: separate out dup filename warning into a separate function
Separate out sysfs_warn_dup() out of sysfs_add_one().  This will help
separating out the core sysfs functionalities into kernfs so that it
can be used by non-sysfs users too.

This doesn't make any functional changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 15:12:07 -07:00
Tejun Heo
7eed6ecb07 sysfs: move sysfs_hash_and_remove() to fs/sysfs/dir.c
Most removal related logic is implemented in fs/sysfs/dir.c.  Move
sysfs_hash_and_remove() to fs/sysfs/dir.c so that __sysfs_remove()
doesn't have to be public.

This is pure relocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 15:12:07 -07:00
Tejun Heo
baa97cb507 sysfs: remove unused sysfs_get_dentry() prototype
sysfs_get_dentry() has been gone for years now.  Remove the left-over
prototype.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 15:12:07 -07:00
Tejun Heo
672f76a81a sysfs: honor bin_attr.attr.ignore_lockdep
ignore_lockdep is currently honored only for regular files.  There's
no reason to ignore it for bin files.  Update sysfs_ignore_lockdep()
so that bin_attr.attr.ignore_lockdep works too.

While this doesn't have any in-kernel user, this unifies the behaviors
between regular and bin files and will help later changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 15:12:07 -07:00
Tejun Heo
56b3f3b884 sysfs: merge sysfs_elem_bin_attr into sysfs_elem_attr
3124eb1679 ("sysfs: merge regular and bin file handling") folded bin
file handling into regular file handling.  Among other things, bin
file now shares the same open path including sysfs_open_dirent
association using sysfs_dirent->s_attr.open.  This is buggy because
->s_bin_attr lives in the same union and doesn't have the field.  This
bug doesn't trigger because sysfs_elem_bin_attr doesn't have an active
field at the conflicting position.  It does have a field "buffers" but
it isn't used anymore.

This patch collapses sysfs_elem_bin_attr into sysfs_elem_attr so that
the bin_attr is accessed through ->s_attr.bin_attr which lives with
->s_attr.attr in an anonymous union.  The code paths already assume
bin_attr contains attr as the first element, so this doesn't add any
more assumptions while making it explicit that the two types are
handled together.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-29 15:12:06 -07:00
Vlad Yasevich
06499098a0 bridge: pass correct vlan id to multicast code
Currently multicast code attempts to extrace the vlan id from
the skb even when vlan filtering is disabled.  This can lead
to mdb entries being created with the wrong vlan id.
Pass the already extracted vlan id to the multicast
filtering code to make the correct id is used in
creation as well as lookup.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29 17:40:08 -04:00
David S. Miller
911aeb1084 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch
Jesse Gross says:

====================
One patch for net/3.12 fixing an issue where devices could be in an
invalid state they are removed while still attached to OVS.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29 17:36:39 -04:00
Michael Drüing
706e282b69 net: x25: Fix dead URLs in Kconfig
Update the URLs in the Kconfig file to the new pages at sangoma.com and cisco.com

Signed-off-by: Michael Drüing <michael@drueing.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29 17:35:17 -04:00
David S. Miller
68783ec73c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
This pull request contains the following netfilter fix:

* fix --queue-bypass in xt_NFQUEUE revision 3. While adding the
  revision 3 of this target, the bypass flags were not correctly
  handled anymore, thus, breaking packet bypassing if no application
  is listening from userspace, patch from Holger Eitzenberger,
  reported by Florian Westphal.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29 16:53:44 -04:00
Deng-Cheng Zhu
7f081f1755 MIPS: Perf: Fix 74K cache map
According to Software User's Manual, the event of last-level-cache
read/write misses is mapped to even counters. Odd counters of that
event number count miss cycles.

Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6036/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2013-10-29 21:18:23 +01:00
Linus Torvalds
7314e613d5 Fix a few incorrectly checked [io_]remap_pfn_range() calls
Nico Golde reports a few straggling uses of [io_]remap_pfn_range() that
really should use the vm_iomap_memory() helper.  This trivially converts
two of them to the helper, and comments about why the third one really
needs to continue to use remap_pfn_range(), and adds the missing size
check.

Reported-by: Nico Golde <nico@ngolde.de>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org.
2013-10-29 10:21:34 -07:00
Linus Torvalds
f9ec2e6f79 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf tooling fixes from Ingo Molnar:
 "This contains five tooling fixes:

   - fix a remaining mmap2 assumption which resulted in perf top output
     breakage
   - fix mmap ring-buffer processing bug that corrupts data
   - fix for a severe python scripting memory leak
   - fix broken (and user-visible) -g option handling
   - fix stdio output

  The diffstat size is larger than what we'd like to see this late :-/"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools: Fixup mmap event consumption
  perf top: Split -G and --call-graph
  perf record: Split -g and --call-graph
  perf hists: Add color overhead for stdio output buffer
  perf tools: Fix up /proc/PID/maps parsing
  perf script python: Fix mem leak due to missing Py_DECREFs on dict entries
2013-10-29 08:36:50 -07:00
Linus Torvalds
2a999aa0a1 Kconfig: make KOBJECT_RELEASE debugging require timer debugging
Without the timer debugging, the delayed kobject release will just
result in undebuggable oopses if it triggers any latent bugs.  That
doesn't actually help debugging at all.

So make DEBUG_KOBJECT_RELEASE depend on DEBUG_OBJECTS_TIMERS to avoid
having people enable one without the other.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-10-29 08:33:36 -07:00
Daniel Vetter
1fbc0d789d drm/i915: Fix the PPT fdi lane bifurcate state handling on ivb
Originally I've thought that this is leftover hw state dirt from the
BIOS. But after way too much helpless flailing around on my part I've
noticed that the actual bug is when we change the state of an already
active pipe.

For example when we change the fdi lines from 2 to 3 without switching
off outputs in-between we'll never see the crucial on->off transition
in the ->modeset_global_resources hook the current logic relies on.

Patch version 2 got this right by instead also checking whether the
pipe is indeed active. But that in turn broke things when pipes have
been turned off through dpms since the bifurcate enabling is done in
the ->crtc_mode_set callback.

To address this issues discussed with Ville in the patch review move
the setting of the bifurcate bit into the ->crtc_enable hook. That way
we won't wreak havoc with this state when userspace puts all other
outputs into dpms off state. This also moves us forward with our
overall goal to unify the modeset and dpms on paths (which we need to
have to allow runtime pm in the dpms off state).

Unfortunately this requires us to move the bifurcate helpers around a
bit.

Also update the commit message, I've misanalyzed the bug rather badly.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70507
Tested-by: Jan-Michael Brummer <jan.brummer@tabos.org>
Cc: stable@vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-10-29 13:52:56 +01:00
Holger Eitzenberger
d954777324 netfilter: xt_NFQUEUE: fix --queue-bypass regression
V3 of the NFQUEUE target ignores the --queue-bypass flag,
causing packets to be dropped when the userspace listener
isn't running.

Regression is in since 8746ddcf12 ("netfilter: xt_NFQUEUE:
introduce CPU fanout").

Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-10-29 13:05:54 +01:00
Peter Zijlstra
e8a923cc1f perf/x86: Fix NMI measurements
OK, so what I'm actually seeing on my WSM is that sched/clock.c is
'broken' for the purpose we're using it for.

What triggered it is that my WSM-EP is broken :-(

  [    0.001000] tsc: Fast TSC calibration using PIT
  [    0.002000] tsc: Detected 2533.715 MHz processor
  [    0.500180] TSC synchronization [CPU#0 -> CPU#6]:
  [    0.505197] Measured 3 cycles TSC warp between CPUs, turning off TSC clock.
  [    0.004000] tsc: Marking TSC unstable due to check_tsc_sync_source failed

For some reason it consistently detects TSC skew, even though NHM+
should have a single clock domain for 'reasonable' systems.

This marks sched_clock_stable=0, which means that we do fancy stuff to
try and get a 'sane' clock. Part of this fancy stuff relies on the tick,
clearly that's gone when NOHZ=y. So for idle cpus time gets stuck, until
it either wakes up or gets kicked by another cpu.

While this is perfectly fine for the scheduler -- it only cares about
actually running stuff, and when we're running stuff we're obviously not
idle. This does somewhat break down for perf which can trigger events
just fine on an otherwise idle cpu.

So I've got NMIs get get 'measured' as taking ~1ms, which actually
don't last nearly that long:

          <idle>-0     [013] d.h.   886.311970: rcu_nmi_enter <-do_nmi
  ...
          <idle>-0     [013] d.h.   886.311997: perf_sample_event_took: HERE!!! : 1040990

So ftrace (which uses sched_clock(), not the fancy bits) only sees
~27us, but we measure ~1ms !!

Now since all this measurement stuff lives in x86 code, we can actually
fix it.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@kernel.org
Cc: dave.hansen@linux.intel.com
Cc: eranian@google.com
Cc: Don Zickus <dzickus@redhat.com>
Cc: jmario@redhat.com
Cc: acme@infradead.org
Link: http://lkml.kernel.org/r/20131017133350.GG3364@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29 12:01:20 +01:00
Peter Zijlstra
bf378d341e perf: Fix perf ring buffer memory ordering
The PPC64 people noticed a missing memory barrier and crufty old
comments in the perf ring buffer code. So update all the comments and
add the missing barrier.

When the architecture implements local_t using atomic_long_t there
will be double barriers issued; but short of introducing more
conditional barrier primitives this is the best we can do.

Reported-by: Victor Kaplansky <victork@il.ibm.com>
Tested-by: Victor Kaplansky <victork@il.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: michael@ellerman.id.au
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: anton@samba.org
Cc: benh@kernel.crashing.org
Link: http://lkml.kernel.org/r/20131025173749.GG19466@laptop.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29 12:01:19 +01:00
Mel Gorman
0255d49184 mm: Account for a THP NUMA hinting update as one PTE update
A THP PMD update is accounted for as 512 pages updated in vmstat.  This is
large difference when estimating the cost of automatic NUMA balancing and
can be misleading when comparing results that had collapsed versus split
THP. This patch addresses the accounting issue.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-10-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29 11:38:17 +01:00
Mel Gorman
3f926ab945 mm: Close races between THP migration and PMD numa clearing
THP migration uses the page lock to guard against parallel allocations
but there are cases like this still open

  Task A					Task B
  ---------------------				---------------------
  do_huge_pmd_numa_page				do_huge_pmd_numa_page
  lock_page
  mpol_misplaced == -1
  unlock_page
  goto clear_pmdnuma
						lock_page
						mpol_misplaced == 2
						migrate_misplaced_transhuge
  pmd = pmd_mknonnuma
  set_pmd_at

During hours of testing, one crashed with weird errors and while I have
no direct evidence, I suspect something like the race above happened.
This patch extends the page lock to being held until the pmd_numa is
cleared to prevent migration starting in parallel while the pmd_numa is
being cleared. It also flushes the old pmd entry and orders pagetable
insertion before rmap insertion.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-9-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29 11:38:05 +01:00
Mel Gorman
c61109e34f mm: numa: Sanitize task_numa_fault() callsites
There are three callers of task_numa_fault():

 - do_huge_pmd_numa_page():
     Accounts against the current node, not the node where the
     page resides, unless we migrated, in which case it accounts
     against the node we migrated to.

 - do_numa_page():
     Accounts against the current node, not the node where the
     page resides, unless we migrated, in which case it accounts
     against the node we migrated to.

 - do_pmd_numa_page():
     Accounts not at all when the page isn't migrated, otherwise
     accounts against the node we migrated towards.

This seems wrong to me; all three sites should have the same
sementaics, furthermore we should accounts against where the page
really is, we already know where the task is.

So modify all three sites to always account; we did after all receive
the fault; and always account to where the page is after migration,
regardless of success.

They all still differ on when they clear the PTE/PMD; ideally that
would get sorted too.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-8-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29 11:37:52 +01:00
Mel Gorman
587fe586f4 mm: Prevent parallel splits during THP migration
THP migrations are serialised by the page lock but on its own that does
not prevent THP splits. If the page is split during THP migration then
the pmd_same checks will prevent page table corruption but the unlock page
and other fix-ups potentially will cause corruption. This patch takes the
anon_vma lock to prevent parallel splits during migration.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-7-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29 11:37:39 +01:00
Mel Gorman
42836f5f8b mm: Wait for THP migrations to complete during NUMA hinting faults
The locking for migrating THP is unusual. While normal page migration
prevents parallel accesses using a migration PTE, THP migration relies on
a combination of the page_table_lock, the page lock and the existance of
the NUMA hinting PTE to guarantee safety but there is a bug in the scheme.

If a THP page is currently being migrated and another thread traps a
fault on the same page it checks if the page is misplaced. If it is not,
then pmd_numa is cleared. The problem is that it checks if the page is
misplaced without holding the page lock meaning that the racing thread
can be migrating the THP when the second thread clears the NUMA bit
and faults a stale page.

This patch checks if the page is potentially being migrated and stalls
using the lock_page if it is potentially being migrated before checking
if the page is misplaced or not.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-6-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29 11:37:19 +01:00
Mel Gorman
1dd49bfa34 mm: numa: Do not account for a hinting fault if we raced
If another task handled a hinting fault in parallel then do not double
account for it.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1381141781-10992-5-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-29 11:37:05 +01:00
Ingo Molnar
cd65718712 perf/urgent fixes:
. Add color overhead for stdio output buffer, which fixes
   --stdio output being chopped up on the hot (red) entries,
   fix from Jiri Olsa.
 
 . Get 'perf record -g -a sleep 1' working again, removing the
   need for -- separating perf options from the workload, restoring
   ages old behaviour, fix from Jiri Olsa.
   More patches allowing ~/.perfconfig setting up of default
   callchain collecting method ("fp" or "dwarf") left for next
   merge window.
 
 . Fixup mmap event consumption, where we were acking the
   consumption by writing the tail before actually accessing
   the event, which could lead to using overwritten records
   in things like 'perf record --call-graph'.  from Zhouyi Zhou.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJSbrexAAoJENZQFvNTUqpA+MIP/1dMAat/+jAobBvy9s69rJbT
 OV2QxqS5haA6rjbyogY7AiaCM/bSFO9dpLDF31NUTc3s6hiMsHnXnogQxvGGrBzT
 lzj45V9/a8XXl8Wxbpv0z4mI7ppNR3KjJdkBTsuqso5GivzMuG/8/N7ZxYB8kSlL
 vnWLRBKiBa/sDKIhKy5oEqmMMa+kPmwIXGZiKFxWkNMjmBtK80SB7MtPernhLNGf
 wj6EEulphPR1s8VHXqZIoijEvgw2vEXDTFapKd53EV+UPCII13NW7gicgPXyiWrY
 BIRW4Pwl98y9qHEc5gSq5Eotcl9JhIPQVegTSnvx8jJ7x0bYrxmChb/I9eAjWQcj
 YQAKGqUoQGu0PJv8TeIoGsxNMFxeeeMKzf275KKc5g6Les6gIHYKOmW7RPXHIIA7
 DHGYOJ808jMrNFmAkKNmxQGbE7forMFj6CPDgGpGezqTUbGsszpzga2QQgI0QmrY
 l8aPUsirF5cZ67vPk7dnUt58FayfKMdGME2dkU352hjOL1V5EeVdxx46zZsPQNyQ
 evQvkGBmRm6WFi6Sybjc6i4Gkn9F1O5sNDtEDbSQ6aZH5tNnktzLoufPrWwHKV95
 GzEHpCuAeZbagPF6WSxpNTb2xfScPb6PfveGm0gbypwl/KaAr47w8VCWJiuvpSOv
 yV83qEooKW0LUn4JXrH3
 =U+a8
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

 * Add color overhead for stdio output buffer, which fixes
   --stdio output being chopped up on the hot (red) entries,
   fix from Jiri Olsa.

 * Get 'perf record -g -a sleep 1' working again, removing the
   need for -- separating perf options from the workload, restoring
   ages old behaviour, fix from Jiri Olsa.
   More patches allowing ~/.perfconfig setting up of default
   callchain collecting method ("fp" or "dwarf") left for next
   merge window.

 * Fixup mmap event consumption, where we were acking the
   consumption by writing the tail before actually accessing
   the event, which could lead to using overwritten records
   in things like 'perf record --call-graph'. From Zhouyi Zhou.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-29 09:06:07 +01:00
Wei Liu
059dfa6a93 xen-netback: use jiffies_64 value to calculate credit timeout
time_after_eq() only works if the delta is < MAX_ULONG/2.

For a 32bit Dom0, if netfront sends packets at a very low rate, the time
between subsequent calls to tx_credit_exceeded() may exceed MAX_ULONG/2
and the test for timer_after_eq() will be incorrect. Credit will not be
replenished and the guest may become unable to send packets (e.g., if
prior to the long gap, all credit was exhausted).

Use jiffies_64 variant to mitigate this problem for 32bit Dom0.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Jason Luan <jianhai.luan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-29 00:24:49 -04:00