Commit Graph

665443 Commits

Author SHA1 Message Date
Tobias Klauser
d1892e4ec9 rtnl: simplify error return path in rtnl_create_link()
There is only one possible error path which reaches the err label, so
return ERR_PTR(-ENOMEM) directly if alloc_netdev_mqs() fails. This also
allows to omit the err variable.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-21 12:17:43 -05:00
H Hartley Sweeten
3ea07127d9 rtc: m48t86: verify that the RTC is actually present
The RTC is an optional feature at purchase time on some Technologic
Systems boards. Verify that it actually exists by checking if the
last two bytes of the NVRAM can be changed.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
2017-02-21 18:16:31 +01:00
Shawn Guo
54d82e0f2d drm: qxl: use vblank hooks in struct drm_crtc_funcs
The vblank hooks in struct drm_driver are deprecated and only meant for
legacy drivers.  For modern drivers with DRIVER_MODESET flag, the hooks
in struct drm_crtc_funcs should be used instead.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1486458995-31018-16-git-send-email-shawnguo@kernel.org
2017-02-21 11:17:55 -05:00
Shawn Guo
a5073a5b7a drm: mediatek: use vblank hooks in struct drm_crtc_funcs
The vblank hooks in struct drm_driver are deprecated and only meant for
legacy drivers.  For modern drivers with DRIVER_MODESET flag, the hooks
in struct drm_crtc_funcs should be used instead.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Cc: CK Hu <ck.hu@mediatek.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1486458995-31018-14-git-send-email-shawnguo@kernel.org
2017-02-21 11:17:54 -05:00
Shawn Guo
d4f6750f9b drm: kirin: use vblank hooks in struct drm_crtc_funcs
The vblank hooks in struct drm_driver are deprecated and only meant for
legacy drivers.  For modern drivers with DRIVER_MODESET flag, the hooks
in struct drm_crtc_funcs should be used instead.

Reviewed-by: Xinliang Liu <xinliang.liu@linaro.org>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Cc: Xinliang Liu <z.liuxinliang@hisilicon.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1486458995-31018-12-git-send-email-shawnguo@kernel.org
2017-02-21 11:17:45 -05:00
Mike Leach
7e6a69ee95 arm64: dts: juno: update definition for programmable replicator
Juno platforms have a programmable replicator splitting the trace output
to TPIU and ETR. Currently this is not being programmed as it is being
treated as a none-programmable replicator - which is the default
operational mode for these devices. The TPIU in the system is enabled by
default, and this combination is causing back-pressure in the trace
system resulting in overflows at the source.

Replaces the existing definition with one that defines the programmable
replicator, using the "qcom,coresight-replicator1x" driver that provides
the correct functionality for CoreSight programmable replicators.

Reviewed-and-Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2017-02-21 16:02:02 +00:00
Dan Carpenter
9761a2469d sunrpc: silence uninitialized variable warning
kstrtouint() can return a couple different error codes so the check for
"ret == -EINVAL" is wrong and static analysis tools correctly complain
that we can use "num" without initializing it.  It's not super harmful
because we check the bounds.  But it's also easy enough to fix.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-02-21 10:53:36 -05:00
Linus Torvalds
9763dd6f81 We've got eight GFS2 patches for this merge window:
1. Andy Price submitted a patch to make gfs2_write_full_page a
    static function.
 2. Dan Carpenter submitted a patch to fix a ERR_PTR thinko.
 
 I've also got a few patches, three of which fix bugs related to
 deleting very large files, which cause GFS2 to run out of
 journal space:
 
 3. The first one prevents GFS2 delete operation from requesting too
    much journal space.
 4. The second one fixes a problem whereby GFS2 can hang because it
    wasn't taking journal space demand into its calculations.
 5. The third one wakes up IO waiters when a flush is done to restart
    processes stuck waiting for journal space to become available.
 
 The other three patches are a performance improvement related to
 spin_lock contention between multiple writers:
 
 6. The "tr_touched" variable was switched to a flag to be more atomic
    and eliminate the possibility of some races.
 7. Function meta_lo_add was moved inline with its only caller to make
    the code more readable and efficient.
 8. Contention on the gfs2_log_lock spinlock was greatly reduced by
    avoiding the lock altogether in cases where we don't really need
    it: buffers that already appear in the appropriate metadata list
    for the journal. Many thanks to Steve Whitehouse for the ideas and
    principles behind these patches.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJYrEEEAAoJENeLYdPf93o7bjoIAIqPG/EAzi+idgMWDPQa9Eit
 53dPy16snkrbWwtaK6spSWlH6bGYuHeanXORYon9bvtVjKYaa4NQclGihN2IE6uB
 O8zT+MGwP45LDhNplVJpumaOALZ9ZDqQSe+3tHeNK3FhNirLyiIjSqrHt/7Yi1qi
 fPLlT4Jx0TBo5rhvEGa7Yg01WhWVtnmVSMqJXj/7ZtC50s1aPyDUikdNIDfDCN2X
 LxfKGDXuk6p63VQ6JKqYSBVATCR0/bbKfkuk/kBUTYLoHoapImxB8d0HgIdsh1Mv
 9PlbZnnNW8k5oapuhVxjl0T5G0JsQgCkPb/wlte+ryOCjBoc2L2fCUV5qc0QxWc=
 =xQyl
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-4.11.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull GFS2 updates from Robert Peterson:
 "We've got eight GFS2 patches for this merge window:

   - Andy Price submitted a patch to make gfs2_write_full_page a static
     function.

   - Dan Carpenter submitted a patch to fix a ERR_PTR thinko.

  Three patches fix bugs related to deleting very large files, which
  cause GFS2 to run out of journal space:

   - The first one prevents GFS2 delete operation from requesting too
     much journal space.

   - The second one fixes a problem whereby GFS2 can hang because it
     wasn't taking journal space demand into its calculations.

   - The third one wakes up IO waiters when a flush is done to restart
     processes stuck waiting for journal space to become available.

  The final three patches are a performance improvement related to
  spin_lock contention between multiple writers:

   - The "tr_touched" variable was switched to a flag to be more atomic
     and eliminate the possibility of some races.

   - Function meta_lo_add was moved inline with its only caller to make
     the code more readable and efficient.

   - Contention on the gfs2_log_lock spinlock was greatly reduced by
     avoiding the lock altogether in cases where we don't really need
     it: buffers that already appear in the appropriate metadata list
     for the journal. Many thanks to Steve Whitehouse for the ideas and
     principles behind these patches"

* tag 'gfs2-4.11.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: Make gfs2_write_full_page static
  GFS2: Reduce contention on gfs2_log_lock
  GFS2: Inline function meta_lo_add
  GFS2: Switch tr_touched to flag in transaction
  GFS2: Wake up io waiters whenever a flush is done
  GFS2: Made logd daemon take into account log demand
  GFS2: Limit number of transaction blocks requested for truncates
  GFS2: Fix reference to ERR_PTR in gfs2_glock_iter_next
2017-02-21 07:46:34 -08:00
Vinod Koul
1ad651154b Merge branch 'topic/zx' into for-linus 2017-02-21 21:14:35 +05:30
Vinod Koul
8a3ec58382 Merge branch 'topic/stm32-dma' into for-linus 2017-02-21 21:14:27 +05:30
Vinod Koul
25036f2a73 Merge branch 'topic/ste' into for-linus 2017-02-21 21:14:07 +05:30
Linus Torvalds
70fcf5c339 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF fixes and cleanups from Jan Kara:
 "Several small UDF fixes and cleanups and a small cleanup of fanotify
  code"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fanotify: simplify the code of fanotify_merge
  udf: simplify udf_ioctl()
  udf: fix ioctl errors
  udf: allow implicit blocksize specification during mount
  udf: check partition reference in udf_read_inode()
  udf: atomically read inode size
  udf: merge module informations in super.c
  udf: remove next_epos from udf_update_extent_cache()
  udf: Factor out trimming of crtime
  udf: remove empty condition
  udf: remove unneeded line break
  udf: merge bh free
  udf: use pointer for kernel_long_ad argument
  udf: use __packed instead of __attribute__ ((packed))
  udf: Make stat on symlink report symlink length as st_size
  fs/udf: make #ifdef UDF_PREALLOCATE unconditional
  fs: udf: Replace CURRENT_TIME with current_time()
2017-02-21 07:44:03 -08:00
Vinod Koul
e4e48c47d1 Merge branch 'topic/intel' into for-linus 2017-02-21 21:13:57 +05:30
Christoph Hellwig
783112f740 nfsd: special case truncates some more
Both the NFS protocols and the Linux VFS use a setattr operation with a
bitmap of attributes to set to set various file attributes including the
file size and the uid/gid.

The Linux syscalls never mix size updates with unrelated updates like
the uid/gid, and some file systems like XFS and GFS2 rely on the fact
that truncates don't update random other attributes, and many other file
systems handle the case but do not update the other attributes in the
same transaction.  NFSD on the other hand passes the attributes it gets
on the wire more or less directly through to the VFS, leading to updates
the file systems don't expect.  XFS at least has an assert on the
allowed attributes, which caught an unusual NFS client setting the size
and group at the same time.

To handle this issue properly this splits the notify_change call in
nfsd_setattr into two separate ones.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-02-21 10:13:37 -05:00
Jaehoon Chung
e7cd7ef58e PCI: exynos: Support the PHY generic framework
Switch the pci-exynos driver to generic PHY framework.  At the same time
backward compatibility is preserved: Warning will be printed for old DTB.

Refer to the binding file:
- Documentation/devicetree/bindings/pci/samsung,exynos5440-pcie.txt

Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Pankaj Dubey <pankaj.dubey@samsung.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
2017-02-21 07:48:47 -06:00
Jaehoon Chung
34f80c7ddf Documentation: binding: Modify the exynos5440 PCIe binding
According to using PHY framework, updates the exynos5440-pcie binding.  For
maintaining backward compatibility, leaves the current dt-binding.  (It
should be deprecated.)

Recommends to use the PHY Framework and "config" property to follow the
designware-pcie binding.  If you use the old way, can see "missing *config*
reg space" message.  Because the getting configuration space address from
range is old way.

NOTE: When use the "config" property, first name of 'reg-names' must be set
to "elbi".  Otherwise driver can't maintain the backward capability.

Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Pankaj Dubey <pankaj.dubey@samsung.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Acked-by: Rob Herring <robh@kernel.org>
2017-02-21 07:48:44 -06:00
Jaehoon Chung
cf0adb8e28 phy: phy-exynos-pcie: Add support for Exynos PCIe PHY
Add support for Generic PHY framework about Exynos SoCs.  Current Exynos
PCIe driver doesn't use the PHY framework, which makes it difficult to
upstream the other Exynos variants because of different PHY registers.

Move the codes relevant to PHY from Exnyos PCIe driver to PHY Exynos PCIe
driver.

[bhelgaas: depend on "OF && (ARCH_EXYNOS || COMPILE_TEST)", update
copyright year, both per Vivek]
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Jingoo Han <jingoohan1@gmail.com>
Reviewed-by: Pankaj Dubey <pankaj.dubey@samsung.com>
Reviewed-by: Vivek Gautam <vivek.gautam@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-02-21 07:48:42 -06:00
Jaehoon Chung
ad8ec41afa Documentation: samsung-phy: Add exynos-pcie-phy binding
Add the exynos-pcie-phy binding for Exynos PCIe PHY.  This is for using
generic PHY framework.

Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rob Herring <robh@kernel.org>
2017-02-21 07:48:40 -06:00
Jani Nikula
15f080f08d drm/edid: respect connector force for drm_get_edid ddc probe
Skip DDC probe for forced connector status. Don't try to read the EDID
if the connector is forced off. Skipping probe for forced on connectors
will make more sense when drm_do_get_edid() will handle override and
firmware EDIDs.

Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1487344854-18777-4-git-send-email-jani.nikula@intel.com
2017-02-21 15:43:04 +02:00
Jani Nikula
e9bd0b84f4 drm: do not debug log about missing CEA extensions on NULL edid
Make the drm_edid_to_eld() function useful for resetting, not just
setting, the ELD and HDMI VSDB data, without debug warnings about
missing CEA extensions.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1487344854-18777-3-git-send-email-jani.nikula@intel.com
2017-02-21 15:42:16 +02:00
Jani Nikula
07c2b84b99 drm: move edid property update and add modes out of edid firmware loader
Make the firmware loader more generic and generally useful.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1487344854-18777-2-git-send-email-jani.nikula@intel.com
2017-02-21 15:41:24 +02:00
Linus Walleij
3498d8694d gpio: reintroduce devm_get_gpiod_from_child()
We need to keep this API around for the merge window to avoid
nasty build problems in the merges.

Cc: Lee Jones <lee.jones@linaro.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-02-21 14:19:45 +01:00
Tvrtko Ursulin
99c181a0fa drm/i915/tracepoints: Add hw_id to context tracepoints
It is useful to provide this info to match the one provided
in the request tracepoints.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170221091350.14605-1-tvrtko.ursulin@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-02-21 13:18:30 +00:00
Tvrtko Ursulin
d7d96833f2 drm/i915/tracepoints: Add backend level request in and out tracepoints
Two new tracepoints placed at the call sites where requests are
actually passed to the GPU enable userspace to track engine
utilisation.

These tracepoints are only enabled when the
DRM_I915_LOW_LEVEL_TRACEPOINTS Kconfig option is enabled.

v2: Fix compilation with !CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS.

v3: Name global seqno consistently across tracepoints.

v4: Remove port info from request out tracepoint. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-02-21 13:18:25 +00:00
Tvrtko Ursulin
dffabc8f4e drm/i915/tracepoints: Rename i915_gem_request_notify
i915_gem_ring_notify is more appropriate since we do not have
the request information at this point, but it is simply a
signal from the engine that some request has been completed.

v2:
  * Always trace and log if there were any waiters.
  * Rename to intel_engine_notify. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-02-21 13:18:19 +00:00
Tvrtko Ursulin
354d036fcf drm/i915/tracepoints: Add request submit and execute tracepoints
These new tracepoints are emitted once the request is ready to
be submitted to the GPU and once the request is about to
be submitted to the GPU, respectively.

Former condition triggers as soon as all the fences and
dependencies have been resolved, and the latter once the
backend is about to submit it to the GPU.

New tracepoint are enabled via the new
DRM_I915_LOW_LEVEL_TRACEPOINTS Kconfig option which is disabled
by default to alleviate the performance impact concerns.

v2: Move execute tracepoint to __i915_gem_request_submit.
    (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-02-21 13:18:14 +00:00
Tvrtko Ursulin
90aa412d50 drm/i915/tracepoints: Remove unused i915_gem_request_complete
Tracepoint is not used and won't be suitable for its replacement.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-02-21 13:18:08 +00:00
Tvrtko Ursulin
9369250298 drm/i915/tracepoints: Tidy i915_gem_request_wait_begin
Provide the same information as the other request event classes.

v2: Pass in flags so we can properly report the blocking status.
    (Chris Wilson)

v3: Log hex with 0x prefix for clarity.

v4: Derive blocking status from flags. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-02-21 13:18:04 +00:00
Tvrtko Ursulin
1cce8922df drm/i915/tracepoints: Adjust i915_gem_ring_dispatch
Rename it to i915_gem_request_queue and fix the logged info
equivalent to the i915_gem_request even class. Also moved it
a bit further apart from the i915_gem_request_add tracepoint
since they otherwise provide similar information too close in
time.

v2: Remove sw fence singalling. We will rely on the soon to
    come GuC scheduling backend to enable that. (Chris Wilson)

v3: Log hex with 0x prefix for clarity.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-02-21 13:17:57 +00:00
Tvrtko Ursulin
e235b53019 drm/i915/tracepoints: Tidy request event class
At the moment only the global seqno is logged which is not set
until the request is ready for submission.

Add the per-contex seqno and the context hardware id which are
both interesting data points.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-02-21 13:17:42 +00:00
Tvrtko Ursulin
56e51bf036 drm/i915: Tidy execlists_init_reg_state
Compact the name of the macro and reg_state variable, and cache
some data in local variables to make the function more compact
and more readable.

v2: Fixup some checkpatch warnings.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170221095839.30525-1-tvrtko.ursulin@linux.intel.com
2017-02-21 13:15:41 +00:00
Pablo Neira Ayuso
3ef767e5cb Merge branch 'master' of git://blackhole.kfki.hu/nf
Jozsef Kadlecsik says:

====================
ipset patches for nf

Please apply the next patches for ipset in your nf branch.
Both patches should go into the stable kernel branches as well,
because these are important bugfixes:

* Sometimes valid entries in hash:* types of sets were evicted
  due to a typo in an index. The wrong evictions happen when
  entries are deleted from the set and the bucket is shrinked.
  Bug was reported by Eric Ewanco and the patch fixes
  netfilter bugzilla id #1119.
* Fixing of a null pointer exception when someone wants to add an
  entry to an empty list type of set and specifies an add before/after
  option. The fix is from Vishwanath Pai.
====================

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-21 14:01:05 +01:00
Chris Wilson
e2989f140e drm/i915: Use reservation_object_lock()
Replace the calls to ww_mutex_lock(&resv->lock) with the helper
reservation_object_lock(resv) and similarly for unlock.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170221091723.6219-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-02-21 12:48:51 +00:00
Liping Zhang
4eba8b78e1 netfilter: nfnetlink: remove static declaration from err_list
Otherwise, different subsys will race to access the err_list, with holding
the different nfnl_lock(subsys_id).

But this will not happen now, since ->call_batch is only implemented by
nftables, so the err_list is protected by nfnl_lock(NFNL_SUBSYS_NFTABLES).

Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-21 13:45:47 +01:00
Ken-ichirou MATSUZAWA
ba896a05ad netfilter: nfnetlink_queue: fix NFQA_VLAN_MAX definition
Should be - 1 as in other _MAX definitions.

Signed-off-by: Ken-ichirou MATSUZAWA <chamas@h4.dion.ne.jp>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-02-21 13:45:45 +01:00
Waiman Long
dd0fd8bca1 x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64
It was found when running fio sequential write test with a XFS ramdisk
on a KVM guest running on a 2-socket x86-64 system, the %CPU times
as reported by perf were as follows:

 69.75%  0.59%  fio  [k] down_write
 69.15%  0.01%  fio  [k] call_rwsem_down_write_failed
 67.12%  1.12%  fio  [k] rwsem_down_write_failed
 63.48% 52.77%  fio  [k] osq_lock
  9.46%  7.88%  fio  [k] __raw_callee_save___kvm_vcpu_is_preempt
  3.93%  3.93%  fio  [k] __kvm_vcpu_is_preempted

Making vcpu_is_preempted() a callee-save function has a relatively
high cost on x86-64 primarily due to at least one more cacheline of
data access from the saving and restoring of registers (8 of them)
to and from stack as well as one more level of function call.

To reduce this performance overhead, an optimized assembly version
of the the __raw_callee_save___kvm_vcpu_is_preempt() function is
provided for x86-64.

With this patch applied on a KVM guest on a 2-socket 16-core 32-thread
system with 16 parallel jobs (8 on each socket), the aggregrate
bandwidth of the fio test on an XFS ramdisk were as follows:

   I/O Type      w/o patch    with patch
   --------      ---------    ----------
   random read   8141.2 MB/s  8497.1 MB/s
   seq read      8229.4 MB/s  8304.2 MB/s
   random write  1675.5 MB/s  1701.5 MB/s
   seq write     1681.3 MB/s  1699.9 MB/s

There are some increases in the aggregated bandwidth because of
the patch.

The perf data now became:

 70.78%  0.58%  fio  [k] down_write
 70.20%  0.01%  fio  [k] call_rwsem_down_write_failed
 69.70%  1.17%  fio  [k] rwsem_down_write_failed
 59.91% 55.42%  fio  [k] osq_lock
 10.14% 10.14%  fio  [k] __kvm_vcpu_is_preempted

The assembly code was verified by using a test kernel module to
compare the output of C __kvm_vcpu_is_preempted() and that of assembly
__raw_callee_save___kvm_vcpu_is_preempt() to verify that they matched.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-21 12:48:35 +01:00
Waiman Long
6c62985d57 x86/paravirt: Change vcp_is_preempted() arg type to long
The cpu argument in the function prototype of vcpu_is_preempted()
is changed from int to long. That makes it easier to provide a better
optimized assembly version of that function.

For Xen, vcpu_is_preempted(long) calls xen_vcpu_stolen(int), the
downcast from long to int is not a problem as vCPU number won't exceed
32 bits.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-21 12:48:06 +01:00
Chao Peng
96794e4ed4 KVM: VMX: use correct vmcs_read/write for guest segment selector/base
Guest segment selector is 16 bit field and guest segment base is natural
width field. Fix two incorrect invocations accordingly.

Without this patch, build fails when aggressive inlining is used with ICC.

Cc: stable@vger.kernel.org
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-21 12:45:49 +01:00
Andy Lutomirski
b7ffc44d5b x86/kvm/vmx: Defer TR reload after VM exit
Intel's VMX is daft and resets the hidden TSS limit register to 0x67
on VMX reload, and the 0x67 is not configurable.  KVM currently
reloads TR using the LTR instruction on every exit, but this is quite
slow because LTR is serializing.

The 0x67 limit is entirely harmless unless ioperm() is in use, so
defer the reload until a task using ioperm() is actually running.

Here's some poorly done benchmarking using kvm-unit-tests:

Before:

cpuid 1313
vmcall 1195
mov_from_cr8 11
mov_to_cr8 17
inl_from_pmtimer 6770
inl_from_qemu 6856
inl_from_kernel 2435
outl_to_kernel 1402

After:

cpuid 1291
vmcall 1181
mov_from_cr8 11
mov_to_cr8 16
inl_from_pmtimer 6457
inl_from_qemu 6209
inl_from_kernel 2339
outl_to_kernel 1391

Signed-off-by: Andy Lutomirski <luto@kernel.org>
[Force-reload TR in invalidate_tss_limit. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-21 12:45:08 +01:00
Andy Lutomirski
d3273deac9 x86/asm/64: Drop __cacheline_aligned from struct x86_hw_tss
Historically, the entire TSS + io bitmap structure was cacheline
aligned, but commit ca241c7503 ("x86: unify tss_struct") changed it
(presumably inadvertently) so that the fixed-layout hardware part is
cacheline-aligned and the io bitmap is after the padding.  This wastes
24 bytes (the hardware part should be 104 bytes, but this pads it to
128 bytes) and, serves no purpose, and causes sizeof(struct
x86_hw_tss) to have a confusing value.

Drop the pointless alignment.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-21 11:49:02 +01:00
Andy Lutomirski
8c2e41f7ae x86/kvm/vmx: Simplify segment_base()
Use actual pointer types for pointers (instead of unsigned long) and
replace hardcoded constants with the appropriate self-documenting
macros.

The function is still a bit messy, but this seems a lot better than
before to me.

This is mostly borrowed from a patch by Thomas Garnier.

Cc: Thomas Garnier <thgarnie@google.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-21 11:48:56 +01:00
Andy Lutomirski
e28baeadcf x86/kvm/vmx: Get rid of segment_base() on 64-bit kernels
It was a bit buggy (it didn't list all segment types that needed
64-bit fixups), but the bug was irrelevant because it wasn't called
in any interesting context on 64-bit kernels and was only used for
data segents on 32-bit kernels.

To avoid confusion, make it explicitly 32-bit only.

Cc: Thomas Garnier <thgarnie@google.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-21 11:48:46 +01:00
Andy Lutomirski
e0c230634a x86/kvm/vmx: Don't fetch the TSS base from the GDT
The current CPU's TSS base is a foregone conclusion, so there's no need
to parse it out of the segment tables.  This should save a couple cycles
(as STR is surely microcoded and poorly optimized) but, more importantly,
it's a cleanup and it means that segment_base() will never be called on
64-bit kernels.

Cc: Thomas Garnier <thgarnie@google.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-21 11:48:40 +01:00
Andy Lutomirski
4f53ab1428 x86/asm: Define the kernel TSS limit in a macro
Rather than open-coding the kernel TSS limit in set_tss_desc(), make
it a real macro near the TSS layout definition.

This is purely a cleanup.

Cc: Thomas Garnier <thgarnie@google.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-21 11:48:35 +01:00
Michael Roth
3dbbaf200f powerpc/pseries: Advertise Hot Plug Event support to firmware
With the inclusion of commit 333f7b7686 ("powerpc/pseries: Implement
indexed-count hotplug memory add") and commit 753843471c
("powerpc/pseries: Implement indexed-count hotplug memory remove"), we
now have complete handling of the RTAS hotplug event format as described
by PAPR via ACR "PAPR Changes for Hotplug RTAS Events".

This capability is indicated by byte 6, bit 2 (5 in IBM numbering) of
architecture option vector 5, and allows for greater control over
cpu/memory/pci hot plug/unplug operations.

Existing pseries kernels will utilize this capability based on the
existence of the /event-sources/hot-plug-events DT property, so we
only need to advertise it via CAS and do not need a corresponding
FW_FEATURE_* value to test for.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-21 21:32:53 +11:00
Andrew Donnellan
171ed0fcd8 cxl: fix nested locking hang during EEH hotplug
Commit 14a3ae34bf ("cxl: Prevent read/write to AFU config space while AFU
not configured") introduced a rwsem to fix an invalid memory access that
occurred when someone attempts to access the config space of an AFU on a
vPHB whilst the AFU is deconfigured, such as during EEH recovery.

It turns out that it's possible to run into a nested locking issue when EEH
recovery fails and a full device hotplug is required.
cxl_pci_error_detected() deconfigures the AFU, taking a writer lock on
configured_rwsem. When EEH recovery fails, the EEH code calls
pci_hp_remove_devices() to remove the device, which in turn calls
cxl_remove() -> cxl_pci_remove_afu() -> pci_deconfigure_afu(), which tries
to grab the writer lock that's already held.

Standard rwsem semantics don't express what we really want to do here and
don't allow for nested locking. Fix this by replacing the rwsem with an
atomic_t which we can control more finely. Allow the AFU to be locked
multiple times so long as there are no readers.

Fixes: 14a3ae34bf ("cxl: Prevent read/write to AFU config space while AFU not configured")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-21 21:32:52 +11:00
Ingo Molnar
8a5897fec9 perf/core improvements and fixes:
New features:
 
 - Make -a/--all-cpus be the default target in 'perf record' and 'perf stat',
   just like it is with 'perf trace' (Jiri Olsa)
 
 - Introduce -q/--quiet to the 'annotate', 'diff' and 'report', fix up
   its behaviour in 'record'. This makes the output more compact by
   elliminating headers, leaving just the histogram lines (Namhyung Kim)
 
 Fixes:
 
 - Handle offline/absent CPUs (Jan Stancek)
 
 Infrastructure:
 
 - Filter out -specs=/a/b/c from CC options when building the python
   support, allowing that feature to be built with clang (Arnaldo Carvalho de Melo)
 
 - Fix DEBUG=1 build with clang (Arnaldo Carvalho de Melo)
 
 Trivial:
 
 - Fix spelling of 'preempt' in a libtraceevent function name (Steven Rostedt)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYqzrTAAoJENZQFvNTUqpADCwP/inGA8OeQGTgZgt5FW/fcrAS
 eXZUy3GPtRl1UQ/8FyegwMgkNziaipr3UbND0+lpuqdamh9uDw25jtR/h5tKJQ1R
 ni11sHJJBJIkCJIMeuDRQMrOYf7vMp3tnVOoPsYc2ZLXh5F9ZvlFS2AcSffe6OaT
 dmISHcPDgpp2PepHm2hWDxsOaaH3NhM/SePJy8zaQfQe9duUwceAAYceIZDNE3gW
 vI8lDPXF5xeetELDe3Mv+kxcCMjI8QqlQBJbtxfBLHb68w3BFGYqzuJIUvxIEDtk
 YkupAht35VV7of2m0aCzZpi8/rr5HnmRif7VHP9a1CBTUlGpFH7f8mz4VNvBHUqi
 fp6ROOSfuQT6DqWBqXklsF4lxDBVbNNktdLfjrAgZYqOa0e0H6pr02PhW7YG9xSO
 dyHCWFDoU+8aqpUkWB4WcBNQCO/fgAb61LAHYq9IRUQNInOjCxeiHIiYOSzAHk9r
 S17ToUvS9eDph7/8dtF0IVLbfKQfCouO/09HLhkR49ftncJneS68dWwRs0h/iE5O
 +MXltscN91bJ8Dwm3zMa+gHPmy2HSudmHirphmcWNdo19biyTEGuBr89yL2bqmtC
 Z4cMjhYjOEiG78k7LLrvHFYDTyHtj0TD8lQRR27X8AYXOkv3fwXG1zO5qQrYCGMK
 5DN1TTl0EP7GUmfV9fMo
 =LWfb
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-4.11-20170220' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

New features:

 - Make -a/--all-cpus be the default target in 'perf record' and 'perf stat',
   just like it is with 'perf trace' (Jiri Olsa)

 - Introduce -q/--quiet to the 'annotate', 'diff' and 'report', fix up
   its behaviour in 'record'. This makes the output more compact by
   elliminating headers, leaving just the histogram lines (Namhyung Kim)

Fixes:

 - Handle offline/absent CPUs (Jan Stancek)

Infrastructure changes:

 - Filter out -specs=/a/b/c from CC options when building the python
   support, allowing that feature to be built with clang (Arnaldo Carvalho de Melo)

 - Fix DEBUG=1 build with clang (Arnaldo Carvalho de Melo)

Trivial changes:

 - Fix spelling of 'preempt' in a libtraceevent function name (Steven Rostedt)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-21 09:05:29 +01:00
Zhang Rui
ef1d75263f Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal into thermal-soc 2017-02-21 15:50:23 +08:00
Douglas Miller
5e48dc0aa4 powerpc/xmon: Dump memory in CPU endian format
Extend the dump command to allow display of 2, 4, and 8 byte words in
CPU endian format. Also adds dump command for "1 byte values" for the
sake of symmetry. New commands are:

  d1	dump 1 byte values
  d2	dump 2 byte values
  d4	dump 4 byte values
  d8	dump 8 byte values

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Douglas Miller <dougmill@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
2017-02-21 16:00:21 +11:00
Jason Wang
017b29c38c virito-net: set queues after reset during xdp_set
We set queues before reset which will cause a crash[1]. This is
because is_xdp_raw_buffer_queue() depends on the old xdp queue pairs
number to do the correct detection. So fix this by

- passing xdp queue pairs and current queue pairs to virtnet_reset()
- change vi->xdp_qp after reset but before refill, to make sure both
  free_unused_bufs() and refill can make correct detection of XDP.
- remove the duplicated queue pairs setting before virtnet_reset()
  since we will do it inside virtnet_reset()

[1]

[   74.328168] general protection fault: 0000 [#1] SMP
[   74.328625] Modules linked in: nfsd xfs libcrc32c virtio_net virtio_pci
[   74.329117] CPU: 0 PID: 2849 Comm: xdp2 Not tainted 4.10.0-rc7+ #499
[   74.329577] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.1-0-g8891697-prebuilt.qemu-project.org 04/01/2014
[   74.330424] task: ffff88007a894000 task.stack: ffffc90004388000
[   74.330844] RIP: 0010:skb_release_head_state+0x28/0x80
[   74.331298] RSP: 0018:ffffc9000438b8d0 EFLAGS: 00010206
[   74.331676] RAX: 0000000000000000 RBX: ffff88007ad96300 RCX: 0000000000000000
[   74.332217] RDX: ffff88007fc137a8 RSI: ffff88007fc0db28 RDI: 0001bf00000001be
[   74.332758] RBP: ffffc9000438b8d8 R08: 000000000005008f R09: 00000000000005f9
[   74.333274] R10: ffff88007d001700 R11: ffffffff820a8a4d R12: ffff88007ad96300
[   74.333787] R13: 0000000000000002 R14: ffff880036604000 R15: 000077ff80000000
[   74.334308] FS:  00007fc70d8a7b40(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
[   74.334891] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   74.335314] CR2: 00007fff4144a710 CR3: 000000007ab56000 CR4: 00000000003406f0
[   74.335830] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   74.336373] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   74.336895] Call Trace:
[   74.337086]  skb_release_all+0xd/0x30
[   74.337356]  consume_skb+0x2c/0x90
[   74.337607]  free_unused_bufs+0x1ff/0x270 [virtio_net]
[   74.337988]  ? vp_synchronize_vectors+0x3b/0x60 [virtio_pci]
[   74.338398]  virtnet_xdp+0x21e/0x440 [virtio_net]
[   74.338741]  dev_change_xdp_fd+0x101/0x140
[   74.339048]  do_setlink+0xcf4/0xd20
[   74.339304]  ? symcmp+0xf/0x20
[   74.339529]  ? mls_level_isvalid+0x52/0x60
[   74.339828]  ? mls_range_isvalid+0x43/0x50
[   74.340135]  ? nla_parse+0xa0/0x100
[   74.340400]  rtnl_setlink+0xd4/0x120
[   74.340664]  ? cpumask_next_and+0x30/0x50
[   74.340966]  rtnetlink_rcv_msg+0x7f/0x1f0
[   74.341259]  ? sock_has_perm+0x59/0x60
[   74.341586]  ? napi_consume_skb+0xe2/0x100
[   74.342010]  ? rtnl_newlink+0x890/0x890
[   74.342435]  netlink_rcv_skb+0x92/0xb0
[   74.342846]  rtnetlink_rcv+0x23/0x30
[   74.343277]  netlink_unicast+0x162/0x210
[   74.343677]  netlink_sendmsg+0x2db/0x390
[   74.343968]  sock_sendmsg+0x33/0x40
[   74.344233]  SYSC_sendto+0xee/0x160
[   74.344482]  ? SYSC_bind+0xb0/0xe0
[   74.344806]  ? sock_alloc_file+0x92/0x110
[   74.345106]  ? fd_install+0x20/0x30
[   74.345360]  ? sock_map_fd+0x3f/0x60
[   74.345586]  SyS_sendto+0x9/0x10
[   74.345790]  entry_SYSCALL_64_fastpath+0x1a/0xa9
[   74.346086] RIP: 0033:0x7fc70d1b8f6d
[   74.346312] RSP: 002b:00007fff4144a708 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[   74.346785] RAX: ffffffffffffffda RBX: 00000000ffffffff RCX: 00007fc70d1b8f6d
[   74.347244] RDX: 000000000000002c RSI: 00007fff4144a720 RDI: 0000000000000003
[   74.347683] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
[   74.348544] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff4144bd90
[   74.349082] R13: 0000000000000002 R14: 0000000000000002 R15: 00007fff4144cda0
[   74.349607] Code: 00 00 00 55 48 89 e5 53 48 89 fb 48 8b 7f 58 48 85 ff 74 0e 40 f6 c7 01 74 3d 48 c7 43 58 00 00 00 00 48 8b 7b 68 48 85 ff 74 05 <f0> ff 0f 74 20 48 8b 43 60 48 85 c0 74 14 65 8b 15 f3 ab 8d 7e
[   74.351008] RIP: skb_release_head_state+0x28/0x80 RSP: ffffc9000438b8d0
[   74.351625] ---[ end trace fe6e19fd11cfc80b ]---

Fixes: 2de2f7f40e ("virtio_net: XDP support for adjust_head")
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-20 22:22:20 -05:00